For the past decade, we, the builders of the modern data stack, have sold a story of speed and democratization. We promised that with the right cloud warehouse, the right BI tool, and the right talent, insights would flow like water. It’s time for a confession: for most organizations, that promise remains unfulfilled.
We find ourselves in an era of great stagnation, a paradox of immense investment and declining productivity. Business leaders are not just frustrated; they are, as one recent analysis put it, “fundamentally disillusioned with the entire analytics process”.1 Despite pouring billions into cutting-edge technology, the time it takes to get a trusted answer to a critical business question still stretches from weeks into months.2 Data and business analytics teams are overwhelmed, and this chronic delay in decision-making leads directly to missed revenue and strategic drift.2 This isn’t a failure of ambition or a lack of powerful tools. It is a fundamental, architectural flaw in how we’ve built our data ecosystems. We have constructed magnificent engines, powerful data warehouses from Snowflake, Databricks, and others, but we have consistently failed to connect them to the chassis of the business.
The arrival of Generative AI has not solved this problem; it has dangerously amplified it. In the rush to deploy AI, enterprises are discovering a painful truth: pouring a powerful LLM onto a broken data foundation doesn’t create an intelligent enterprise. It creates a faster, more confident, and more prolific generator of incorrect answers. The solution lies not in another dashboard or a more sophisticated LLM, but in fixing the missing layer of context and trust that sits between our technology and our business reality. This post will argue that the manual, human-powered methods for building this layer are obsolete, and that intelligent automation is the only viable path forward.
The Missing Link: Why Your LLM Needs a PhD in Your Business
The missing piece in this puzzle has a name, though it’s often shrouded in technical jargon: the semantic layer. Let’s demystify it. A semantic layer is not just a piece of technology; it’s a business necessity. It is the “Rosetta Stone” for your company’s data, translating the cryptic language of database schemas into the clear, unambiguous language of business.3 It is the single, authoritative place that defines “what things mean, and how we calculate anything”.4 When your marketing, sales, and finance teams all report on “customer churn,” the semantic layer is what ensures they are all using the exact same definition, calculation, and underlying data. It enforces consistency and builds trust.
For years, the primary consumer of the semantic layer was a human analyst. Today, its most important consumer is the AI itself. This is the key to understanding the AI context gap. Asking a general-purpose LLM a specific business question without a semantic layer is, as one expert aptly described, like “walking up to a complete stranger on the street and blurting out with no introduction: how many students were enrolled in fall 2024?”.5 The LLM is a genius at writing code; it can generate a technically perfect SQL query in milliseconds. But it has absolutely zero “specialized insider knowledge” of your business.5 It doesn’t know that your company defines “active user” based on a login within the last 14 days, or that you exclude fraudulent transactions using a specific set of criteria.
This is precisely why so many Text-to-SQL initiatives, despite their promise, fail to gain the trust of data professionals.6 The risk of “hallucinated answers and actions” is simply too high when the AI lacks the encoded domain knowledge of the business.7 A semantic layer closes this gap. It acts as a detailed “instruction manual” for the LLM, augmenting its general capabilities with your organization’s specific business logic, metrics, KPIs, and data relationships.5 It provides the structured context needed to transform a vague natural language question into a precise, accurate, and reliable answer.7
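To make the “instruction manual” idea concrete, here is a minimal sketch of how governed metric definitions might be placed in front of an LLM before it writes any SQL. Everything in it is an assumption for illustration: the `Metric` class, the `build_prompt` helper, the `net_revenue` metric, and every table and column name are hypothetical, and real semantic layers carry far more (joins, dimensions, access rules).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    description: str  # plain-language business rule
    sql: str          # canonical expression the LLM should reuse


# Hypothetical definitions; table and column names are illustrative only.
SEMANTIC_LAYER = [
    Metric(
        name="active_users",
        description="Users with at least one login in the last 14 days.",
        sql="COUNT(DISTINCT user_id) FILTER (WHERE last_login_at >= CURRENT_DATE - 14)",
    ),
    Metric(
        name="net_revenue",
        description="Sum of transaction amounts, excluding transactions flagged as fraudulent.",
        sql="SUM(amount) FILTER (WHERE NOT is_fraud)",
    ),
]


def build_prompt(question: str, metrics: list[Metric]) -> str:
    """Prepend the governed metric definitions to a natural-language question."""
    lines = ["Answer using only the metric definitions below."]
    for m in metrics:
        lines.append(f"- {m.name}: {m.description} SQL: {m.sql}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


print(build_prompt("How many active users do we have?", SEMANTIC_LAYER))
```

The point of the sketch is the shape, not the details: the model no longer has to guess what “active user” means, because the definition travels with the question.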
This isn’t just an incremental improvement; it’s a foundational requirement for the future that investors like Team8 envision. The next transformative wave of AI will be “autonomous AI agents” capable of open-ended decision-making and complex task execution.9 These agents cannot possibly function, let alone be trusted, if they do not have a deep, codified understanding of the business entities, metrics, and relationships they are meant to operate on. The semantic layer is the essential infrastructure for this AI-driven future.
The Old Way is a Dead End: Learning from the Failure of Data Catalogs
At this point, many seasoned data leaders will say, “We’ve tried this before.” They are thinking of the massive, multi-year data cataloging and governance projects that promised a single source of truth but delivered a mountain of spreadsheets and a sea of frustration. They are right to be skeptical. The old way is a dead end.
The fundamental flaw in traditional approaches to building semantic layers and data catalogs is their complete reliance on a manual, human-in-the-loop process. As I’ve written before, these projects fail because they depend on “Data Stewards (humans) to continuously catalog the data”.6 This is a Sisyphean task. It involves endless interviews and workshops to extract institutional knowledge from the minds of busy analysts, who then manually document definitions in a wiki or a dedicated cataloging tool. The process is painfully slow, prohibitively expensive, and fundamentally unscalable.
Worse, the output is a static snapshot, a museum of your data at a single point in time. It becomes obsolete the moment it’s published. As new dashboards are built, tables are modified, and business logic evolves, the catalog begins to drift from the ground truth. Trust erodes, adoption plummets, and the project collapses under its own weight. The chasm between this old, manual approach and the new, automated paradigm is so vast that it’s worth laying out side-by-side.
| Feature | The Manual Curation Approach (The Museum) | The Automated, Living Model (The Organism) |
|---|---|---|
| Process | Manual interviews, spreadsheets, and human tagging. Relies on institutional memory. | AI-driven analysis of system metadata from BI tools, query logs, and data warehouses. |
| Speed | Months or years to create an initial, partial version. | Days or weeks to generate a comprehensive, initial model. |
| Maintenance | A constant, Sisyphean task of manual updates. Always out of date. | Continuously and automatically updates as the underlying systems change. |
| Scalability | A human bottleneck. The process actively slows down as the data stack grows. | Scales dynamically and effortlessly with the data stack. |
| Trust | High initially, but erodes over time as it drifts from the ground truth. | Low initially (requires validation), but grows into the single source of truth as it continuously reflects reality. |
| Primary Goal | To create a static document for humans to read. | To create a dynamic, machine-readable model for both humans and AI to use. |
This table illustrates why incremental improvements to the old model are futile. We don’t need a better way to manually curate a data museum; we need to build a living, self-sustaining data organism.
The Breakthrough at Solid: Automating Trust from the Top Down
At Solid, we believe the only way to solve this problem is to invert it. Instead of asking overworked humans to document a complex system, we built a system that documents itself. Our core innovation is an AI-powered Metadata Activation Engine designed to automatically generate and maintain a living semantic layer.2
The key to our approach is that we work “top-down.” We don’t start from the bottom, in the raw, context-poor tables of a data warehouse like Snowflake or Databricks. That is where traditional approaches get lost. Instead, we start where the business context is richest: in the business-facing layers of the data stack.2 We connect to your BI tools (like Tableau, Power BI, or Looker), your query logs, and even unstructured knowledge sources like Jira tickets. This is where the business logic already lives, embedded in the names of dashboards, the calculations of metrics, and the questions stakeholders are asking.
Our engine analyzes the metadata from these high-context sources to understand how data is actually used by the business. It automatically identifies business terms, entities, and KPIs. Then, it traces their lineage and maps out their interconnections, dimensions, and attributes all the way back to the physical data assets in the warehouse.2 The result is not a static document, but a living, breathing, visual map of your entire data stack, organized by business topics. It is a semantic model that continuously updates itself, always reflecting the ground truth of your organization’s data universe as it evolves.
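As a rough illustration of what “top-down” lineage means, the sketch below starts from dashboards, walks through the metrics they display, and ends at the warehouse tables those metrics read. This is not Solid’s engine or its data model. The two input dictionaries, the function names, and every dashboard, metric, and table name are hypothetical stand-ins for metadata that would really come from BI tools and query logs.

```python
from collections import defaultdict

# Hypothetical BI metadata: the metrics each dashboard shows,
# and the warehouse tables each metric's query reads.
DASHBOARD_METRICS = {
    "Revenue Overview": ["net_revenue", "active_users"],
    "Retention Weekly": ["customer_churn", "active_users"],
}
METRIC_TABLES = {
    "net_revenue": ["finance.transactions"],
    "active_users": ["product.logins"],
    "customer_churn": ["crm.subscriptions", "product.logins"],
}


def trace_lineage(dashboard_metrics, metric_tables):
    """Top-down: map each dashboard to the physical tables it depends on."""
    lineage = {}
    for dashboard, metrics in dashboard_metrics.items():
        tables = set()
        for metric in metrics:
            tables.update(metric_tables.get(metric, []))
        lineage[dashboard] = sorted(tables)
    return lineage


def tables_to_dashboards(dashboard_metrics, metric_tables):
    """Inverse view: for each table, the dashboards affected if it changes."""
    impact = defaultdict(set)
    for dashboard, metrics in dashboard_metrics.items():
        for metric in metrics:
            for table in metric_tables.get(metric, []):
                impact[table].add(dashboard)
    return {table: sorted(names) for table, names in impact.items()}


print(trace_lineage(DASHBOARD_METRICS, METRIC_TABLES))
print(tables_to_dashboards(DASHBOARD_METRICS, METRIC_TABLES))
```

Starting from the dashboards means every table in the result already has a business name attached to it, which is the context a bottom-up crawl of raw tables lacks.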
This automated foundation is what allows Solid to deliver on the promise of analyst productivity. By providing a trusted map and automatically resurfacing relevant, high-quality past work, we eliminate the redundant, soul-crushing data discovery that consumes up to 90% of an analyst’s time.2 We empower them to stop being data janitors and start being the strategic partners the business needs them to be, focusing on “strategic analyses that drive business impact”.10
An Invitation: See the Blueprint of Your Data Universe, For Free
We can talk about this new paradigm, but we believe in showing, not just telling. The foundational output of Solid’s engine, the very first thing it produces upon connecting to your systems, is a complete, high-accuracy documentation and semantic model of your data assets.
For a limited time, we are inviting a select group of organizations to experience this firsthand. We are offering a free, full documentation of your data warehouse and BI assets.6 There is no cost and no obligation.
This is not a typical software trial. This is about giving you the architectural blueprint of your own data universe. With this map, which is generated in days, you can immediately:
- Gain a complete, AI-generated documentation of your key data assets in systems like Snowflake or Databricks and their corresponding BI artifacts.
- See a visual map of your data stack, clustered by business topics, highlighting the hidden connections and critical dependencies between your assets.2
- Identify unused or underutilized data assets, creating immediate opportunities for cost optimization and improving your FinOps efficiency.12
- Spot and resolve inconsistencies in how core metrics are defined and used across different teams and tools.
- Receive an output that can be used by both humans and machines—a human-readable guide for your teams and a machine-readable model to serve as the foundation for your AI initiatives.11
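Two of the checks in this list, finding unused assets and spotting conflicting metric definitions, can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not a description of how Solid performs them: the input shapes, the function names, and the sample data are hypothetical, and comparing SQL by lowercasing and collapsing whitespace is a deliberately crude stand-in for real query analysis.

```python
from collections import defaultdict


def find_unused_tables(all_tables, query_log):
    """Return catalog tables that no logged query referenced."""
    used = {table for entry in query_log for table in entry["tables"]}
    return sorted(set(all_tables) - used)


def find_conflicting_metrics(definitions):
    """Return metric names that are defined with more than one distinct expression."""
    by_name = defaultdict(set)
    for d in definitions:
        # Crude normalization (assumption): ignore case and extra whitespace.
        normalized = " ".join(d["sql"].lower().split())
        by_name[d["metric"]].add(normalized)
    return {name: sorted(exprs) for name, exprs in by_name.items() if len(exprs) > 1}


# Hypothetical sample data.
tables = ["crm.subscriptions", "product.logins", "staging.old_export"]
log = [
    {"tables": ["product.logins"]},
    {"tables": ["crm.subscriptions", "product.logins"]},
]
definitions = [
    {"metric": "customer_churn", "source": "Tableau", "sql": "COUNT(*)  WHERE cancelled"},
    {"metric": "customer_churn", "source": "Looker", "sql": "count(*) where cancelled"},
    {"metric": "customer_churn", "source": "Finance", "sql": "count(*) where cancelled or paused"},
]

print(find_unused_tables(tables, log))
print(find_conflicting_metrics(definitions))
```

In this sample, the Tableau and Looker definitions collapse to the same expression, while the Finance version counts paused accounts as well, which is exactly the kind of quiet disagreement worth surfacing.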
The journey to a truly data-driven, AI-enabled enterprise begins with a single, accurate map. Let us draw yours.
Apply here to get your full, free documentation for humans and AI.
Beyond Efficiency: Building the Foundational Layer for the Next Era of Enterprise AI
This initiative is about more than just documentation. It’s about taking the first, critical step toward building the trusted, foundational data layer that the entire future of enterprise AI depends on. This aligns perfectly with the broader industry push toward “AI Maturity,” which requires moving beyond experiments and establishing the robust infrastructure needed for widespread, reliable deployment.9
Our goal at Solid is to help organizations move beyond the current “unsystematic and inconsistent” state of analytics and build a future where insights are fast, reliable, and scalable.1 This is how we transform analysts from overwhelmed report builders into proactive strategic partners. This is how we finally deliver on the decade-old promise of the modern data stack.
The AI revolution is here. But it will be built on a foundation of trust, context, and consistency—or it will not be built at all. The era of manual curation is over. The era of the living, self-documenting data stack has begun. We invite you to build it with us.
Take the first step. See the blueprint of your data universe. Sign up today.