Snowflake is ready to deploy powerful language models for complex data work. Today, the company announced the launch of Cortex Analyst, an all-new agentic AI system for self-service analytics, in public preview.
First announced at the company's Data Cloud Summit in June, Cortex Analyst is a fully managed service that gives businesses a conversational interface for communicating with their data. All users have to do is ask business questions in plain English, and the agentic AI system takes care of the rest, from converting the prompts to SQL and querying the data to running checks and delivering the required answers.
Baris Gultekin, Snowflake’s head of AI, tells VentureBeat that the offering uses multiple large language model (LLM) agents working in tandem to deliver insights with about 90% accuracy. He says that’s significantly better than existing LLM-based text-to-SQL offerings, including Databricks’, and can easily accelerate analytics workflows, giving business users instant access to the information they need to make critical decisions.
Simplify analysis with Cortex Analyst
Even as businesses focus on AI-driven generation and prediction, data analytics continues to play a transformative role in business success. Organizations are extracting valuable insights from structured historical data—organized in tabular form—to make decisions in areas such as marketing and sales.
The problem, however, is that the entire analytics ecosystem today is largely based on business intelligence (BI) dashboards that use charts, graphs, and maps to visualize data and provide insights. This approach works well, but can also be quite inflexible at times, with users struggling to drill down into specific metrics and relying on often-overworked analysts to provide additional insights.
“When you have a dashboard and you see a problem, you immediately ask three different questions to understand what’s going on. When you ask those questions, an analyst comes in, does the analysis, and provides the answer within a week or so. But then you may have other follow-up questions, which can leave the analysis loop open and slow down the decision-making process,” Gultekin said.
To fill this gap, many have begun to explore the potential of large language models, which have proven very effective at extracting insights from unstructured data (think long PDFs). The idea was to pass the schema of the raw structured data to the models so they could power a text-to-SQL conversational experience, allowing users to instantly communicate with their data and ask relevant business questions.
However, as these LLM-based offerings emerged, Snowflake found a major problem: low accuracy. According to the company’s internal tests representative of real-world use cases, using leading models like GPT-4o directly produced analytics insights with around 51% accuracy, while dedicated text-to-SQL offerings, including Databricks’ Genie, reached about 79%.
“When you’re asking business questions, accuracy is the most important thing. Fifty-one percent accuracy is not acceptable. We were able to nearly double that to about 90 percent by leveraging a series of large language models working closely together (for Cortex Analyst),” Gultekin noted.
When integrated into an enterprise application, Cortex Analyst receives business queries in natural language and passes them through LLM agents at different stages to provide accurate, hallucination-free answers based on enterprise data in the Snowflake Data Cloud. These agents handle everything from analyzing the intent of the question and determining whether it can be answered, to generating and executing the SQL query and verifying the correctness of the answer before returning it to the user.
“We’ve built systems that understand whether the question is something that can be answered or whether it’s ambiguous and can’t be answered with the accessible data. If the question is ambiguous, we ask the user to rephrase it and provide suggestions. Only once we know that the question can be answered do we pass it to a series of LLM agents that generate SQL, determine whether that SQL is correct, fix incorrect SQL, and then execute that SQL to provide the answer,” Gultekin says.
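To make that flow concrete, here is a minimal sketch of what such a multi-stage agent pipeline could look like. Every function below is a placeholder standing in for an LLM agent call; the names and stub logic are illustrative assumptions, not Snowflake’s actual implementation.

```python
from dataclasses import dataclass

# Illustrative sketch only: each function stands in for an LLM agent call.
# Nothing here reflects Snowflake's actual implementation.

@dataclass
class Verdict:
    is_answerable: bool
    feedback: str = ""

def check_answerability(question: str, semantic_model: str) -> Verdict:
    # Placeholder for the agent that decides whether the question can be
    # answered from the tables described in the semantic model.
    return Verdict(is_answerable="revenue" in question.lower())

def generate_sql(question: str, semantic_model: str) -> str:
    # Placeholder for the SQL-generation agent.
    return "SELECT SUM(rev_2) AS revenue FROM sales;"

def verify_and_repair_sql(sql: str, semantic_model: str) -> str:
    # Placeholder for the agent that checks the generated SQL and
    # repairs it when the checker flags an error.
    return sql

def answer_business_question(question: str, semantic_model: str) -> str:
    verdict = check_answerability(question, semantic_model)
    if not verdict.is_answerable:
        return "Question is ambiguous; please rephrase (suggestions would go here)."
    sql = generate_sql(question, semantic_model)
    sql = verify_and_repair_sql(sql, semantic_model)
    # In a real deployment the validated SQL would be executed against the
    # warehouse; here we simply return it.
    return sql

print(answer_business_question("What was total revenue last quarter?", "semantic_model.yaml"))
```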
The AI chief didn’t share the exact specifics of the models powering Cortex Analyst, but Snowflake confirmed that it uses a combination of its own Arctic model as well as models from Mistral and Meta.
How exactly does it work?
To ensure that the LLM agents behind Cortex Analyst understand the full schema of a user’s data structure and provide accurate and contextual answers, the company asks customers to provide semantic descriptions of their data assets during the setup phase. This solves a major problem associated with raw schemas and allows models to capture the intent of the question, including the user’s vocabulary and specific jargon.
“In real-world applications, you have tens of thousands of tables and hundreds of thousands of columns with strange names. For example, ‘Rev 1’ and ‘Rev 2’ might be successive iterations of a column that actually means revenue. Our customers can specify these metrics and their meaning in semantic descriptions, which allows the system to use them to provide answers,” Gultekin added.
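As an illustration of the kind of mapping such a semantic description provides, the hypothetical structure below ties cryptic column names like the “Rev 1 / Rev 2” example to business terms. The field names are assumptions for illustration and do not reproduce Snowflake’s actual semantic model format.

```python
# Hypothetical illustration of a semantic description; field names are
# assumptions, not Snowflake's actual (YAML-based) semantic model schema.
semantic_description = {
    "tables": [
        {
            "name": "SALES",
            "description": "Quarterly sales facts",
            "columns": [
                {
                    "name": "REV_1",
                    "description": "Deprecated revenue calculation (first iteration)",
                    "synonyms": [],
                },
                {
                    "name": "REV_2",
                    "description": "Current revenue metric used in reporting",
                    "synonyms": ["revenue", "total revenue"],
                },
            ],
        }
    ]
}
```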
The company now offers access to Cortex Analyst as a REST API that can be integrated into any application, giving developers the ability to customize how and where their business users leverage the service and interact with the results. It is also possible to use Streamlit to build purpose-built applications using Cortex Analyst as the core engine.
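A rough sketch of calling the service from a custom application is shown below. The endpoint path, payload fields, and authentication scheme are assumptions made for illustration; the exact contract should be taken from Snowflake’s API documentation.

```python
import requests

# Sketch of calling Cortex Analyst over REST from a custom app.
# Endpoint path, payload fields, and auth header are assumptions for
# illustration only; consult Snowflake's docs for the exact contract.

ACCOUNT_URL = "https://<account>.snowflakecomputing.com"  # placeholder
TOKEN = "<oauth-or-jwt-token>"                            # placeholder credential

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What was total revenue by region last quarter?"}
            ],
        }
    ],
    # Points the service at the semantic description set up earlier (assumed field name).
    "semantic_model_file": "@my_stage/revenue_semantic_model.yaml",
}

resp = requests.post(
    f"{ACCOUNT_URL}/api/v2/cortex/analyst/message",  # assumed endpoint path
    json=payload,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # response would carry the generated SQL and the answer
```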
During the private preview, around 40-50 companies, including pharmaceutical giant Bayer, deployed Cortex Analyst to communicate with their data and accelerate analytical workflows. The public preview is expected to increase that number, especially as companies continue to focus on adopting LLMs without breaking the bank. The service will give companies the power of LLMs for analytics, without all the implementation hassle and overhead.
Snowflake also confirmed that the service will gain more features in the coming days, including support for multi-turn conversations for a more interactive experience, as well as for more complex tables and schemas.