The VentureBeat AI Impact Tour stopped in San Francisco, and this time the conversations dove into real-world applications of generative AI and what it actually takes to scale a generative AI initiative. Matt Marshall, CEO of VentureBeat, rose above the hype to get serious about deploying AI with keynote speaker Ed Anuff, Chief Product Officer of DataStax, along with Nicole Kaufman, Chief Transformation Officer at Genesis Health, and Tisson Matthew, CEO of Skypoint.
The conversation is a crucial one as companies move out of the experimental ideation phase that characterized the past year of enthusiasm for generative AI. They are going beyond testing the possibilities of ChatGPT and natural language interfaces in general, and beginning to answer the important questions: how do we harness this power, integrate it with our own business-critical data, and then put it into production?
“What we are seeing is the emergence of an AI maturity model, with companies moving away from one-off projects that are primarily focused on achieving a few quick wins and educating the business on its potential,” said Anuff. “You see critical AI business initiatives where there’s usually a business champion asking: how do we deploy AI in a critical, high-visibility, high-impact way? It usually takes a little longer, but the results are there, and that’s where the transformative aspects will happen.”
Gen AI is useful for a wide range of use cases, from the back office to the front office, public websites and mobile. Organizations may still use terms like “chatbots” or “conversational interfaces,” but ultimately what they’re building is a knowledge application – a way to retrieve knowledge interactively, in a form suited to the situation. The question is whether to build it in-house or use one of the growing number of commercially available products.
Pre-production considerations
For applications like customer support or financial analysis, many companies want to leverage gen AI to build an application that can generate results from internal data or reports, Anuff said.
“For these types of applications, depending on how much data you have and what the custom interface is, you can use something that’s commercially available,” he said. “There are solutions from Amazon and others that simply give you a way to upload a bunch of documents and have a chatbot respond to them. And it’s a very good way to get an initial experience in a very short time.”
But as you move beyond back-office, small-team apps toward use cases critical to your core business – especially outward-facing ones – out-of-the-box solutions no longer work, particularly in data-intensive use cases built on custom data stores. Anuff highlighted healthcare applications that connect the gen AI interface to data sources so it can respond in real time as information changes or is updated – patient readings in a hospital, for example. He also pointed to AI agents deployed by financial institutions in the Asia-Pacific region, offering chat-based financial planning drawn directly from financial statements.
“It’s not something you get out of the box,” he said. “This is a custom-built RAG (retrieval-augmented generation) application that leverages your core data assets. If you’re Home Depot or Best Buy, you don’t build your website on Wix. You have thousands of web engineers creating a tailored experience, because it is at the heart of your brand and your core business activities.”
Calculating readiness and cost
As companies move beyond the ideation stage, they begin to run into two main problems, Anuff said.
“The first is relevance, which for a lot of us who deal with data is sort of a new measure and a new metric: how appropriate are these responses?” he explained. “A lot of it comes down to relevance and retrieval issues – inefficiencies, or just retrieving the wrong content. A lot of companies run into this, and in many cases it ends up forcing you to rethink your entire data architecture.”
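To make the relevance problem concrete, here is a minimal sketch of the kind of retrieval check Anuff is describing. Everything in it is hypothetical – the documents, the query, the cutoff – and the toy bag-of-words score stands in for the embedding model a real system would use:

```python
import math
import re
from collections import Counter

def vectorize(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical internal documents and a user query.
docs = [
    "A patient can be discharged after vitals remain stable for 24 hours.",
    "Quarterly revenue grew on strength in the services segment.",
    "The cafeteria is open on weekdays from 7am to 7pm.",
]
query = "When can a patient be discharged?"

q_vec = vectorize(query)
scored = sorted(((cosine(q_vec, vectorize(d)), d) for d in docs), reverse=True)

THRESHOLD = 0.3  # arbitrary cutoff, tuned per application
for score, doc in scored:
    label = "relevant" if score >= THRESHOLD else "LOW RELEVANCE"
    print(f"{score:.2f}  [{label}]  {doc}")
```

When too many retrieved chunks land below the cutoff – or irrelevant ones land above it – that is the signal, per Anuff, that the underlying data architecture needs rethinking.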
Relevance, in turn, largely drives the second element: cost. It’s already expensive to get relevant and clear results; you then need to determine how much it will cost to run in production.
“When we talk to people, it’s a really good way to realistically calibrate how close they are to production,” he explained. “If people are still at the stage where they’re struggling with relevance, we know they’ve moved past the initial architecture elements. At the other end, questions about production costs tend to go hand in hand with readiness. Those are the two big bookends.”
Hallucinations, data and the importance of RAG
The term “hallucination” gets used whenever an answer seems wrong. It’s a good colloquial term, but not every bug or irrelevant response from an AI system is, in fact, a hallucination – some are simply errors in the training set. Hallucinations occur when an LLM uses its training data as a launching pad to start extrapolating and speculating, and the answers begin to get fuzzy. But there are ways around this, Anuff said, and a big part of the answer is RAG.
RAG is a natural language processing (NLP) technique that merges retrieval-based AI with generative AI. RAG can process and consolidate data from an internal knowledge base to return contextual, natural-language answers, rather than simply summarizing documents.
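As a rough illustration of that loop – retrieve, augment the prompt, generate – here is a minimal sketch assuming an OpenAI-compatible client; the model name, documents and stand-in retriever are placeholders, not a prescribed implementation:

```python
from openai import OpenAI  # any OpenAI-compatible client works the same way

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Placeholder knowledge base; a production system would use a vector store
# and rank chunks by embedding similarity instead of returning them verbatim.
knowledge_base = [
    "Orders placed before 2pm ship the same business day.",
    "Returns are accepted within 30 days with a receipt.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Stand-in retriever: returns the top-k chunks for the query."""
    return knowledge_base[:k]

def answer(query: str) -> str:
    """Augment the prompt with retrieved context, then generate."""
    context = "\n".join(retrieve(query))
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": f"Use the following context to answer the question.\n\nContext:\n{context}"},
            {"role": "user", "content": query},
        ],
    )
    return response.choices[0].message.content

print(answer("What is the return policy?"))
```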
“A (large language) model is good at two things. First, there’s an incredible language faculty, in terms of understanding what you said and what you meant,” Anuff said. “The second element is that it is also a knowledge base. How much of its own knowledge it uses is something the programmer decides. You tell the model: limit the response you generate, as much as possible, to this information I have provided. You’re doing something called grounding. And that means the chances of hallucination are greatly reduced, because the model doesn’t go off on a tangent. Essentially, you’re using the model’s language faculty to reorganize content it has already been given. This is why RAG and RAG variants have become such a powerful tool for reducing hallucinations.”
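In practice, the grounding Anuff describes is a constraint written directly into the prompt. A hypothetical template – the wording, delimiters and refusal clause are one common convention, not a fixed standard – might look like this:

```python
GROUNDED_SYSTEM_PROMPT = """\
You are an assistant. Follow these rules strictly:
1. Base your answer ONLY on the text between <context> and </context>.
2. If the context does not contain the answer, reply: "I don't have that information."
3. Do not use outside knowledge, even if you believe you know the answer.

<context>
{context}
</context>
"""

def build_grounded_prompt(context: str) -> str:
    """Insert retrieved content into the template; explicit delimiters and a
    refusal clause help keep the model from going off on a tangent."""
    return GROUNDED_SYSTEM_PROMPT.format(context=context)

print(build_grounded_prompt("Returns are accepted within 30 days with a receipt."))
```

The refusal clause matters: a model given an explicit fallback is less likely to fill gaps from its own training data.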
The second, and most important, reason RAG is essential is that it is how you get your business data into the model at inference time – accurately, safely and in real time, he added.
“There are other techniques for getting your data in, but they’re not safe, they’re not real-time and they’re not secure,” he said. “That’s why you’re going to see this model–database coupling for a long time to come, whether we call it RAG or not.”