- Fierce Network reached out to Gartner analyst and AI expert Bern Elliot to answer some key questions we had about the origins of RAG.
- RAG is a core technique popularized by Meta AI researchers to improve the quality and relevance of AI-generated content.
- Elliot said the story of RAG begins with transformer algorithms.
Retrieval augmented generation (RAG) is a core technique popularized by Meta AI researchers to improve the quality and relevance of AI-generated content by enabling large language models (LLMs) to access external knowledge sources beyond the model’s training data.
In artificial intelligence (AI), grounding is the ability to connect model results to external sources of information. By providing a model with specific data sources, grounding lets it view relevant information in the same context as its user and deliver more accurate results.
Where does RAG come from?
Fierce Network reached out to Gartner analyst and AI expert Bern Elliot to answer some key questions we had about the origins of RAG.
Despite the surprisingly long history of large language models, as Elliot understands it, the story begins with transformer algorithms.
“There was this thing, transformer algorithms, that you could use to create what were called base models. This technology – the model – had been around for a while, but it just wasn’t very useful because it was too bulky. But as you know, computing power has become much greater recently,” he said.
“The ability to store things in the cloud and increased processing speeds made these algorithms, which were previously too large and cumbersome, practical. And the industry quickly discovered that if you make these kinds of models and make them really big, they come with all kinds of interesting and useful properties,” he explained, before pointing out that many topics regarding RAG were brought to the attention of the industry long before they entered the mainstream.
“Some of these topics have been around for 40 years. So, probabilistic reasoning has been around for 40 years, but some of it is more recent. Computational logic, optimization techniques, and natural language have been around for a long, long time. First it was rule-based, then it became statistical, and now they’re using large language models to help with natural language processing, like they do with chatbots,” Elliot summarized.
How does RAG work?
RAG architectures compare user-submitted queries against a knowledge library. An embedding language model converts the query into a numerical representation, which is matched against document vectors stored in a vector database. Relevant information retrieved from the user-selected data repository is appended to the original prompt, and this augmented prompt is sent to the AI model, which now has the context, beyond its training data, required to generate an actionable answer.
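To make the pipeline concrete, here is a minimal, self-contained sketch of the retrieve-then-augment flow. Everything in it is an illustrative assumption rather than any real product’s API: the bag-of-words embed function, the in-memory DOCUMENTS list, and the prompt template stand in for a learned embedding model, a real vector database, and an actual LLM call.

```python
import math
from collections import Counter

# Toy in-memory knowledge library (stand-in for a real data repository).
DOCUMENTS = [
    "RAG retrieves external documents to ground LLM answers.",
    "Fine-tuning adjusts model weights on task-specific data.",
    "Vector databases index embeddings for similarity search.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real systems use a learned model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank the library by similarity to the query and return the top k."""
    q = embed(query)
    return sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_augmented_prompt(query: str) -> str:
    """Prepend the retrieved context so the model answers with grounding."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

# The augmented prompt is what would actually be sent to the LLM.
print(build_augmented_prompt("How does RAG ground an LLM's answer?"))
```

Note that only the retrieve step depends on the knowledge library: swapping in a different document set changes the model’s grounding without touching its weights.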
The difference between RAG and fine-tuning
Most companies do not train their own AI models, preferring to adapt pre-trained models through approaches such as RAG and fine-tuning. Organizations unfamiliar with these terms risk getting lost, so it is essential to understand how the two techniques differ. Fine-tuning is the process of methodically adjusting a model’s weights on task-specific training data until the AI excels at a certain activity. RAG, on the other hand, retrieves and aggregates information from multiple sources to contextualize the user’s request and produce more relevant results.
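A toy contrast may make the distinction clearer. ToyModel, fine_tune, and rag_answer below are purely illustrative stand-ins, not a real training or inference API; a list of “facts” plays the role of model weights.

```python
class ToyModel:
    """Stand-in for an LLM; baked_in_facts plays the role of weights."""
    def __init__(self):
        self.baked_in_facts: list[str] = []

    def generate(self, prompt: str) -> str:
        return "Answering from: " + " | ".join(self.baked_in_facts + [prompt])

def fine_tune(model: ToyModel, facts: list[str]) -> ToyModel:
    """Fine-tuning: knowledge is written into the model ahead of time."""
    model.baked_in_facts.extend(facts)  # analogous to weight updates
    return model

def rag_answer(model: ToyModel, query: str, store: list[str]) -> str:
    """RAG: the model stays frozen; context is fetched at query time."""
    hits = [d for d in store
            if set(d.lower().split()) & set(query.lower().split())]
    return model.generate(" ".join(hits) + " " + query)

facts = ["RAG stands for retrieval augmented generation."]
print(fine_tune(ToyModel(), facts).generate("Define RAG."))
print(rag_answer(ToyModel(), "What does RAG stand for?", facts))
```

Updating the fine-tuned model means another training pass; updating the RAG model only means changing the store it retrieves from.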
Why is RAG important?
LLMs are trained on enormous amounts of data and countless parameters to answer a wide variety of human questions. However, LLMs can be inconsistent: sometimes they get the answer right, but every once in a while a confused LLM can stitch together plausible-sounding nonsense from its training data, essentially misleading its user.
This is a rather typical criticism of AI, especially among companies that rely on it for authoritative, citation-backed answers. The trouble with LLMs is that while they capture how words relate statistically, they have no idea what those words mean.
RAG allows you to optimize the output of an LLM with targeted information that can be more current than the LLM’s training data, or even specific to a company or sector. Additionally, because RAG models are linked to credible information sources, users can independently verify any claims made by the AI.
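As a sketch of that verifiability point: if each retrieved snippet carries its provenance, the answer can surface citations the user can check. The document set, placeholder URLs, and word-overlap matching below are illustrative assumptions, not any specific product’s behavior.

```python
# Each snippet is stored alongside its source (URLs are placeholders).
KNOWLEDGE_BASE = [
    {"text": "RAG was popularized by Meta AI researchers.",
     "source": "https://example.com/rag-overview"},
    {"text": "Grounding connects model output to external data.",
     "source": "https://example.com/grounding"},
]

def retrieve_with_sources(query: str) -> list[dict]:
    """Return snippets that share words with the query, with provenance."""
    words = set(query.lower().split())
    return [doc for doc in KNOWLEDGE_BASE
            if words & set(doc["text"].lower().split())]

def answer_with_citations(query: str) -> str:
    hits = retrieve_with_sources(query)
    context = " ".join(doc["text"] for doc in hits)
    sources = ", ".join(doc["source"] for doc in hits)
    # A real system would hand `context` to an LLM; the point here is
    # that every claim in the answer traces back to a checkable source.
    return f"{context}\n(Sources: {sources})"

print(answer_with_citations("Who popularized RAG?"))
```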
One of the main advantages of RAG is that it is quite simple to deploy, making it faster and less expensive than retraining a model from scratch with new datasets. This makes RAG ideal for switching between different sources of information as needed.
Some of the major AI offerings that include RAG are Microsoft Azure Machine Learning, OpenAI’s ChatGPT retrieval plugin, Hugging Face Transformers, IBM watsonx.ai and Meta AI.