Jyotishko Biswas, Head of AI – HP Finance
The advent of transformers and large language models (LLMs) has significantly improved the accuracy, relevance, and speed to market of AI applications. As the core technology behind LLMs, transformers enable them to predict and generate the next word (more precisely, the next token) by learning from large datasets containing billions of words. This results in significant improvements in accuracy, relevance, and coherence. However, LLMs still have shortcomings, and this is where retrieval-augmented generation (RAG) becomes essential.
How RAG Fills the Transformer Gaps
Transformers are limited by the data they are trained on. For example, a model trained on web data only up to 2022 cannot answer questions about events that occurred in 2024. Additionally, transformers can generate non-factual answers, called hallucinations, which compromise their reliability.
RAG is a technique in which an LLM is connected to an external, updatable database. It fills the knowledge gaps of transformers by providing domain-specific and up-to-date information to the LLM, allowing it to answer questions about recent events and significantly reducing hallucinations.
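To make the flow concrete, here is a minimal, self-contained sketch of standard RAG in Python. The `embed` and `generate` functions are toy placeholders (a real system would call an embedding model and an LLM); only the retrieve-then-generate structure is the point.

```python
# Minimal sketch of a standard RAG flow (illustrative placeholders, not a vendor API).
from typing import List
import math

def embed(text: str) -> List[float]:
    # Placeholder embedding: bag-of-characters vector. A real system would call an embedding model.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def generate(prompt: str) -> str:
    # Placeholder for a call to the core LLM.
    return f"[LLM answer grounded in]: {prompt}"

corpus = [
    "Argentina won the 2022 FIFA World Cup.",
    "The capital of the United States is Washington, D.C.",
]
question = "Who won the 2022 FIFA World Cup?"
context = "\n".join(retrieve(question, corpus))
print(generate(f"Context:\n{context}\n\nQuestion: {question}"))
```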
Limitations of Standard Retrieval-Augmented Generation
Despite its advantages, standard RAG has its own limitations, described below.
1. It retrieves additional information even when the prompt is simple and retrieval is not necessary, resulting in higher computational and memory costs.
2. There is no relevance check, so the retrieved information may be irrelevant, degrading the quality of the LLM's output.
3. Only a fixed number of top-ranked documents are used, leaving out potentially useful information.
4. Similarity checks between the prompt and the retrieved documents are often insufficient; the usefulness of the retrieved documents matters more.
5. Vector databases struggle to capture complex, multi-relational information, which limits what can be retrieved from them.
6. Leakage of private and sensitive information in LLM results remains a concern.
RAG Advancements That Help Overcome These Limitations
Many technological advancements have been made in the last two to three years to overcome the challenges of standard RAG.
Self-RAG is one such advance. It addresses the question of whether retrieval is needed, the relevance of retrieved documents, and the quality of the LLM's output. It includes a critic LLM that determines whether retrieval is necessary based on the prompt. For simple prompts, such as “What is the capital of the United States?”, retrieval may not be necessary.
The critic LLM also evaluates the relevance of the retrieved documents, retaining only those that are relevant. This ensures that the main LLM works with pertinent information, producing more accurate and consistent results.
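A rough sketch of how such critic checks might look follows. Here `critic_llm` is a stand-in for any instruction-following model, and the YES/NO prompting protocol is an illustrative assumption rather than the exact Self-RAG interface.

```python
# Hypothetical critic checks: one call decides whether retrieval is needed at all,
# another grades each retrieved document for relevance.
def critic_llm(prompt: str) -> str:
    # Placeholder: a real system would call a small instruction-tuned model here.
    return "NO" if "capital of the United States" in prompt else "YES"

def needs_retrieval(question: str) -> bool:
    verdict = critic_llm(
        "Does answering this question require external documents? "
        f"Answer YES or NO.\nQuestion: {question}"
    )
    return verdict.strip().upper().startswith("YES")

def is_relevant(question: str, document: str) -> bool:
    verdict = critic_llm(
        "Is this document relevant to the question? Answer YES or NO.\n"
        f"Question: {question}\nDocument: {document}"
    )
    return verdict.strip().upper().startswith("YES")

print(needs_retrieval("What is the capital of the United States?"))  # False
```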
Additionally, unlike standard RAG, which retrieves information once per prompt, Self-RAG can perform multiple retrievals per prompt, ensuring that more relevant information is provided.
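The multi-round behavior can be sketched as a simple loop: retrieve, filter, and repeat until a critic judges the gathered context sufficient or a round limit is reached. The retriever, relevance filter, and sufficiency check below are all placeholders standing in for LLM calls.

```python
# Illustrative multi-round retrieval loop in the spirit of Self-RAG.
def retrieve_round(question: str, round_idx: int) -> list[str]:
    # Placeholder retriever; a real system would reformulate the query each round.
    return [f"document about '{question}' from round {round_idx}"]

def is_relevant(question: str, document: str) -> bool:
    # Placeholder relevance check (a critic LLM in a real pipeline).
    return question.split()[-1].strip("?.") in document

def is_sufficient(question: str, context: list[str]) -> bool:
    # Placeholder sufficiency check; here we simply stop once two documents are kept.
    return len(context) >= 2

def iterative_retrieve(question: str, max_rounds: int = 3) -> list[str]:
    context: list[str] = []
    for i in range(max_rounds):
        new_docs = retrieve_round(question, i)
        context.extend(d for d in new_docs if is_relevant(question, d))
        if is_sufficient(question, context):
            break
    return context

print(iterative_retrieve("Tell me about Cristiano Ronaldo"))
```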
Another advanced RAG approach is MetRAG, a retrieval-augmented generation framework enhanced by multi-layered thoughts, which addresses several of RAG's main challenges. This method uses an additional LLM to evaluate the usefulness of retrieved documents rather than relying solely on similarity.
Take the question “Tell me about the famous football player Cristiano Ronaldo.” Document D1 states: “Cristiano Ronaldo is a famous football player,” while document D2 states: “Cristiano Ronaldo is a Portuguese professional footballer who plays as a striker and captains Saudi Pro League club Al Nassr and the Portugal national team.”
A similarity check may rank D1 higher, but D2 contains more useful information. This shows that similarity does not always surface the most useful content; therefore, document usefulness is used to judge relevance.
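One way to picture this usefulness-based re-ranking is with an LLM-as-judge score. The judge prompt and 0-10 scale below are illustrative assumptions, not MetRAG's actual interface, and `judge_llm` is a placeholder.

```python
# Illustrative usefulness-based re-ranking: a judge LLM scores how useful each
# document is for answering the question, instead of ranking by similarity alone.
def judge_llm(prompt: str) -> str:
    # Placeholder judge: scores longer, more specific documents higher.
    doc = prompt.split("Document:", 1)[1]
    return str(min(10, len(doc.split()) // 5))

def usefulness(question: str, document: str) -> int:
    score = judge_llm(
        "On a scale of 0-10, how useful is this document for answering the question?\n"
        f"Question: {question}\nDocument: {document}"
    )
    return int(score)

question = "Tell me about the famous football player Cristiano Ronaldo."
d1 = "Cristiano Ronaldo is a famous football player."
d2 = ("Cristiano Ronaldo is a Portuguese professional footballer who plays as a "
      "striker and captains Saudi Pro League club Al Nassr and the Portugal national team.")
docs = sorted([d1, d2], key=lambda d: usefulness(question, d), reverse=True)
print(docs[0])  # D2 comes first despite D1 being more similar to the prompt wording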
Additionally, in MetRAG, another LLM summarizes the retrieved documents, avoiding information loss by ensuring that relevant details from lower-ranked documents are retained. This differs from standard RAG, where only the top-ranked retrieved documents are kept and the rest are discarded. Summarizing all retrieved documents yields a more accurate and comprehensive final result.
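A sketch of that summarization step is below. The `summarizer_llm` function is a placeholder for a real summarization call; the prompt wording is an assumption.

```python
# Sketch of the summarization step: condense everything retrieved so details from
# lower-ranked documents are not lost before the context reaches the core LLM.
def summarizer_llm(prompt: str) -> str:
    # Placeholder: a real call would return an abstractive summary.
    return "Summary: " + " ".join(prompt.splitlines()[-2:])

def summarize_context(question: str, documents: list[str]) -> str:
    joined = "\n".join(f"- {d}" for d in documents)
    return summarizer_llm(
        "Summarize the following documents, keeping every detail relevant to the question.\n"
        f"Question: {question}\nDocuments:\n{joined}"
    )

docs = [
    "Cristiano Ronaldo is a famous football player.",
    "He captains Al Nassr and the Portugal national team.",
]
print(summarize_context("Tell me about Cristiano Ronaldo", docs))
```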
Another advanced RAG approach uses knowledge graphs instead of vector databases to store external information. While vector databases struggle to handle complex, multi-relational data, knowledge graphs excel by storing information as entities and their relationships. For example, in the sentence “Argentina won the 2022 FIFA World Cup,” “Argentina” and “2022 FIFA World Cup” are entities and “won” is the relationship.
Storing external information in a knowledge graph instead of a vector database allows RAG to retrieve more relevant information. This leads the core LLM to produce more accurate, relevant, and consistent results.
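A toy in-memory version of this idea is shown below: facts are stored as (subject, relation, object) triples, and retrieval walks the graph by entity rather than matching vectors. This is a simplified sketch, not a production graph database.

```python
# Minimal knowledge-graph store for RAG: facts as triples, retrieval by entity.
from collections import defaultdict

class KnowledgeGraph:
    def __init__(self) -> None:
        self.by_entity = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        triple = (subject, relation, obj)
        self.by_entity[subject].append(triple)
        self.by_entity[obj].append(triple)

    def facts_about(self, entity: str) -> list[str]:
        # Return every fact mentioning the entity, ready to be added to the prompt.
        return [f"{s} {r} {o}" for s, r, o in self.by_entity.get(entity, [])]

kg = KnowledgeGraph()
kg.add("Argentina", "won", "2022 FIFA World Cup")
kg.add("2022 FIFA World Cup", "was hosted by", "Qatar")
print(kg.facts_about("2022 FIFA World Cup"))
```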
Protecting Sensitive Data
One of the main challenges with LLMs is the potential leakage of sensitive information or its use in model training. To address this, RAG can be enhanced with checks that identify whether retrieved documents contain sensitive or private information.
Another LLM, specially trained to recognize sensitive and personal data, can be employed to examine the retrieved information. If a document contains sensitive information, it is either excluded or anonymized/pseudonymized.
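A hedged sketch of such a privacy check follows. A simple regex detector stands in for the PII-trained LLM described above; the patterns and redaction tokens are illustrative assumptions.

```python
# Privacy check on retrieved documents: flag sensitive content, then drop or
# pseudonymize it before it reaches the core LLM.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def contains_pii(document: str) -> bool:
    return bool(EMAIL.search(document) or PHONE.search(document))

def pseudonymize(document: str) -> str:
    document = EMAIL.sub("[EMAIL]", document)
    return PHONE.sub("[PHONE]", document)

def sanitize(documents: list[str], drop: bool = False) -> list[str]:
    cleaned = []
    for doc in documents:
        if contains_pii(doc):
            if drop:
                continue  # exclude the document entirely
            doc = pseudonymize(doc)  # or anonymize it instead
        cleaned.append(doc)
    return cleaned

docs = ["Patient reachable at jane.doe@example.com or 555-123-4567."]
print(sanitize(docs))  # ['Patient reachable at [EMAIL] or [PHONE].']
```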
Preventing sensitive information leaks is essential in the healthcare industry, which handles extremely sensitive patient data. As Suresh Martha notes in this article: “The pharmaceutical industry is undergoing a significant transformation, driven by the integration of artificial intelligence (AI) and newer generative AI (GenAI) into aspects of drug discovery, clinical trials and patient care. While these advances promise substantial benefits, from accelerating drug development to delivering more personalized medical treatments, the GenAI revolution raises ethical considerations around data protection, privacy and responsible use of the technology.”
Limitations of Advanced RAG Systems
Although advanced RAG systems overcome many of the challenges of standard RAG, they come with their own set of limitations.
1. The additional processing and increased latency introduced by the retrieval step may limit the use of RAG in low-latency applications.
2. The long context that results from adding retrieved documents to the prompt can restrict the use of LLMs with shorter context windows.
3. Although advanced RAG systems such as Self-RAG and knowledge graph-based RAG reduce the retrieval of irrelevant documents, further improvements are still needed.
Conclusion
Recent technological advances in RAG have improved various aspects of RAG-based applications while reducing computational and memory costs. Code to implement many of these techniques is available on GitHub, Hugging Face, and other repositories. However, despite these advances, gaps still exist and global research is underway to fill them.