What is Retrieval Augmented Generation (RAG)?


Large Language Models (LLMs) like GPT-4 or Mistral 7B are extraordinary in many ways, yet they come with a set of challenges.

For now, let's focus on one specific limitation: the timeliness of their data. Since these models are trained on data only up to a particular cut-off date, they aren't well-suited for answering questions that involve real-time or organization-specific information.

Imagine you're a developer architecting an LLM-enabled app for Amazon. You aim to support shoppers as they comb through Amazon for the latest deals on jackets. Naturally, you want to furnish them with the most current offers available. After all, nobody wants to rely on outdated information, and the same holds true for the answers your LLM generates.

This is where Retrieval-Augmented Generation, commonly known as RAG, significantly improves the capabilities of LLMs.

Think of RAG as a resourceful friend in an exam setting or during a speech who (figuratively speaking, of course) swiftly passes you the most relevant "cue card" out of a ton of information, so you know what to write or say next. Perhaps that isn't a perfect analogy, but you get the point. 😄

With RAG, efficient retrieval of the most relevant data for your use case ensures the text generated is both current and substantiated.

RAG, as its name indicates, operates through a three-fold process:

  • Retrieval: It sources pertinent information.

  • Augmentation: This information is then added to the model's initial input.

  • Generation: Finally, the LLM utilizes this augmented input to create an informed output.

Simply put, RAG empowers LLMs to include real-time, reliable data from external databases in their generated text.
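
To make these three steps concrete, here is a minimal sketch in Python. It is only a toy: the document store, the word-overlap retriever, and the stubbed `generate()` function are illustrative assumptions, not part of any particular framework.

```python
# A minimal, framework-free sketch of the three RAG steps.
# The toy document store, the word-overlap retriever, and the
# stubbed generate() call are illustrative assumptions only.

DOCUMENTS = [
    "Jacket deal: 40% off the Trailblazer rain jacket, ends Friday.",
    "Coupon: free shipping on orders over $50.",
    "Jacket deal: buy one fleece jacket, get a beanie free.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Retrieval: rank documents by naive word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def augment(query: str, snippets: list[str]) -> str:
    """Augmentation: prepend the retrieved snippets to the user's question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt: str) -> str:
    """Generation: hand the augmented prompt to an LLM.
    Stubbed here; in practice this would call a hosted or local model."""
    return f"[LLM response grounded in the prompt below]\n{prompt}"

query = "latest jacket deals"
print(generate(augment(query, retrieve(query, DOCUMENTS))))
```

In a production app, `retrieve` would typically run a similarity search over vector embeddings (covered in the bonus modules on similarity search later in this module), and `generate` would call an actual model such as GPT-4.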

For a better explanation, check out this video by Marina Danilevsky, Senior Research Staff Member at IBM Research. She walks through two important challenges with LLMs that are resolved with the help of Retrieval Augmented Generation.

(Credits: IBM Technology)

Knowing about RAG is essential, particularly if you're considering implementing an Enterprise LLM Architecture for your organization or a personal project. The efficacy of that architecture depends mainly on the strength and utility of RAG.

By now, you should have a basic understanding of RAG and its importance. As we progress through this module, we aim for you to gain a comprehensive grasp of how in-context learning and RAG collectively contribute to making LLMs more effective, current, and enterprise-ready.
