
Transforming Vectors into LLM Responses



Alright, you've got a handle on word vectors and the role of context. Before we delve into the complete pipelines that make Large Language Models (LLMs) work, it's crucial to understand the roles of tokenizers and detokenizers.

Think of a tokenizer as a "sentence chopper." It breaks a sentence into smaller pieces (words, subwords, characters, or symbols) that the model can process; the granularity generally depends on the model's type and size. A detokenizer does the reverse: it takes the LLM's output tokens and stitches them back into sentences we can read. This process is a foundational step that lets LLMs translate human queries into actionable tasks.
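To get a hands-on feel for this round trip, here is a minimal sketch using the open-source tiktoken library (the tokenizer used by several OpenAI models). The choice of library and encoding is purely illustrative and not part of the course materials; any tokenizer would demonstrate the same encode/decode cycle.

```python
# Minimal sketch (illustrative, not course code): a round trip through a tokenizer.
# Assumes the open-source `tiktoken` package is installed: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several OpenAI models

text = "LLMs turn text into tokens before doing any math."
token_ids = enc.encode(text)   # "sentence chopper": text -> list of integer token IDs
print(token_ids)

# Detokenization: stitch the IDs back into readable text
print(enc.decode(token_ids))   # -> the original sentence
```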

Up Next: Explainer by Mike Chambers from AWS

To give you a more concrete understanding, our next resource is an insightful video by Mike Chambers. This video demystifies what happens when you send a ‘prompt’ (or an input text) to an LLM.

Though the internal mathematics may be intricate, the overall goal is straightforward: word prediction. The video will guide you through how your prompts are processed to generate coherent text responses. This will set the stage for our subsequent discussions on Prompt Engineering and LLM pipelines, offering a cohesive picture of how these models operate.

(Credits: Build on AWS)

Here you see how a Large Language Model’s job is to predict the next word based on the context.
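To make "predict the next word" concrete, here is a small illustrative sketch (not part of the course code) that loads the open-source GPT-2 model through the Hugging Face transformers library and prints its most likely next tokens for a prompt. GPT-2 and the example prompt are assumptions chosen only because the model is small enough to run locally.

```python
# Illustrative sketch: next-token prediction with a small open-source model.
# Assumes `transformers` and `torch` are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits      # a score for every vocabulary token at every position

next_token_logits = logits[0, -1]        # scores for the token that would follow the prompt
probs = torch.softmax(next_token_logits, dim=-1)

# Show the five most probable next tokens and their probabilities
top = torch.topk(probs, k=5)
for prob, token_id in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{tokenizer.decode([token_id])!r}  p={prob:.3f}")
```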

Now that you understand the role of "context," a few more concepts will help you appreciate how these models work at a granular level. Given the timeline of this course, these concepts, while foundational for a deeper understanding of LLMs, are tagged as Bonus Resources.

  • Attention in Large Language Models: Imagine being in a room where multiple conversations are happening. Your ability to focus on one conversation over the others is similar to how attention works in neural networks: it allows the model to 'focus' on the parts of the input most relevant to the task at hand (see the short sketch after this list).

  • Encoder-Decoder Architecture: An encoder translates the input (e.g., a sentence) into a fixed-size context vector, and the decoder uses that context vector to generate an output sequence (e.g., a translated sentence). When the attention mechanism is in play, it guides the decoder to focus on the most relevant parts of the encoder's output, improving translation and text-generation quality. Attention thus complements the encoder-decoder architecture, making it more effective and efficient. This architecture is a building block for LLMs such as GPT-3.5.
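To ground the attention intuition from the first bullet, here is a minimal, self-contained sketch of scaled dot-product self-attention, the core operation behind the mechanism described above. The function name and toy dimensions are illustrative assumptions, not course code.

```python
# Minimal sketch of scaled dot-product self-attention (illustrative only).
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    """query, key, value: tensors of shape (batch, seq_len, dim)."""
    dim = query.size(-1)
    scores = query @ key.transpose(-2, -1) / dim ** 0.5  # how strongly each token relates to every other token
    weights = F.softmax(scores, dim=-1)                   # per-token attention weights that sum to 1
    return weights @ value                                # weighted mix of the value vectors

# Toy example: one "sentence" of 4 tokens, each represented by an 8-dimensional vector.
tokens = torch.randn(1, 4, 8)
output = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention: q = k = v
print(output.shape)  # torch.Size([1, 4, 8])
```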

Bonus Links

If you're interested in delving further into the details, you may find the following links on embeddings, attention mechanisms, and encoder-decoder architecture beneficial. A foundational understanding of neural networks, backpropagation, the softmax function, and cross-entropy will enhance your comprehension of these resources. These topics are not the primary focus of this course, so they're provided as bonus links.

  • Understanding Transformers: Check the Bonus Module Right Ahead.
  • Attention mechanism: Overview | Short Video by Google Cloud
  • Word2Vec and Word Embeddings | Video by StatQuest
  • Seq2Seq Encoder-Decoder Neural Networks | Video by StatQuest
  • Attention is all you need | Read the Paper on ArXiv
  • Attention is all you need | Watch the seminar by Stanford Online