Neural Networks and Transformers (Bonus Module)



Now that you have an overview, let's dive deeper. This bonus module will be easier to follow if you're already familiar with neural networks, backpropagation, sequence-to-sequence learning, and libraries such as NumPy.

In machine learning, Transformers are akin to Optimus Prime in the Transformers movie series. Just as Optimus Prime can transform from a truck into a powerful leader, these models transform simple inputs into complex, insightful outputs, mastering tasks from language translation to code generation.

Transformers are central to revolutionary projects like AlphaFold 2 and NLP giants like GPT-4 and Llama. To truly grasp machine learning's potential, understanding Transformers is essential. Today, we take a brief look inside these AI 'Optimus Primes'.

First things first: Neural Networks and RNNs

Before we get into transformers, let's cover some background, starting with a quick look at neural networks. Imagine them as the brains inside computers, designed to make sense of all sorts of information, whether a picture, a piece of music, or a sentence.

Quick Look at Neural Networks

  • What They Are: Neural networks are like virtual brains in computers. They learn from examples and get good at specific tasks.

  • How They Function: They're made up of layers of 'neurons' that work together to understand and interpret data.
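
To make the "layers of neurons" idea concrete, here is a minimal NumPy sketch of a tiny two-layer network. The layer sizes and random weights are invented purely for illustration; a real network would learn its weights from examples rather than leave them random.

```python
import numpy as np

# A minimal sketch of a tiny two-layer neural network in NumPy.
# The sizes and weights below are placeholders for illustration only.

rng = np.random.default_rng(0)

x = rng.normal(size=(4,))        # an input with 4 features
W1 = rng.normal(size=(4, 8))     # weights of the first (hidden) layer
W2 = rng.normal(size=(8, 3))     # weights of the second (output) layer

hidden = np.maximum(0, x @ W1)   # each 'neuron' sums its inputs, then applies ReLU
output = hidden @ W2             # the output layer combines the hidden neurons

print(output)                    # 3 raw scores; training would adjust W1 and W2
```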

Different Types for Different Tasks

  • Convolutional Neural Networks (CNNs):

    • Role: CNNs are like the eyes of the AI world, great for understanding images.

    • How They Work: They break down images into smaller pieces and learn to recognize patterns, sort of like how we piece together a puzzle (a small sketch follows after this list).

    • Uses: They're behind the magic of facial recognition and reading handwritten notes.
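
Here is a toy sketch of the "small pieces" idea: a 3x3 filter sliding over a 6x6 image and responding strongly wherever the local pattern matches. The image and filter values are made up for illustration; in a real CNN the filters are learned from data.

```python
import numpy as np

# A minimal sketch of what a convolutional layer does: slide a small
# filter over an image and respond where the local pattern matches.

image = np.zeros((6, 6))
image[:, 3:] = 1.0                 # left half dark, right half bright: a vertical edge

filt = np.array([[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]])      # a classic vertical-edge-detecting filter

out = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        patch = image[i:i+3, j:j+3]        # a small piece of the image
        out[i, j] = np.sum(patch * filt)   # how strongly the piece matches the pattern

print(out)  # large magnitudes appear in the columns spanning the edge
```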

  • Recurrent Neural Networks (RNNs):

    • Role: RNNs are like the AI's language experts.

    • How They Work: They read sentences word by word, remembering each word as they go, much like reading a book (a toy sketch follows after this list).

    • Challenges: RNNs can get overwhelmed by really long texts, like long paragraphs, and may effectively forget the beginning by the time they reach the end. They can also be hard to train: the error signals used for learning can shrink away as they pass back through many steps (vanishing gradients) or grow out of control (exploding gradients).

    • Uses: They help in translating languages or powering chatbots.
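
And here is a toy sketch of an RNN reading a sentence word by word, carrying a hidden "memory" from one word to the next. The word vectors and weight matrices are random placeholders, not a trained model.

```python
import numpy as np

# A minimal sketch of how an RNN reads a sentence one word at a time.
# All vectors and weights here are random placeholders for illustration.

rng = np.random.default_rng(0)

words = ["the", "cat", "sat"]                     # a toy sentence
word_vectors = {w: rng.normal(size=(8,)) for w in words}

W_h = rng.normal(size=(16, 16)) * 0.1             # hidden-to-hidden weights
W_x = rng.normal(size=(8, 16)) * 0.1              # input-to-hidden weights

h = np.zeros(16)                                  # the 'memory' starts empty
for w in words:
    # each step mixes the new word with what the network remembers so far
    h = np.tanh(h @ W_h + word_vectors[w] @ W_x)

print(h)  # after the last word, h summarizes the whole sentence

# Because h passes through W_h at every step, training multiplies gradients
# by W_h over and over -- shrinking them (vanishing gradients) or blowing
# them up (exploding gradients) as sentences get longer.
```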

Neural networks, like different tools in a toolkit, are specialized for specific kinds of jobs. CNNs excel with visuals, while RNNs handle language.

But, as we'll see next, transformers brought new abilities to the AI world, especially in dealing with language more smartly.

"The transformers in LLMs aren't about me, but they have their own flair!" said no Optimus Prime ever.