Getting Started

This course is designed for beginners with a grounding in programming, especially Python. Prior expertise in LLMs or generative AI is not required, as the curriculum builds up progressively.

Kick-off Session Recording (Optional)

If you didn't attend the kick-off session of the bootcamp live, you can watch the 30-minute recorded conversation below between Mudit Srivastava from Pathway and Navyansh Mahla from the AI Community at IIT Bombay.

However, reading the course introduction first is highly recommended. You can then return to the kick-off session's video if you have any questions or need further clarification.

Course Progression

  • Starting with the Basics: Our journey commences by laying the foundational bedrock of Large Language Models and Generative AI.

  • Intermediate Exploration: As we navigate deeper waters, we touch upon essential components like vector indices, word vectors, and the intricate art of prompt engineering.

  • Advanced Insights: The final stretch of our curriculum focuses on in-depth topics such as RAG, LLM Architecture, LLM Pipelines, and hands-on development exercises.

Note: Consider your existing knowledge when tackling this coursework. If you're starting from scratch, completing it in 10 days could be a bit overwhelming, especially with the hands-on project and bonus resources. These timelines were set when the cohort was active, taking into account the academic semester exam schedule and an upcoming R&D expo at IIT Bombay. If you're taking a self-paced approach as an independent learner, take the time to assess your current understanding, explore the curriculum, and create a practical schedule that aligns with your existing knowledge and capacity. Happy learning!

The Power of Real-Time Data and Large Language Models

A pivotal theme of this course is the combination of real-time data with Large Language Models. Harnessing this combination can lead to transformative solutions for pressing societal and business challenges. While you will also be able to build custom LLM apps for static data sources, our chosen open-source framework supports both real-time ("streaming") and static ("batch") data with only a slight change in Python code.

In our digital age, the fusion of immediate data with LLMs accelerates processes from financial transactions to healthcare responses. Think about it: a financial transaction that once took days can now be executed in mere milliseconds, a testament to the power of real-time data. By integrating real-time data streams with LLMs, we can develop applications that not only respond promptly but also drive meaningful impact for humanity. That is one of the key goals of this course.

Learning Beyond the Curriculum

To enrich your learning experience, we've included an insightful session with deep learning researcher Jan Chorowski. This ensures a comprehensive learning journey, equipping you with a broad spectrum of knowledge irrespective of your starting point. You can access it as you progress through the coursework.

Your Role as a Learner

While we lay the groundwork and provide the tools, the epicenter of learning and exploration resides with you. We set the stage, but the performance—the application, the innovation, the breakthroughs—depends on your initiative and drive.

As we chart this transformative journey, are you poised to harness the untapped potentials of LLMs combined with real-time data for the greater good? Let's embark on this venture together! 🚀

Get to Know Pathway – our collaborators for this course

Pathway is the world's fastest data processing engine (benchmarks), supporting unified workflows for batch data, streaming data, and LLM applications. It is a single, integrated data processing layer for real-time intelligence. With Pathway, you can:

  • Mix and match batch data, streaming data, and API calls, including LLMs.

  • Transition effortlessly from batch to real-time data processing, just like setting a flag in your Spark code.

  • Reduce the cost of any computation, thanks to a highly efficient and scalable Rust engine.

  • Enable the use cases enterprises crave, and make advanced data transformations fast and easy to implement.

Discover on GitHub: Pathway's Data Processing Framework

Discover on GitHub: Pathway's RAG Framework for Large Language Models (LLMs)

Why is Pathway so Fast?

The Pathway engine is built in Rust 🦀. Rust is built for speed, parallel computation, and low-level control over hardware resources, which allows the engine to be aggressively optimized for performance.

The team at Pathway also recognizes our love for Python 🐍. The engine is therefore designed so that when you write your data processing code in Python, Pathway automatically compiles it into a Rust dataflow. In other words, when using Pathway, you don't need to know any Rust to enjoy its enormous performance benefits! For now, this is a simple enough starting point (that said, feel free to find more details in Pathway's ArXiv paper, your first bonus resource 🙂).
