
O'Reilly – Build a Large Language Model from Scratch (early access), Video Edition 2024-7
Published on: 2024-09-04 19:39:02
Description
Build a Large Language Model from Scratch (early access), Video Edition. This video course walks you through creating, training, and tweaking a large language model (LLM) from scratch. Bestselling author Sebastian Raschka guides you step by step through building your own LLM, explaining each stage with clear text, diagrams, and examples. You'll go from the initial design and coding, to pretraining on a general corpus, and on to fine-tuning for specific tasks. The large language models powering advanced AI tools like ChatGPT, Bard, and Copilot may seem like a miracle, but they're not magic. This course demystifies LLMs by helping you build your own from scratch: you will gain unique and valuable insight into how LLMs work, learn how to evaluate their quality, and pick up concrete techniques to fine-tune and improve them.
What you will learn
- Plan and code all the parts of an LLM
- Prepare a dataset suitable for LLM training (a minimal, illustrative sketch of this step appears after the list)
- Fine-tune LLMs for text classification with your own data
- Apply instruction fine-tuning techniques so your LLM follows instructions
- Load pretrained weights into an LLM
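The dataset-preparation bullet above corresponds to Chapter 2 of the course (tokenizing text, token IDs, and token embeddings). Below is a minimal, illustrative sketch of that step in PyTorch; it is not the course's own code, and the vocabulary size, embedding dimension, and token IDs are arbitrary placeholders.

```python
# Illustrative sketch only (not the course's code): turning token IDs into
# input embeddings, roughly the step covered in Chapter 2.
import torch

vocab_size = 50257       # GPT-2-style vocabulary size (assumed for the demo)
emb_dim = 256            # small embedding dimension for the demo
context_length = 4       # number of tokens in the toy input

token_emb = torch.nn.Embedding(vocab_size, emb_dim)    # token embedding table
pos_emb = torch.nn.Embedding(context_length, emb_dim)  # learned positional embeddings

# One toy "sentence" of four arbitrary token IDs
token_ids = torch.tensor([[464, 3290, 318, 257]])

# Input embeddings = token embeddings + positional embeddings
x = token_emb(token_ids) + pos_emb(torch.arange(context_length))
print(x.shape)  # torch.Size([1, 4, 256])
```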
This course is suitable for people who:
- Want to learn how to build and train large language models
- Want a deeper understanding of how LLMs work
- Want to strengthen their skills in machine learning and artificial intelligence
- Are looking for a practical guide to building their own LLM
Course specifications: Build a Large Language Model from Scratch (early access), Video Edition
- Publisher: O'Reilly
- Lecturer: Sebastian Raschka
- Training level: beginner to advanced
- Training duration: 8 hours and 12 minutes
Course headings
- Chapter 1. Understanding Large Language Models
- Chapter 1. Applications of LLMs
- Chapter 1. Stages of building and using LLMs
- Chapter 1. Introducing the transformer architecture
- Chapter 1. Utilizing large datasets
- Chapter 1. A closer look at the GPT architecture
- Chapter 1. Building a large language model
- Chapter 1. Summary
- Chapter 2. Working with Text Data
- Chapter 2. Tokenizing text
- Chapter 2. Converting tokens into token IDs
- Chapter 2. Adding special context tokens
- Chapter 2. Byte pair encoding
- Chapter 2. Data sampling with a sliding window
- Chapter 2. Creating token embeddings
- Chapter 2. Encoding word positions
- Chapter 2. Summary
- Chapter 3. Coding Attention Mechanisms
- Chapter 3. Capturing data dependencies with attention mechanisms
- Chapter 3. Attending to different parts of the input with self-attention
- Chapter 3. Implementing self-attention with trainable weights
- Chapter 3. Hiding future words with causal attention
- Chapter 3. Extending single-head attention to multi-head attention
- Chapter 3. Summary
- Chapter 4. Implementing a GPT model from scratch to generate text
- Chapter 4. Normalizing activations with layer normalization
- Chapter 4. Implementing a feed forward network with GELU activations
- Chapter 4. Adding shortcut connections
- Chapter 4. Connecting attention and linear layers in a transformer block
- Chapter 4. Coding the GPT model
- Chapter 4. Generating text
- Chapter 4. Summary
- Chapter 5. Pretraining on Unlabeled Data
- Chapter 5. Training an LLM
- Chapter 5. Decoding strategies to control randomness
- Chapter 5. Loading and saving model weights in PyTorch
- Chapter 5. Loading pretrained weights from OpenAI
- Chapter 5. Summary
- Chapter 6. Finetuning for Classification
- Chapter 6. Preparing the dataset
- Chapter 6. Creating data loaders
- Chapter 6. Initializing a model with pretrained weights
- Chapter 6. Adding a classification head
- Chapter 6. Calculating the classification loss and accuracy
- Chapter 6. Finetuning the model on supervised data
- Chapter 6. Using the LLM as a spam classifier
- Chapter 6. Summary
- Chapter 7. Finetuning to Follow Instructions
- Chapter 7. Preparing a dataset for supervised instruction fine-tuning
- Chapter 7. Organizing data into training batches
- Chapter 7. Creating data loaders for an instruction dataset
- Chapter 7. Loading a pretrained LLM
- Chapter 7. Finetuning the LLM on instruction data
- Chapter 7. Extracting and saving responses
- Chapter 7. Evaluating the fine-tuned LLM
- Chapter 7. Conclusions
- Chapter 7. Summary
- Appendix A. Introduction to PyTorch
- Appendix A. Understanding tensors
- Appendix A. Seeing models as computation graphs
- Appendix A. Automatic differentiation made easy
- Appendix A. Implementing multilayer neural networks
- Appendix A. Setting up efficient data loaders
- Appendix A. A typical training loop
- Appendix A. Saving and loading models
- Appendix A. Optimizing training performance with GPUs
- Appendix A. Summary
- Appendix A. Further reading
- Appendix A. Exercise answers
- Appendix D. Adding Bells and Whistles to the Training Loop
- Appendix D. Cosine decay
- Appendix D. Gradient clipping
- Appendix D. The modified training function
- Appendix E. Parameter-efficient finetuning with LoRA
- Appendix E. Preparing the dataset
- Appendix E. Initializing the model
- Appendix E. Parameter-efficient finetuning with LoRA
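Several of the Chapter 3 headings in the outline above (self-attention with trainable weights, causal masking) lend themselves to a short illustration. The following is a minimal, illustrative sketch of a single-head causal self-attention layer in PyTorch; it is not the course's implementation, and the class name, dimensions, and context length are placeholders.

```python
# Illustrative sketch only: single-head causal self-attention with trainable
# query/key/value projections. Names and sizes are placeholders.
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, d_in, d_out, context_length):
        super().__init__()
        # Trainable query, key, and value projections
        self.W_q = nn.Linear(d_in, d_out, bias=False)
        self.W_k = nn.Linear(d_in, d_out, bias=False)
        self.W_v = nn.Linear(d_in, d_out, bias=False)
        # Upper-triangular mask hides "future" tokens from each position
        mask = torch.triu(torch.ones(context_length, context_length), diagonal=1).bool()
        self.register_buffer("mask", mask)

    def forward(self, x):                        # x: (batch, seq_len, d_in)
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5
        seq_len = x.shape[1]
        scores = scores.masked_fill(self.mask[:seq_len, :seq_len], float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return weights @ v                       # (batch, seq_len, d_out)

x = torch.randn(2, 6, 32)                        # dummy batch: 2 sequences, 6 tokens, 32 dims
attn = CausalSelfAttention(d_in=32, d_out=32, context_length=8)
print(attn(x).shape)                             # torch.Size([2, 6, 32])
```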
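Likewise, the Appendix E headings mention parameter-efficient fine-tuning with LoRA. The sketch below wraps a frozen linear layer with a trainable low-rank update to show the general idea; the rank, scaling, and layer sizes are arbitrary, and this is not the course's code.

```python
# Illustrative sketch only: a LoRA-style adapter around a frozen linear layer.
import torch
import torch.nn as nn

class LinearWithLoRA(nn.Module):
    def __init__(self, linear, rank=8, alpha=16):
        super().__init__()
        self.linear = linear                  # pretrained layer, kept frozen
        for p in self.linear.parameters():
            p.requires_grad_(False)
        # Only the low-rank matrices A and B are trained
        self.A = nn.Parameter(torch.randn(linear.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, linear.out_features))
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen output plus the scaled low-rank update x @ A @ B
        return self.linear(x) + self.scaling * (x @ self.A @ self.B)

layer = LinearWithLoRA(nn.Linear(64, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 1024: only A (64x8) and B (8x64) are trainable
```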
Installation guide
After extracting the files, watch them with your preferred media player.
Subtitle: None
Quality: 720p
Download links
Download part 1 – 1 GB
Download part 2 – 6.3 MB
File(s) password: www.downloadly.ir
File size
1 GB