
O'Reilly – Build a Large Language Model from Scratch (early access), Video Edition 2024-7
Published on: 2024-09-04 19:39:02
Description
Build a Large Language Model from Scratch (early access), Video Edition. This video course walks you through creating, training, and tweaking a large language model (LLM) from scratch. Bestselling author Sebastian Raschka guides you step by step through building your own LLM, explaining each stage with clear text, diagrams, and examples. You'll go from the initial design and coding, to pretraining on a general corpus, and on to fine-tuning for specific tasks. The large language models powering advanced AI tools like ChatGPT, Bard, and Copilot may seem like a miracle, but they're not magic. This course demystifies LLMs by helping you build your own from scratch: you will gain unique and valuable insight into how LLMs work, learn how to evaluate their quality, and pick up concrete techniques to fine-tune and improve them.
What you will learn
- Plan and code all the parts of an LLM
- Prepare a dataset suitable for LLM training (a minimal, illustrative sketch of this step appears after the list)
- Fine-tune LLMs for text classification with your own data
- Apply instruction fine-tuning techniques so your LLM follows instructions
- Load pretrained weights into an LLM
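The dataset-preparation bullet above corresponds to Chapter 2 of the course (tokenizing text, token IDs, and token embeddings). Below is a minimal, illustrative sketch of that step in PyTorch; it is not the course's own code, and the vocabulary size, embedding dimension, and token IDs are arbitrary placeholders.

```python
# Illustrative sketch only (not the course's code): turning token IDs into
# input embeddings, roughly the step covered in Chapter 2.
import torch

vocab_size = 50257       # GPT-2-style vocabulary size (assumed for the demo)
emb_dim = 256            # small embedding dimension for the demo
context_length = 4       # number of tokens in the toy input

token_emb = torch.nn.Embedding(vocab_size, emb_dim)    # token embedding table
pos_emb = torch.nn.Embedding(context_length, emb_dim)  # learned positional embeddings

# One toy "sentence" of four arbitrary token IDs
token_ids = torch.tensor([[464, 3290, 318, 257]])

# Input embeddings = token embeddings + positional embeddings
x = token_emb(token_ids) + pos_emb(torch.arange(context_length))
print(x.shape)  # torch.Size([1, 4, 256])
```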
This course is suitable for people who:
- Want to learn how to build and train large language models
- Want a deeper understanding of how LLMs work
- Want to strengthen their skills in machine learning and artificial intelligence
- Are looking for a practical guide to building their own LLM
Course specifications: Build a Large Language Model from Scratch (early access), Video Edition
- Publisher: O'Reilly
- Lecturer: Sebastian Raschka
- Training level: beginner to advanced
- Training duration: 8 hours and 12 minutes
Course headings
- Chapter 1. Understanding Large Language Models
- Chapter 1. Applications of LLMs
- Chapter 1. Stages of building and using LLMs
- Chapter 1. Introducing the transformer architecture
- Chapter 1. Utilizing large datasets
- Chapter 1. A closer look at the GPT architecture
- Chapter 1. Building a large language model
- Chapter 1. Summary
- Chapter 2. Working with Text Data
- Chapter 2. Tokenizing text
- Chapter 2. Converting tokens into token IDs
- Chapter 2. Adding special context tokens
- Chapter 2. Byte pair encoding
- Chapter 2. Data sampling with a sliding window
- Chapter 2. Creating token embeddings
- Chapter 2. Encoding word positions
- Chapter 2. Summary
- Chapter 3. Coding Attention Mechanisms
- Chapter 3. Capturing data dependencies with attention mechanisms
- Chapter 3. Attending to different parts of the input with self-attention
- Chapter 3. Implementing self-attention with trainable weights
- Chapter 3. Hiding future words with causal attention
- Chapter 3. Extending single-head attention to multi-head attention
- Chapter 3. Summary
- Chapter 4. Implementing a GPT model from scratch to generate text
- Chapter 4. Normalizing activations with layer normalization
- Chapter 4. Implementing a feed forward network with GELU activations
- Chapter 4. Adding shortcut connections
- Chapter 4. Connecting attention and linear layers in a transformer block
- Chapter 4. Coding the GPT model
- Chapter 4. Generating text
- Chapter 4. Summary
- Chapter 5. Pretraining on Unlabeled Data
- Chapter 5. Training an LLM
- Chapter 5. Decoding strategies to control randomness
- Chapter 5. Loading and saving model weights in PyTorch
- Chapter 5. Loading pretrained weights from OpenAI
- Chapter 5. Summary
- Chapter 6. Finetuning for Classification
- Chapter 6. Preparing the dataset
- Chapter 6. Creating data loaders
- Chapter 6. Initializing a model with pretrained weights
- Chapter 6. Adding a classification head
- Chapter 6. Calculating the classification loss and accuracy
- Chapter 6. Finetuning the model on supervised data
- Chapter 6. Using the LLM as a spam classifier
- Chapter 6. Summary
- Chapter 7. Finetuning to Follow Instructions
- Chapter 7. Preparing a dataset for supervised instruction fine-tuning
- Chapter 7. Organizing data into training batches
- Chapter 7. Creating data loaders for an instruction dataset
- Chapter 7. Loading a pretrained LLM
- Chapter 7. Finetuning the LLM on instruction data
- Chapter 7. Extracting and saving responses
- Chapter 7. Evaluating the fine-tuned LLM
- Chapter 7. Conclusions
- Chapter 7. Summary
- Appendix A. Introduction to PyTorch
- Appendix A. Understanding tensors
- Appendix A. Seeing models as computation graphs
- Appendix A. Automatic differentiation made easy
- Appendix A. Implementing multilayer neural networks
- Appendix A. Setting up efficient data loaders
- Appendix A. A typical training loop
- Appendix A. Saving and loading models
- Appendix A. Optimizing training performance with GPUs
- Appendix A. Summary
- Appendix A. Further reading
- Appendix A. Exercise answers
- Appendix D. Adding Bells and Whistles to the Training Loop
- Appendix D. Cosine decay
- Appendix D. Gradient clipping
- Appendix D. The modified training function
- Appendix E. Parameter-efficient finetuning with LoRA
- Appendix E. Preparing the dataset
- Appendix E. Initializing the model
- Appendix E. Parameter-efficient finetuning with LoRA
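Several of the Chapter 3 headings in the outline above (self-attention with trainable weights, causal masking) lend themselves to a short illustration. The following is a minimal, illustrative sketch of a single-head causal self-attention layer in PyTorch; it is not the course's implementation, and the class name, dimensions, and context length are placeholders.

```python
# Illustrative sketch only: single-head causal self-attention with trainable
# query/key/value projections. Names and sizes are placeholders.
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, d_in, d_out, context_length):
        super().__init__()
        # Trainable query, key, and value projections
        self.W_q = nn.Linear(d_in, d_out, bias=False)
        self.W_k = nn.Linear(d_in, d_out, bias=False)
        self.W_v = nn.Linear(d_in, d_out, bias=False)
        # Upper-triangular mask hides "future" tokens from each position
        mask = torch.triu(torch.ones(context_length, context_length), diagonal=1).bool()
        self.register_buffer("mask", mask)

    def forward(self, x):                        # x: (batch, seq_len, d_in)
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5
        seq_len = x.shape[1]
        scores = scores.masked_fill(self.mask[:seq_len, :seq_len], float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return weights @ v                       # (batch, seq_len, d_out)

x = torch.randn(2, 6, 32)                        # dummy batch: 2 sequences, 6 tokens, 32 dims
attn = CausalSelfAttention(d_in=32, d_out=32, context_length=8)
print(attn(x).shape)                             # torch.Size([2, 6, 32])
```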
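Likewise, the Appendix E headings mention parameter-efficient fine-tuning with LoRA. The sketch below wraps a frozen linear layer with a trainable low-rank update to show the general idea; the rank, scaling, and layer sizes are arbitrary, and this is not the course's code.

```python
# Illustrative sketch only: a LoRA-style adapter around a frozen linear layer.
import torch
import torch.nn as nn

class LinearWithLoRA(nn.Module):
    def __init__(self, linear, rank=8, alpha=16):
        super().__init__()
        self.linear = linear                  # pretrained layer, kept frozen
        for p in self.linear.parameters():
            p.requires_grad_(False)
        # Only the low-rank matrices A and B are trained
        self.A = nn.Parameter(torch.randn(linear.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, linear.out_features))
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen output plus the scaled low-rank update x @ A @ B
        return self.linear(x) + self.scaling * (x @ self.A @ self.B)

layer = LinearWithLoRA(nn.Linear(64, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 1024: only A (64x8) and B (8x64) are trainable
```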
Installation guide
After extracting the files, watch them with your preferred media player.
Subtitle: None
Quality: 720p
Download links
Download part 1 – 1 GB
Download part 2 – 6.3 MB
File(s) password: www.downloadly.ir
File size
1 GB