Supervised Fine-Tuning with SFTTrainer
This notebook demonstrates how to fine-tune the `HuggingFaceTB/SmolLM2-135M` model using the `SFTTrainer` from the `trl` library. The notebook cells run and will fine-tune the model. You can select your difficulty by trying out different datasets.
Exercise: Fine-Tuning SmolLM2 with SFTTrainer
Take a dataset from the Hugging Face Hub and fine-tune a model on it.
Difficulty Levels
🐢 Use the `HuggingFaceTB/smoltalk` dataset
🐕 Try out the `bigcode/the-stack-smol` dataset and fine-tune a code generation model on the `data/python` subset.
🦁 Select a dataset that relates to a real-world use case you're interested in.
Generate with the base model
Here we will try out the base model, which does not have a chat template.
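As a minimal sketch, base-model generation with `transformers` might look like the following (the prompt text is arbitrary and just for illustration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model and its tokenizer
model_name = "HuggingFaceTB/SmolLM2-135M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

# Without a chat template, we can only prompt with plain text
prompt = "Write a haiku about programming"  # example prompt, pick your own
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```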
Dataset Preparation
We will load a sample dataset and format it for training. The dataset should be structured with input-output pairs, where each input is a prompt and the output is the expected response from the model.
TRL will format input messages based on the model's chat template. They need to be represented as a list of dictionaries with the keys `role` and `content`.
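For example, loading the `HuggingFaceTB/smoltalk` dataset from the 🐢 exercise and inspecting one record (the `everyday-conversations` config name is an assumption; check the dataset card for the subsets actually available):

```python
from datasets import load_dataset

# Config name is an assumption — see the dataset card for available subsets
ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")

# Each example should carry a list of {"role", "content"} dicts,
# which is the format TRL expects for chat-template formatting
print(ds["train"][0]["messages"])
# e.g. [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]
```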
Configuring the SFTTrainer
The `SFTTrainer` is configured with various parameters that control the training process. These include the number of training steps, batch size, learning rate, and evaluation strategy. Adjust these parameters based on your specific requirements and computational resources.
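A minimal configuration sketch, assuming a recent version of `trl` (argument names such as `eval_strategy` have varied across versions, and all hyperparameter values below are illustrative placeholders, not recommendations):

```python
from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    output_dir="./sft_output",       # example path
    max_steps=1000,                  # number of training steps
    per_device_train_batch_size=4,   # batch size per device
    learning_rate=5e-5,              # learning rate
    eval_strategy="steps",           # evaluation strategy; "evaluation_strategy" in older versions
    eval_steps=50,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,                     # the base model loaded earlier
    args=training_args,
    train_dataset=ds["train"],
    eval_dataset=ds["test"],         # assumes the dataset provides a test split
)
```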
Training the Model
With the trainer configured, we can now proceed to train the model. The training process will involve iterating over the dataset, computing the loss, and updating the model's parameters to minimize this loss.
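Under the configuration sketched above, kicking off training and saving the result comes down to two calls:

```python
# Run the training loop: iterate over the dataset, compute the loss,
# and update the model's parameters
trainer.train()

# Persist the fine-tuned weights (path is an arbitrary example)
trainer.save_model("./sft_output/final")
```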
Bonus Exercise: Generate with fine-tuned model
🐕 Use the fine-tuned model to generate a response, just like with the base example.
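One way to sketch this, assuming the model was saved to the path used above and that fine-tuning on chat data attached a chat template to the tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path assumes the save location from the training sketch above
ft_model = AutoModelForCausalLM.from_pretrained("./sft_output/final").to(device)
ft_tokenizer = AutoTokenizer.from_pretrained("./sft_output/final")

# After SFT on chat data, prompts should go through the chat template
messages = [{"role": "user", "content": "Write a haiku about programming"}]
prompt = ft_tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = ft_tokenizer(prompt, return_tensors="pt").to(device)
outputs = ft_model.generate(**inputs, max_new_tokens=100)
print(ft_tokenizer.decode(outputs[0], skip_special_tokens=True))
```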
💐 You're done!
This notebook provided a step-by-step guide to fine-tuning the `HuggingFaceTB/SmolLM2-135M` model using the `SFTTrainer`. By following these steps, you can adapt the model to perform specific tasks more effectively. If you want to carry on working on this course, here are steps you could try out:
Try this notebook on a harder difficulty.
Review a colleague's PR.
Improve the course material via an Issue or PR.