Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.
Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.
Path: blob/main/course/en/chapter13/grpo_finetune.ipynb
Views: 2935
Kernel: Python 3
Finetune LLMs with GRPO
This notebook shows how to finetune an LLM with GRPO, using the trl
library.
It's by Ben Burtenshaw and Maxime Labonne.
This is a minimal example. For a complete example, refer to the GRPO chapter in the course.
Install dependencies
In [ ]:
Load Dataset
In [ ]:
Load Model
In [ ]:
Define Reward Function
In [ ]:
Define Training Arguments
In [ ]:
Push Model to Hub
In [ ]:
Generate Text
In [ ]:
In [ ]: