
🤗 x 🦾: Training SmolVLA with LeRobot Notebook

Welcome to the LeRobot SmolVLA training notebook! This notebook provides a ready-to-run setup for training imitation learning policies using the 🤗 LeRobot library.

In this example, we train a SmolVLA policy using a dataset hosted on the Hugging Face Hub, and optionally track training metrics with Weights & Biases (wandb).

⚙️ Requirements

  • A Hugging Face dataset repo ID containing your training data (--dataset.repo_id=YOUR_USERNAME/YOUR_DATASET); see the loading sketch after this list

  • Optional: A wandb account if you want to enable training visualization

  • Recommended: GPU runtime (e.g., NVIDIA A100) for faster training
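
To sanity-check the dataset repo ID before committing to a long run, you can try loading it with LeRobot's dataset class. This is a minimal sketch: the import path matches recent lerobot releases but may differ between versions, and YOUR_USERNAME/YOUR_DATASET is a placeholder.

from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Downloads metadata from the Hub and fails fast if the repo ID is wrong
ds = LeRobotDataset("YOUR_USERNAME/YOUR_DATASET")
print(ds.num_episodes, "episodes,", ds.num_frames, "frames")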

⏱️ Expected Training Time

Training with the SmolVLA policy for 20,000 steps typically takes about 5 hours on an NVIDIA A100 GPU. On less powerful GPUs or CPUs, training may take significantly longer!
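At that pace the throughput works out to roughly 20,000 steps / (5 h × 3600 s/h) ≈ 1.1 optimizer steps per second, or about 70 samples per second at the default batch size of 64, which you can use to estimate runtimes for other step counts.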

Example Output

Model checkpoints, logs, and training plots will be saved to the specified --output_dir. If wandb is enabled, progress will also be visualized in your wandb project dashboard.
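
Based on the checkpoint path used for the upload at the end of this notebook, the output directory should look roughly like this (a sketch, not an exhaustive listing):

outputs/train/my_smolvla/
└── checkpoints/
    └── last/
        └── pretrained_model/   # final weights and config, uploaded to the Hub later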

Install conda

This cell uses condacolab to bootstrap a full Conda environment inside Google Colab.

!pip install -q condacolab
import condacolab
condacolab.install()
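
Note that condacolab.install() restarts the Colab kernel, so expect the session to reconnect after this cell finishes. Once it is back, you can optionally verify the environment with condacolab's check() helper:

import condacolab
condacolab.check()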

Install LeRobot

This cell clones the lerobot repository from Hugging Face, installs FFmpeg (version 7.1.1), and installs the package in editable mode.

!git clone https://github.com/huggingface/lerobot.git
!conda install -y ffmpeg=7.1.1 -c conda-forge
!cd lerobot && pip install -e .
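
To confirm the editable install succeeded, a quick import check (nothing version-specific assumed):

!python -c "import lerobot; print('lerobot imported successfully')"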

Weights & Biases login

This cell logs you into Weights & Biases (wandb) to enable experiment tracking and logging.

!wandb login
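
If you prefer non-interactive authentication (for example in scripted runs), wandb also reads an API key from the WANDB_API_KEY environment variable. A sketch with a placeholder key:

import os

# Placeholder value; never commit a real API key to a notebook
os.environ["WANDB_API_KEY"] = "<your-wandb-api-key>"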

Install SmolVLA dependencies

!cd lerobot && pip install -e ".[smolvla]"

Start training SmolVLA with LeRobot

This cell runs the train.py script from the lerobot library to train a robot control policy.

Make sure to adjust the following arguments to your setup:

  1. --dataset.repo_id=YOUR_HF_USERNAME/YOUR_DATASET: Replace this with the Hugging Face Hub repo ID where your dataset is stored, e.g., pepijn223/il_gym0.

  2. --batch_size=64: The model processes 64 training samples in parallel before each gradient update. Reduce this value if your GPU has limited memory.

  3. --output_dir=outputs/train/...: Directory where training logs and model checkpoints will be saved.

  4. --job_name=...: A name for this training job, used for logging and Weights & Biases.

  5. --policy.device=cuda: Use cuda if training on an NVIDIA GPU, mps on Apple Silicon, or cpu if no GPU is available (see the device-detection sketch after the training command).

  6. --wandb.enable=true: Enables Weights & Biases for visualizing training progress. You must be logged in via wandb login before running this.

!cd lerobot && python lerobot/scripts/train.py \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=${HF_USER}/mydataset \
  --batch_size=64 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true
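
If you are unsure which --policy.device value applies to your runtime, PyTorch (already installed as a lerobot dependency) can report what is available; a minimal sketch:

import torch

# Prefer CUDA, then Apple's Metal (MPS) backend, then fall back to CPU
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(f"Use --policy.device={device}")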

Log in to the Hugging Face Hub

After training is done, log in to the Hugging Face Hub and upload the last checkpoint:

!huggingface-cli login
!huggingface-cli upload ${HF_USER}/my_smolvla \
  /content/lerobot/outputs/train/my_smolvla/checkpoints/last/pretrained_model
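
As an alternative to the CLI, the same upload can be done from Python with huggingface_hub (a sketch assuming the same checkpoint path, with YOUR_HF_USERNAME as a placeholder for your username):

from huggingface_hub import HfApi

api = HfApi()
# Create the model repo on the Hub if it does not exist yet
api.create_repo("YOUR_HF_USERNAME/my_smolvla", repo_type="model", exist_ok=True)
# Upload the final checkpoint folder as the model files
api.upload_folder(
    folder_path="/content/lerobot/outputs/train/my_smolvla/checkpoints/last/pretrained_model",
    repo_id="YOUR_HF_USERNAME/my_smolvla",
    repo_type="model",
)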