1. Install TorchTitan
Install PyTorch first, then install TorchTitan from source or from a nightly build, following the instructions in the GitHub repository.
2. Download Hugging Face Assets
Obtain a Hugging Face API token, then run the provided download script to fetch the Llama 3.1 tokenizer assets.
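For readers who want to see what the download step amounts to, here is a minimal Python sketch that fetches one tokenizer file directly with the `huggingface_hub` library instead of the provided script. It is not the TorchTitan script. The repository id `meta-llama/Llama-3.1-8B`, the file name `original/tokenizer.model`, and the destination directory are assumptions for illustration, and the model repository is gated, so the token must belong to an account that has been granted access.

```python
import os
from typing import Mapping


def read_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the Hugging Face token stored in HF_TOKEN, or fail with a clear message."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set HF_TOKEN to a Hugging Face access token first.")
    return token


def download_tokenizer(token: str, local_dir: str = "assets/hf/Llama-3.1-8B") -> str:
    """Download a single tokenizer file and return its local path."""
    # Imported here so the token check above works even without the dependency installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="meta-llama/Llama-3.1-8B",    # assumed repository id (gated)
        filename="original/tokenizer.model",  # assumed file name inside the repository
        token=token,
        local_dir=local_dir,                  # assumed destination directory
    )
```

With `HF_TOKEN` exported in the shell, calling `download_tokenizer(read_hf_token())` would return the path of the downloaded file. Reading the token from the environment keeps it out of source files and shell history.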
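3. Prepare Dataset
Use the integrated checkpointable data loader to prepare a dataset, such as a C4 variant. Because the loader is checkpointable, its position in the data can be saved and restored along with the rest of the training state.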
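To make "checkpointable" concrete, the following is a minimal pure-Python sketch of a loader that can save and restore its position. It is not TorchTitan's implementation. The class name `CheckpointableLoader` and the in-memory sample list are hypothetical, and only the `state_dict` / `load_state_dict` naming follows the usual PyTorch convention.

```python
from typing import Any, Dict, Iterator, List


class CheckpointableLoader:
    """Iterates over samples and can save and restore its position."""

    def __init__(self, samples: List[str]) -> None:
        self._samples = samples
        self._next_index = 0  # index of the next sample to yield

    def __iter__(self) -> Iterator[str]:
        while self._next_index < len(self._samples):
            sample = self._samples[self._next_index]
            self._next_index += 1
            yield sample

    def state_dict(self) -> Dict[str, Any]:
        """Capture the position so that it can be stored in a checkpoint."""
        return {"next_index": self._next_index}

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        """Restore a position previously returned by state_dict()."""
        self._next_index = state["next_index"]


docs = ["doc0", "doc1", "doc2", "doc3"]

# Consume two samples, then save the loader state as a checkpoint would.
loader = CheckpointableLoader(docs)
iterator = iter(loader)
first_two = [next(iterator), next(iterator)]
saved = loader.state_dict()

# A fresh loader resumes from the saved position instead of starting over.
resumed = CheckpointableLoader(docs)
resumed.load_state_dict(saved)
remaining = list(resumed)
```

Here `remaining` holds only the samples that were not consumed before the checkpoint, so a resumed run does not see the same data twice.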
4. Configure Training
Set training parameters such as batch size and parallelism degrees in the TOML configuration file.
5. Launch Training and Monitor
Start training, then monitor metrics in the TensorBoard or Weights & Biases dashboard.
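The launch step can be sketched from Python as follows. This assumes that the repository provides a launcher script named `run_train.sh` which reads the configuration path from a `CONFIG_FILE` environment variable. Both names are assumptions to verify against the repository, and the configuration path passed in is hypothetical.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Tuple


def build_launch(
    config_file: str, base_env: Mapping[str, str] = os.environ
) -> Tuple[List[str], Dict[str, str]]:
    """Return the command and environment for a training run without starting it."""
    env = dict(base_env)              # copy so the caller's environment is untouched
    env["CONFIG_FILE"] = config_file  # assumed variable read by the launcher script
    return ["./run_train.sh"], env    # assumed launcher script in the repository root


def launch(config_file: str) -> int:
    """Start training and block until it exits; raises if the script fails."""
    command, env = build_launch(config_file)
    return subprocess.run(command, env=env, check=True).returncode
```

Calling `launch("path/to/config.toml")` from the repository root would start a run. Separating `build_launch` from `launch` makes it possible to inspect the command and environment before committing GPUs to a job.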