Getting Started - torchtitan

1. Install PyTorch, then install TorchTitan from source or from a nightly build, following the instructions in the GitHub repository.

2. Obtain a Hugging Face API token and run the provided script to download the Llama 3.1 tokenizer assets.

3. Use the integrated checkpointable data loader to prepare datasets such as the C4 variant.

4. Set training parameters such as batch size and parallelism in the TOML configuration file.

5. Start training and monitor metrics using TensorBoard or Weights & Biases dashboards.