Key Features

What you can do

LightningModule abstraction

Organizes PyTorch nn.Module objects, and the logic that connects them, into training, validation, and test steps.

Multi-GPU/TPU/HPU training

Supports distributed training on multiple GPUs, TPUs, and HPUs without changes to your model code.

Built-in testing

Provides integrated testing capabilities, so you do not need to write your own test implementation.

Trainer class

Automates training loop details and supports plugins for various backends, precision libraries, and clusters.

Precision control

Supports 64-bit, 32-bit, and 16-bit floating-point operations in regular and mixed-precision modes.

Checkpoint management

Enables saving and loading of model checkpoints for reproducibility and reuse.