Broad Fine-Tuning Support
Supports full fine-tuning, pretraining, and quantized training, covering 4-bit, 8-bit, 16-bit, and FP8 precision. A loading sketch follows below.
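As a minimal sketch of quantized loading, assuming Unsloth's FastLanguageModel API (the checkpoint name and LoRA settings here are illustrative, not prescribed):

```python
# Minimal sketch: load a model for 4-bit QLoRA fine-tuning.
# Assumes Unsloth's FastLanguageModel API; checkpoint name is illustrative.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # illustrative checkpoint
    max_seq_length=2048,
    load_in_4bit=True,   # 4-bit quantized weights; set False for 16-bit
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                # LoRA rank (hypothetical choice)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
```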
Reinforcement Learning Optimization
Efficient implementations of reinforcement learning algorithms such as GRPO, GSPO, DrGRPO, and DAPO, with up to 80% VRAM savings; see the sketch after this item.
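A hedged sketch of GRPO training via TRL's GRPOTrainer, which Unsloth accelerates; the reward function, dataset, and config values here are placeholders, not a recommended setup:

```python
# Sketch of GRPO training using TRL's GRPOTrainer (which Unsloth
# accelerates). The reward function and dataset are placeholders.
from trl import GRPOConfig, GRPOTrainer

def length_reward(completions, **kwargs):
    # Hypothetical reward: prefer shorter completions.
    return [-len(c) / 100.0 for c in completions]

args = GRPOConfig(
    output_dir="grpo-out",
    num_generations=4,        # completions sampled per prompt
    max_completion_length=128,
)

trainer = GRPOTrainer(
    model=model,              # e.g. the model loaded above
    reward_funcs=length_reward,
    args=args,
    train_dataset=dataset,    # placeholder: needs a "prompt" column
)
trainer.train()
```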
Exact Computation with Dynamic Quantization
Maintains accuracy with no approximation: computations use exact methods, and dynamic 4-bit quantization selectively leaves error-sensitive parameters unquantized. The toy sketch below illustrates the principle.
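The idea behind dynamic quantization can be illustrated with a toy per-tensor error check; this is a sketch of the principle only, not Unsloth's actual selection criterion:

```python
# Toy illustration of dynamic quantization: measure the relative error
# a naive 4-bit quantizer would introduce per tensor, and keep tensors
# above a threshold in full precision. Not Unsloth's actual criterion.
import torch

def quantize_4bit(w: torch.Tensor) -> torch.Tensor:
    scale = w.abs().max().clamp(min=1e-12) / 7  # signed 4-bit range: -8..7
    q = torch.clamp((w / scale).round(), -8, 7)
    return q * scale                            # dequantized approximation

def layers_to_skip(named_params, threshold=0.05):
    skip = []
    for name, w in named_params:
        rel_err = (w - quantize_4bit(w)).norm() / w.norm().clamp(min=1e-12)
        if rel_err > threshold:  # too lossy: leave this tensor unquantized
            skip.append(name)
    return skip
```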
Multi-Hardware and Model Compatibility
Works across NVIDIA GPUs (CUDA compute capability 7.0+), AMD GPUs, and Intel GPUs, and supports all Hugging Face Transformers-compatible models, including TTS, vision, embedding, and multimodal types.
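On NVIDIA hardware, the compute-capability requirement can be verified with a generic PyTorch check (this check is not part of Unsloth's API):

```python
# Generic check (not Unsloth-specific): verify the GPU meets the
# CUDA compute capability 7.0+ requirement before training.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    if (major, minor) >= (7, 0):
        print(f"Supported NVIDIA GPU (compute capability {major}.{minor})")
    else:
        print("GPU too old: compute capability 7.0+ required")
else:
    print("No CUDA device found; check AMD/Intel backend support instead")
```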
Integration and Export
Integrates with the Hugging Face Trainer interface and exports models to GGUF (for llama.cpp) and to vLLM-compatible formats for deployment.
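Export to GGUF can look like the following, assuming Unsloth's save_pretrained_gguf helper; the output directory name is illustrative, and "q4_k_m" is one of llama.cpp's standard quantization presets:

```python
# Sketch of GGUF export, assuming Unsloth's save_pretrained_gguf helper.
# "q4_k_m" is a standard llama.cpp quantization preset.
model.save_pretrained_gguf(
    "exported-model",            # output directory (illustrative name)
    tokenizer,
    quantization_method="q4_k_m",
)
# The resulting .gguf file can be loaded by llama.cpp, and the merged
# Hugging Face weights can be served with vLLM.
```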