Multi-Cloud and On-Premises Support
Supports running AI workloads on over 20 cloud providers and Kubernetes clusters without rewriting job configurations.
Jobs as Code
Users define environments and jobs in YAML or via CLI, enabling portable and reproducible execution across infrastructures.
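As a rough illustration of a job defined as code, the sketch below writes a small task file from Python. The task name, the accelerator request, the `requirements.txt` and `train.py` files, and the `write_task` helper are placeholders assumed for this example. The field layout (`resources`, `setup`, `run`) is an assumption about the task YAML format and should be checked against the SkyPilot documentation.

```python
from pathlib import Path

# Hypothetical task definition: every name and command here is a placeholder.
TASK_YAML = """\
name: train-demo

resources:
  accelerators: A100:1   # assumed accelerator request
  use_spot: true         # assumed: opt in to spot capacity

setup: |
  pip install -r requirements.txt

run: |
  python train.py
"""


def write_task(path="task.yaml"):
    """Write the task definition to disk and return its path."""
    target = Path(path)
    target.write_text(TASK_YAML)
    return target


if __name__ == "__main__":
    print(write_task())
```

A file like this could then be handed to the CLI, for example with `sky launch task.yaml`, and reused unchanged on another provider.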
Automated Resource Management
Automates compute selection, provisioning, and management, including the use of spot/preemptible instances to reduce cost.
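The idea of cost-driven compute selection can be sketched as picking the cheapest offer that satisfies the constraints. This is a toy model, not the tool's actual optimizer: the `Offer` type, the `pick_cheapest` function, the provider names, and the prices are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    instance: str
    hourly_usd: float
    spot: bool


def pick_cheapest(offers, allow_spot=True):
    """Return the lowest-priced offer, optionally excluding spot capacity."""
    candidates = [o for o in offers if allow_spot or not o.spot]
    if not candidates:
        raise ValueError("no offer satisfies the constraints")
    return min(candidates, key=lambda o: o.hourly_usd)


# Made-up prices, for illustration only.
OFFERS = [
    Offer("cloud-a", "gpu-large", 3.20, spot=False),
    Offer("cloud-a", "gpu-large", 1.10, spot=True),
    Offer("cloud-b", "gpu-large", 2.90, spot=False),
]

if __name__ == "__main__":
    print(pick_cheapest(OFFERS))
    print(pick_cheapest(OFFERS, allow_spot=False))
```

Allowing spot capacity widens the candidate set, which is where the cost savings come from. The trade-off is that spot instances can be reclaimed by the provider.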
Job Queuing and Auto-Recovery
Queues and runs multiple jobs, and automatically recovers from failures without manual intervention.
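Queuing with automatic recovery can be pictured as a FIFO queue that puts a job back in line when it is interrupted. The sketch below is a simplified, single-process model under that assumption. `Preempted`, `run_queue`, `flaky`, and the retry limit are hypothetical names that do not come from the tool itself.

```python
from collections import deque


class Preempted(Exception):
    """Raised by a job to simulate losing its instance."""


def run_queue(jobs, max_retries=3):
    """Run (name, fn) jobs in FIFO order, requeueing preempted ones.

    A job is retried at most `max_retries` times before it is marked FAILED.
    """
    pending = deque((name, fn, 0) for name, fn in jobs)
    results = {}
    while pending:
        name, fn, failures = pending.popleft()
        try:
            results[name] = fn()
        except Preempted:
            if failures < max_retries:
                # Recovery: send the job to the back of the queue.
                pending.append((name, fn, failures + 1))
            else:
                results[name] = "FAILED"
    return results


def flaky(fail_times):
    """Build a job that is preempted `fail_times` times, then succeeds."""
    state = {"left": fail_times}

    def job():
        if state["left"] > 0:
            state["left"] -= 1
            raise Preempted()
        return "done"

    return job


if __name__ == "__main__":
    print(run_queue([("train", flaky(2)), ("eval", flaky(0))]))
```

Capping the number of retries keeps a job that can never succeed from occupying the queue indefinitely.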
Modular Installation
Installable via pip with cloud-specific extras to include only needed providers, e.g., `skypilot[kubernetes,aws,gcp]`.
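When the set of providers varies by environment, the requirement string can be assembled programmatically. The helper below is a hypothetical convenience, not part of the package. It only builds a pip requirement with extras from a list of provider names and does not check that each name is a valid extra.

```python
def skypilot_requirement(providers):
    """Build a pip requirement string with one extra per provider."""
    # dict.fromkeys drops duplicates while keeping the original order.
    extras = ",".join(dict.fromkeys(providers))
    return f"skypilot[{extras}]" if extras else "skypilot"


if __name__ == "__main__":
    print(skypilot_requirement(["kubernetes", "aws", "gcp"]))
```

The resulting string is what would be passed to pip, for example `pip install "skypilot[kubernetes,aws,gcp]"`. The quotes keep shells such as zsh from interpreting the square brackets.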