- Includes 1,865 real-world software engineering tasks drawn from 41 repositories.
- Reduces data contamination by combining GPL-licensed public data with proprietary private sets.
- Uses human-augmented task specifications to improve clarity without changing technical difficulty.
- Provides Docker environments for reproducible evaluations.
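
As a rough illustration of how a per-task Docker environment can make an evaluation reproducible, the sketch below assembles a `docker run` command that applies a candidate patch and runs the tests inside a pinned image. The image name, the `/patch.diff` mount point, the test command, and the helper `build_eval_command` are all hypothetical and are not taken from the benchmark's own tooling. Disabling the network is likewise an assumption that may not suit every task.

```python
import shlex


def build_eval_command(image, test_command, patch_path=None):
    """Assemble a `docker run` invocation for one task (hypothetical layout)."""
    # --rm discards the container afterwards, so each run starts from the image.
    # --network none is an assumption: it blocks network access during the run.
    cmd = ["docker", "run", "--rm", "--network", "none"]
    if patch_path is not None:
        # Mount the candidate patch read-only at an assumed path in the container.
        cmd += ["-v", f"{patch_path}:/patch.diff:ro"]
    # Run the test command through a shell inside the container.
    cmd += [image, "sh", "-c", test_command]
    return cmd


if __name__ == "__main__":
    # Placeholder image tag and paths, for illustration only.
    command = build_eval_command(
        "example/task-env:pinned-tag",
        "git apply /patch.diff && pytest -q",
        patch_path="/tmp/candidate.diff",
    )
    # Print the command instead of executing it, so the sketch runs without Docker.
    print(shlex.join(command))
```

Building the command as a list of arguments, rather than a single string, avoids shell-quoting problems on the host and makes it straightforward to pass to `subprocess.run` once a real image is available.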