What Makes It Special - tao-squared-bench

✨ Provides reproducible simulations of user-agent interactions for evaluating customer service agents across multiple domains.
✨ Includes updated leaderboards with recent model performance results.
✨ Offers domain-specific configurations and local API documentation for easy inspection.
✨ Actively maintained, with recent commits and releases that extend the original benchmark's capabilities.