🚀
Terminal-Bench 2.0
Terminal-Bench 2.0 is an open-source benchmark for evaluating AI agents on terminal-based software engineering tasks using containerized environments.
Agents & Automation