Getting Started

How to get started with Terminal-Bench 2.0

1

Install Terminal-Bench

Run `uv tool install terminal-bench` or `pip install terminal-bench` to install the package.
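To confirm the install worked, a quick check like the one below may help. It is a minimal sketch, not part of Terminal-Bench: the helper name `check_install` is hypothetical, and it relies only on the package name `terminal-bench` and the `tb` command mentioned on this page.

```python
import shutil
from importlib import metadata


def check_install(package: str = "terminal-bench", command: str = "tb") -> dict:
    """Report whether the package and its CLI entry point are visible.

    `check_install` is a hypothetical helper written for this guide.
    """
    try:
        # Version of the package as seen by the *current* Python environment.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    # Absolute path of the CLI executable on PATH, or None if it is missing.
    return {"package_version": version, "cli_path": shutil.which(command)}


if __name__ == "__main__":
    print(check_install())
```

If you installed with `uv tool install`, the package lives in its own isolated environment, so `package_version` can be `None` even though `cli_path` points at a working `tb` executable.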

2

Run Evaluations

Use the CLI commands `tb` or `tb run` to execute benchmark tasks and evaluate AI agents.
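This page does not list the options that `tb run` accepts, so the sketch below only shows how a script might call the CLI and print its built-in help. The helper names `build_tb_command` and `show_help` are hypothetical, and the `--help` flag is an assumption based on common CLI conventions rather than something confirmed here.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def build_tb_command(subcommand: Optional[str] = None,
                     extra_args: Sequence[str] = ()) -> List[str]:
    """Assemble an argument list such as ["tb", "run", "--help"]."""
    command = ["tb"]
    if subcommand:
        command.append(subcommand)
    command.extend(extra_args)
    return command


def show_help(subcommand: Optional[str] = None) -> int:
    """Print the CLI help text; return the process exit code."""
    if shutil.which("tb") is None:
        print("tb not found on PATH; install terminal-bench first")
        return 1
    # `--help` is assumed to be supported; it is not documented on this page.
    completed = subprocess.run(build_tb_command(subcommand, ["--help"]), check=False)
    return completed.returncode


if __name__ == "__main__":
    raise SystemExit(show_help("run"))
```

Passing the command as a list instead of a single shell string avoids quoting problems when arguments are added later.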

3

Configure Custom Docker Images

Set `use_prebuilt_image=false` in CLI commands or Python evaluation scripts to use custom Docker images.
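One detail to watch when producing this setting from a Python script: `str(False)` gives `False`, not the lowercase `false` shown above. The sketch below renders the setting in lowercase. `format_override` is a hypothetical helper, and how the resulting string is passed to the CLI is not specified on this page, so check the documentation for the exact mechanism.

```python
def format_override(key: str, value) -> str:
    """Render a key=value setting, writing booleans as lowercase true/false.

    Hypothetical helper; it only formats the string and does not call
    Terminal-Bench.
    """
    if isinstance(value, bool):
        rendered = "true" if value else "false"
    else:
        rendered = str(value)
    return f"{key}={rendered}"


print(format_override("use_prebuilt_image", False))  # use_prebuilt_image=false
```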

4

View Leaderboard

Access the public leaderboard at https://www.tbench.ai/leaderboard/terminal-bench/2.0 to compare agent performance.

5

Contribute Tasks or Adapters

Follow the documentation to add new tasks or adapters: place the files in the tasks folder and submit a pull request.