1
Clone Repository
Run `git clone https://github.com/sierra-research/tau2-bench && cd tau2-bench` to download the source code and change into the project directory.
2
Set Up Python Environment
Create and activate a Python 3.10+ virtual environment (optional but recommended).
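One way to do this is with Python's built-in `venv` module. This is a minimal sketch for a POSIX shell: the directory name `.venv` is an arbitrary choice, and it assumes `python3` on your PATH is version 3.10 or newer.

```shell
# Create an isolated environment in ./.venv (the name is an assumption; any path works)
python3 -m venv .venv

# Activate it for the current shell session (POSIX shells such as bash or zsh)
. .venv/bin/activate

# Show which interpreter is now active so you can confirm it is 3.10 or newer
python --version
```

On Windows the activation script lives under `.venv\Scripts` instead of `.venv/bin`.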
3
Install Dependencies
Install required packages as specified in the repository setup instructions.
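The exact install command is not given here, so treat the following as a hedged sketch rather than the project's documented procedure. It assumes the repository is a standard pip-installable Python project (with a `pyproject.toml` or `setup.py` at its root) and falls back to a reminder otherwise. The repository's own setup instructions take precedence.

```shell
# Run from the repository root, inside the activated virtual environment.
# Assumption: the project ships standard Python packaging metadata.
if [ -f pyproject.toml ] || [ -f setup.py ]; then
  # Editable install: local source changes take effect without reinstalling
  python -m pip install -e .
  INSTALL_ATTEMPTED=yes
else
  echo "No packaging metadata found here; follow the repository setup instructions."
  INSTALL_ATTEMPTED=no
fi
```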
4
View Domain Policies and API Docs
Run `tau2 env <domain>`, then open `http://127.0.0.1:8004/redoc` in a browser to read the domain policy and API documentation.
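If the page does not load, a quick reachability check can help tell a stopped server from a browser problem. This sketch is an illustrative addition, not part of the project: it assumes `curl` is installed and that `tau2 env <domain>` is already running in another terminal.

```shell
# URL taken from the step above; the check itself is an illustrative addition
DOCS_URL="http://127.0.0.1:8004/redoc"

# --fail turns HTTP error statuses into a non-zero exit code
if curl --fail --silent --output /dev/null "$DOCS_URL"; then
  echo "Docs server is reachable at $DOCS_URL"
else
  echo "Docs server is not reachable; start it with: tau2 env <domain>"
fi
```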
5
Run Evaluations
Use the provided scripts either to run specific tasks by ID or to evaluate overall agent performance.