Getting Started - swebench

1. Visit https://www.swebench.com to explore the available datasets and leaderboards.

2. Start with SWE-bench Lite (300 instances) for an initial evaluation; the smaller set reduces compute requirements.

3. Use the Harness API to configure Docker environments, apply model-generated patches, and run tests.

4. Submit your predictions.json file, which contains the model-generated patches, to the leaderboard to obtain a % Resolved score.

5. Contact support@swebench.com for custom datasets or to contribute to the benchmark.
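
The predictions file and the % Resolved score mentioned in the steps above can be sketched as follows. This is a minimal illustration, not the official tooling. The record fields (`instance_id`, `model_name_or_path`, `model_patch`) follow the convention used in the public SWE-bench repository as far as we know, so check them against the current documentation before submitting. The helper names, the instance ID, and the patch text are hypothetical, and % Resolved is assumed here to mean resolved instances divided by total instances, times 100.

```python
import json
from pathlib import Path


def build_prediction(instance_id: str, model_name: str, patch: str) -> dict:
    """Return one prediction record.

    The field names are an assumption based on the public SWE-bench
    repository; verify them before submitting.
    """
    return {
        "instance_id": instance_id,
        "model_name_or_path": model_name,
        "model_patch": patch,
    }


def write_predictions(predictions: list, path: str = "predictions.json") -> Path:
    """Serialize the prediction records to a JSON file and return its path."""
    out = Path(path)
    out.write_text(json.dumps(predictions, indent=2), encoding="utf-8")
    return out


def percent_resolved(resolved: int, total: int) -> float:
    """Assumed definition: share of instances whose patch resolves the issue."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * resolved / total


if __name__ == "__main__":
    # Hypothetical instance ID and placeholder patch, for illustration only.
    example = build_prediction(
        instance_id="example__repo-1",
        model_name="my-model",
        patch="diff --git a/file.py b/file.py\n",
    )
    print(write_predictions([example]))
    print(percent_resolved(resolved=75, total=300))
```

Keeping the file writer separate from the record builder makes it easy to append one record per instance as a model produces patches, then write the whole list once at the end.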