Use Cases

Real-world applications

Evaluating AI Models for Bug Fixing

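Researchers can use SWE-bench to measure how often the patches their large language models generate actually resolve real GitHub issues. A patch counts as a fix only if the repository's tests pass once it has been applied.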
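As a rough illustration of the scoring idea, the sketch below computes a resolved rate from per-test outcomes. It assumes the common SWE-bench convention that an instance is resolved when every FAIL_TO_PASS test passes after the patch and every PASS_TO_PASS test still passes. The `InstanceResult` class, the helper names, and the sample data are hypothetical and are not part of the SWE-bench harness.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceResult:
    """Hypothetical container for test outcomes after applying a model patch."""
    instance_id: str
    # Maps test name -> True if the test passed after the patch was applied.
    fail_to_pass: dict = field(default_factory=dict)
    pass_to_pass: dict = field(default_factory=dict)


def is_resolved(result: InstanceResult) -> bool:
    """Resolved: all target tests now pass and no previously passing test broke."""
    return (
        bool(result.fail_to_pass)
        and all(result.fail_to_pass.values())
        and all(result.pass_to_pass.values())
    )


def resolved_rate(results: list) -> float:
    """Fraction of instances resolved; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(is_resolved(r) for r in results) / len(results)


# Made-up outcomes for two illustrative instances.
results = [
    InstanceResult(
        "example__repo-1",
        fail_to_pass={"test_bug": True},
        pass_to_pass={"test_existing": True},
    ),
    InstanceResult(
        "example__repo-2",
        fail_to_pass={"test_bug": True},
        pass_to_pass={"test_existing": False},  # the patch caused a regression
    ),
]

rate = resolved_rate(results)
print(f"Resolved {rate:.0%} of instances")
```

The second instance is not counted as resolved even though its target test passes, because the patch broke a test that passed before.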

Training AI Agents on Software Engineering Tasks

Developers can use the SWE-smith dataset and SWE-bench subsets to train models on realistic issue-resolution scenarios.
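One possible way to prepare such data for supervised fine-tuning is to turn each task instance into a prompt/completion pair. The sketch below assumes each record exposes `repo`, `problem_statement`, and `patch` fields, as SWE-bench-style records commonly do. The exact schema of a given dataset may differ, and the `to_training_example` helper and the sample record are hypothetical.

```python
def to_training_example(instance: dict) -> dict:
    """Convert one issue record into a prompt/completion pair.

    Assumes the record has 'repo', 'problem_statement', and 'patch' keys;
    adjust the key names to match the dataset actually in use.
    """
    prompt = (
        f"Repository: {instance['repo']}\n\n"
        f"Issue:\n{instance['problem_statement'].strip()}\n\n"
        "Write a patch in unified diff format that resolves the issue."
    )
    return {"prompt": prompt, "completion": instance["patch"]}


# Made-up record for illustration only; not taken from any dataset.
sample = {
    "repo": "example/project",
    "problem_statement": "  parse_date() raises ValueError on ISO week dates.  ",
    "patch": (
        "diff --git a/project/dates.py b/project/dates.py\n"
        "--- a/project/dates.py\n"
        "+++ b/project/dates.py\n"
    ),
}

example = to_training_example(sample)
print(example["prompt"])
```

Keeping the gold patch as the completion is only one design choice; agent-style training would instead record the sequence of tool calls and edits that led to the patch.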