Use Cases

Real-world applications

Benchmarking LLM Coding Performance

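Researchers and developers can evaluate the coding abilities of large language models on recently published competitive programming problems. Because the problems postdate a model's training data, the model is unlikely to have seen them during training, which reduces the risk of contaminated results.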
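The idea of scoring a model only on problems it could not have seen can be sketched as a date filter followed by a pass rate. This is a minimal illustration, not LiveCodeBench's own code: the `Problem` record, its field names, the cutoff date, and the pass/fail outcomes are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Problem:
    # Hypothetical record; field names are illustrative, not the benchmark's schema.
    problem_id: str
    release_date: date


def unseen_problems(problems, training_cutoff):
    """Keep only problems released strictly after the model's training cutoff."""
    return [p for p in problems if p.release_date > training_cutoff]


def pass_at_1(results):
    """Fraction of problems whose single generated solution passed all tests."""
    if not results:
        return 0.0
    return sum(1 for passed in results.values() if passed) / len(results)


# Toy data: three problems with invented release dates.
problems = [
    Problem("a", date(2023, 5, 1)),
    Problem("b", date(2023, 11, 15)),
    Problem("c", date(2024, 2, 3)),
]
cutoff = date(2023, 9, 1)  # assumed training cutoff, for illustration only

fresh = unseen_problems(problems, cutoff)

# Invented pass/fail outcomes for the model's one attempt per problem.
outcomes = {"b": True, "c": False}
score = pass_at_1({p.problem_id: outcomes[p.problem_id] for p in fresh})
```

Filtering before scoring keeps older problems, which may have leaked into training data, from inflating the reported number.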

Testing Code Generation and Repair

Use LiveCodeBench to assess not only code generation but also a model's ability to repair its own faulty code (self-repair) and to predict the expected output of a test.
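A generate-then-repair evaluation can be sketched as a small loop: run the candidate against the tests, and on failure hand the error back to the model for another attempt. The sketch below is an assumption-laden toy, not the benchmark's harness: `generate` and `repair` are hypothetical callables standing in for model calls, the candidate is assumed to define a function named `solve`, and `exec` is used without any sandboxing, which is only acceptable for trusted input.

```python
from typing import Callable, List, Optional, Tuple

TestCase = Tuple[tuple, object]  # (positional arguments, expected return value)


def run_tests(source: str, tests: List[TestCase]) -> Optional[str]:
    """Return None if every test passes, otherwise a short error message."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # unsandboxed: toy example, trusted input only
        solve = namespace["solve"]
        for args, expected in tests:
            actual = solve(*args)
            if actual != expected:
                return f"solve{args} returned {actual!r}, expected {expected!r}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def generate_then_repair(
    generate: Callable[[], str],
    repair: Callable[[str, str], str],
    tests: List[TestCase],
    max_repairs: int = 1,
) -> Tuple[bool, int]:
    """Return (passed, number_of_repairs_used)."""
    source = generate()
    for attempt in range(max_repairs + 1):
        error = run_tests(source, tests)
        if error is None:
            return True, attempt
        if attempt < max_repairs:
            # Feed the failing code and the error message back to the model.
            source = repair(source, error)
    return False, max_repairs


# Stub "model": the first draft has a bug, the repair step fixes it.
def buggy_draft() -> str:
    return "def solve(a, b):\n    return a - b\n"


def stub_repair(source: str, error: str) -> str:
    return "def solve(a, b):\n    return a + b\n"


tests = [((1, 2), 3), ((0, 0), 0)]
result = generate_then_repair(buggy_draft, stub_repair, tests)
```

Reporting the number of repairs used alongside the pass/fail flag makes it possible to separate first-attempt success from success that required feedback.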