COR Brief

LiveCodeBench

LiveCodeBench is an open-source benchmark designed to evaluate large language models (LLMs) on coding tasks derived from competitive programming contests. It continuously collects problems from platforms such as LeetCode, AtCoder, and CodeForces, ensuring that the problems used for evaluation are released after the model's training cutoff date to prevent data contamination. The benchmark includes over 1,000 problems spanning easy to hard difficulty levels as of its latest release (v6). LiveCodeBench assesses multiple aspects of coding capabilities including code generation, self-repair, code execution, and test output prediction, using execution-based accuracy metrics with hidden test cases for functional correctness.
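To make the execution-based scoring concrete, here is a minimal sketch of how a candidate program can be checked against hidden test cases; the helper name and test layout are illustrative assumptions, not LiveCodeBench's actual API.

```python
# Sketch of execution-based functional-correctness checking with hidden
# test cases, assuming stdin/stdout-style problems as on CodeForces/AtCoder.
# passes_hidden_tests is a hypothetical helper, not the benchmark's real API.
import subprocess
import sys

def passes_hidden_tests(solution_code: str, hidden_tests: list[tuple[str, str]]) -> bool:
    """Run the candidate program on each hidden input and compare its stdout."""
    for stdin_data, expected_stdout in hidden_tests:
        result = subprocess.run(
            [sys.executable, "-c", solution_code],
            input=stdin_data, capture_output=True, text=True, timeout=10,
        )
        if result.returncode != 0 or result.stdout.strip() != expected_stdout.strip():
            return False  # functional correctness requires passing every hidden test
    return True

# A trivial "double the input" problem with two hidden test cases:
code = "print(int(input()) * 2)"
print(passes_hidden_tests(code, [("3", "6"), ("10", "20")]))  # True
```

Scoring by actual execution, rather than by string similarity to a reference solution, is what lets the benchmark accept any correct implementation.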

Updated Dec 17, 2025 · open-source

LiveCodeBench provides contamination-free, time-annotated evaluation of LLMs on competitive programming problems across multiple coding scenarios.

Pricing: open-source
Category: Code & Development
1. Automatically gathers new coding problems from live contests on LeetCode, AtCoder, and CodeForces to maintain an up-to-date benchmark.
2. Annotates problems with release dates to enable evaluation on data unseen during model training, supporting contamination-free benchmarking.
3. Supports code generation, self-repair, code execution, and test output prediction to comprehensively assess LLM coding capabilities.
4. Uses hidden test cases to measure functional correctness of generated code through actual code execution.
5. Provides a reproducible evaluation framework and a leaderboard to compare LLM performance across difficulty levels.
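The contamination-free evaluation in point 2 amounts to filtering problems by release date against a model's training cutoff. A minimal sketch, with field names assumed for illustration rather than taken from the benchmark's actual schema:

```python
# Sketch of time-annotated, contamination-free filtering: keep only
# problems released after the model's training cutoff. The dict keys
# "title" and "release_date" are assumptions for this example.
from datetime import date

def contamination_free(problems: list[dict], cutoff: date) -> list[dict]:
    """Select problems the model could not have seen during training."""
    return [p for p in problems if p["release_date"] > cutoff]

problems = [
    {"title": "old-problem", "release_date": date(2023, 1, 15)},
    {"title": "new-problem", "release_date": date(2024, 6, 1)},
]
# A model trained through 2023-09-01 is evaluated only on the newer problem.
print([p["title"] for p in contamination_free(problems, date(2023, 9, 1))])
# → ['new-problem']
```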

Benchmarking LLM Coding Performance

Researchers and developers can evaluate the coding abilities of large language models on recent competitive programming problems that the models have not seen during training.

Testing Code Generation and Repair

Use LiveCodeBench to assess not only code generation but also the model's ability to self-repair code and predict test outputs.
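The test output prediction scenario can be sketched as follows: the model predicts what a program prints for a given input, and that prediction is scored against the real execution result. The helper names here are illustrative, not LiveCodeBench's actual API.

```python
# Sketch of the test-output-prediction scenario: score a model's predicted
# program output against the output obtained by actually running the code.
# actual_output and score_prediction are hypothetical helpers.
import contextlib
import io

def actual_output(code: str, stdin_line: str) -> str:
    """Execute the snippet with a stubbed input() and capture its stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {"input": lambda: stdin_line})
    return buf.getvalue().strip()

def score_prediction(code: str, stdin_line: str, predicted: str) -> bool:
    """A prediction is correct only if it matches the real execution output."""
    return predicted.strip() == actual_output(code, stdin_line)

snippet = "print(sum(int(x) for x in input().split()))"
print(score_prediction(snippet, "1 2 3", "6"))  # True: the model predicted correctly
print(score_prediction(snippet, "1 2 3", "5"))  # False
```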

1. Clone the Repository
Run git clone https://github.com/LiveCodeBench/LiveCodeBench.git and navigate into the directory with cd LiveCodeBench.
2. Install Dependencies
Install the project's dependencies with the uv package manager, then confirm the installation completed successfully.
3. Download Dataset Release
Download a dataset release such as release_v6, which contains 1055 problems.
4. Run Evaluations
Use the provided scripts to run evaluations on supported LLMs for scenarios such as code generation.
5. View Leaderboard
Submit results and view the leaderboard on the official website to compare model performance.
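Once evaluations finish, leaderboard-style comparison boils down to aggregating per-problem pass/fail results by difficulty. A minimal sketch, where the record layout is an assumption for illustration rather than LiveCodeBench's actual output format:

```python
# Sketch of leaderboard-style aggregation: compute per-difficulty pass
# rates from a list of evaluation records. The record fields "difficulty"
# and "passed" are assumptions for this example.
from collections import defaultdict

def pass_rate_by_difficulty(results: list[dict]) -> dict[str, float]:
    """Fraction of problems solved at each difficulty level."""
    totals, passed = defaultdict(int), defaultdict(int)
    for r in results:
        totals[r["difficulty"]] += 1
        passed[r["difficulty"]] += r["passed"]
    return {d: passed[d] / totals[d] for d in totals}

results = [
    {"difficulty": "easy", "passed": True},
    {"difficulty": "easy", "passed": False},
    {"difficulty": "hard", "passed": False},
]
print(pass_rate_by_difficulty(results))  # {'easy': 0.5, 'hard': 0.0}
```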
Pricing
Model: open-source

LiveCodeBench is free and open-source with no paid plans.

Assessment
Strengths
  • Collects problems shortly after contests to avoid training data contamination.
  • Includes 1055 problems across difficulty levels in the latest release.
  • Evaluates multiple coding capabilities beyond code generation.
  • Provides time-annotated problems for testing model generalization.
  • Open-source with a reproducible evaluation toolkit.
Limitations
  • The official repository had bugs that skewed scores by up to 50%, since fixed through community pull requests.
  • Limited to Python solutions and competitive programming problems.
  • Relies on external contest platforms for problem sourcing.