MMLURO

• A comprehensive, reasoning-heavy benchmark suite for evaluating large language models' multi-task understanding and reasoning

• Category: AI Benchmarking & Evaluation