SWE-fficiency: Evaluating Not Just an AI's Bug-Finding Ability, But the Efficiency of Its Fixes
A new benchmark assesses how quickly and accurately AI agents fix code, not just whether they can identify problems. It takes into account time, number of attempts, and real-world working conditions.