
Model Benchmarks

Evaluating the effectiveness of complex systems requires tools that eliminate subjectivity. This collection brings together materials on testing methodology and comparative performance analysis across a range of models, from mathematical algorithms to predictive frameworks in economics and technology. We focus not merely on recording figures, but on deconstructing the evaluation criteria themselves: how relevant existing metrics actually are, which aspects of performance fall into the "blind spots" of standard tests, and how to interpret results without the marketing hype.

A large-scale test of 16 AI models on real-world documents revealed surprising results: expensive solutions don't always outperform their more affordable counterparts.

Nanonets (nanonets.com) — Mar 20, 2026

The Cursor team shared how they refined Bugbot, their tool for automated bug detection, using a specialized AI-based evaluation metric.

Cursor AI (cursor.com) — Jan 16, 2026
