Published on March 14, 2026

RAFFLES: How to Teach AI to Identify Its Reasoning Errors

Researchers have proposed a new approach to evaluating AI responses that, instead of delivering a simple yes/no verdict, attempts to identify the reasons behind errors.

Event Source: Capital One

When a language model gives an incorrect answer, the first question developers ask is why. Not just whether something went wrong, but specifically why: which part of the reasoning broke, and at what point the model took a wrong turn. In practice, this turns out to be a surprisingly difficult task, and it's precisely the one the RAFFLES system aims to solve.

Evaluating AI – A Task No Simpler Than AI Itself

The standard approach to evaluating a model's quality looks something like this: take the answer, compare it to a reference, and assign a score. This works as long as we're dealing with simple, unambiguous tasks. But when a model is solving something multi-step – analyzing a document, building a line of reasoning, drawing a conclusion – this approach starts to fail. It doesn't explain where exactly the error occurred.

Simply put: knowing the answer is wrong is useful. Knowing at which step the reasoning went astray is significantly more useful.
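To make the contrast concrete, here is a minimal sketch in Python. It is not taken from the RAFFLES paper: the `exact_match_score` function, the `Attribution` record, and the sample steps are all hypothetical names and data chosen for illustration. The first function gives only a verdict, while the record also carries the position of the faulty step and a reason.

```python
from dataclasses import dataclass
from typing import Optional


def exact_match_score(answer: str, reference: str) -> int:
    """Verdict-only scoring: 1 if the answer matches the reference, else 0."""
    return int(answer.strip().lower() == reference.strip().lower())


@dataclass
class Attribution:
    """Hypothetical richer result: a verdict plus where and why it failed."""
    correct: bool
    faulty_step: Optional[int] = None  # zero-based index of the first bad step
    reason: Optional[str] = None       # short explanation of the fault


# Illustrative multi-step answer; the mistake is in the second step (index 1).
steps = [
    "Revenue is 120",
    "Costs are 80, so profit is 50",
    "Profit margin is 50/120",
]

# The score only says the final answer is wrong.
score = exact_match_score("50/120", "40/120")

# The attribution, filled in by hand here, also says where and why.
richer = Attribution(correct=False, faulty_step=1, reason="120 - 80 is 40, not 50")

print(score)
print(steps[richer.faulty_step], "->", richer.reason)
```

The score alone cannot distinguish this answer from one that failed at the very first step; the attribution record can.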

RAFFLES is an evaluation architecture that approaches the problem differently. Instead of just delivering a verdict, it tries to attribute the error – that is, to determine exactly where and why something went wrong. The evaluation process itself is built on reasoning and iterative refinement.

What Does «Reasoning» Mean in the Context of Evaluation?

The idea is that the evaluator – in this case, also a language model – doesn't just look at the final result but breaks the answer down step by step. It's as if it asks itself questions like: "Was this conclusion drawn correctly? Where did this statement come from? Does this align with the original text?"

This is similar to how a teacher grades an assignment: they care not only about the final number but also about the solution process. An error at the beginning of the reasoning can lead to a plausible-sounding but incorrect conclusion – and conversely, a correct answer might be reached by chance through a flawed chain of steps.

RAFFLES tries to catch precisely this: not just an error in the output, but the breaking point in the logic.
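The idea of locating the breaking point can be sketched as a walk over the reasoning steps that stops at the first one failing a check. This is only an illustration under strong assumptions: in a system like the one described, the checker would be a language model, whereas here `check_arithmetic` is a deterministic stand-in, and the `Step` format and `first_faulty_step` helper are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical step format: (left operand, operator, right operand, claimed result).
Step = Tuple[int, str, int, int]


def check_arithmetic(step: Step) -> bool:
    """Stand-in checker: recompute the step and compare with the claimed result."""
    left, op, right, claimed = step
    actual = {"+": left + right, "-": left - right, "*": left * right}[op]
    return actual == claimed


def first_faulty_step(
    steps: List[Step], checker: Callable[[Step], bool]
) -> Optional[int]:
    """Return the index of the first step the checker rejects, or None."""
    for index, step in enumerate(steps):
        if not checker(step):
            return index
    return None


# Step 0 is wrong (120 - 80 is 40). Step 1 is locally valid (50 * 2 is 100)
# but inherits the bad value, so the final result is still incorrect.
chain = [(120, "-", 80, 50), (50, "*", 2, 100)]

print(first_faulty_step(chain, check_arithmetic))
```

Note that the second step passes the check on its own terms. That is the situation described above: an early error produces a plausible-looking but wrong conclusion, and only a step-level check points back to where it started.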

Iterative Refinement: When the First Look Isn't Final

The second key element of the approach is iteration. The evaluation doesn't happen in a single pass but over several stages. The evaluator model forms a preliminary conclusion, then revisits, re-examines, and refines it.

This is important for the same reason people write drafts: the first judgment isn't always the most accurate. This is especially true for complex, multi-part answers where the sequence of details matters.

This approach allows for more than just a mechanical comparison of the answer to a reference; it leads to a more balanced and substantiated conclusion, specifying the concrete reasons for any discrepancies.
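A propose, verify, and revise loop of this kind might look roughly like the sketch below. The article does not describe the actual RAFFLES procedure in enough detail to reproduce it, so everything here is an assumption: `refine_attribution`, the two toy callbacks, the `[wrong]` marker, and the round limit are hypothetical, and the callbacks stand in for what would be model calls in practice.

```python
from typing import Callable, List, Optional, Tuple


def refine_attribution(
    steps: List[str],
    propose: Callable[[List[str], List[int]], Optional[int]],
    verify: Callable[[List[str], int], bool],
    max_rounds: int = 3,
) -> Tuple[Optional[int], List[int]]:
    """Propose a faulty step, verify it, and retry using rejected guesses as feedback."""
    rejected: List[int] = []
    for _ in range(max_rounds):
        candidate = propose(steps, rejected)
        if candidate is None:
            break  # nothing left to propose
        if verify(steps, candidate):
            return candidate, rejected
        rejected.append(candidate)  # remember the miss for the next round
    return None, rejected


def propose_last_unrejected(steps: List[str], rejected: List[int]) -> Optional[int]:
    """Toy proposer: blame the final step first, then work backwards."""
    for index in reversed(range(len(steps))):
        if index not in rejected:
            return index
    return None


def verify_contains_marker(steps: List[str], index: int) -> bool:
    """Toy verifier: accept a step only if it carries the planted marker."""
    return "[wrong]" in steps[index]


steps = [
    "Read the contract",
    "Clause 4 sets the fee at 5% [wrong]",
    "So the fee on 200 is 10",
]
fault, rejected = refine_attribution(
    steps, propose_last_unrejected, verify_contains_marker
)
print(fault, rejected)  # 1 [2]
```

The first guess blames the final step, where the error becomes visible; the second pass moves the blame to the step where it originated. The round limit is a design choice to keep evaluation cost bounded.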

Why Is This Needed in Practice?

If you work with language models in any applied context – be it automated document processing, customer support, or something else – sooner or later you'll need to understand how well the model is performing. And what's important here isn't just the percentage of correct answers, but an understanding of error patterns: Does the model systematically misinterpret the prompt? Does it lose context in long texts? Does it draw false conclusions from correct premises?

Without tools that can attribute errors, this understanding remains intuitive. RAFFLES offers a way to make it more systematic.
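Once individual errors carry an attributed cause, turning them into a systematic picture can be as simple as counting. The sketch below assumes each evaluated answer has already been labeled with an error category; the category names, record layout, and `error_profile` helper are hypothetical and not part of RAFFLES.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical attribution records, one per failed answer.
attributions: List[Dict[str, str]] = [
    {"id": "a1", "category": "misread_prompt"},
    {"id": "a2", "category": "lost_context"},
    {"id": "a3", "category": "misread_prompt"},
    {"id": "a4", "category": "invalid_inference"},
    {"id": "a5", "category": "misread_prompt"},
]


def error_profile(records: List[Dict[str, str]]) -> Dict[str, float]:
    """Share of each error category, most frequent first."""
    counts = Counter(record["category"] for record in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.most_common()}


profile = error_profile(attributions)
for category, share in profile.items():
    print(f"{category}: {share:.0%}")
```

With this toy data, prompt misreading accounts for three of the five failures, which is the kind of pattern that a bare accuracy figure would hide.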

The work was presented at EACL (the conference of the European Chapter of the Association for Computational Linguistics), one of the key scientific venues in the field of natural language processing. This indicates that the approach has undergone academic peer review rather than appearing only in a blog post.

What Remains Unanswered

RAFFLES is an architectural approach described in a research paper, not a ready-made product that you can download and apply to any task. How well it generalizes to different types of tasks and different models is a question that will require further investigation.

Furthermore, when one model is used to evaluate another, a legitimate question arises about the reliability of the evaluator itself. If it has its own blind spots or systematic biases, these will inevitably affect the result. This is a general problem with the "model evaluates model" approach, and RAFFLES is no exception.

Nevertheless, the principle itself – evaluation through reasoning with error attribution – sounds like a step toward more meaningful diagnostics for language models. This is especially relevant now, as models are increasingly used in tasks where the cost of an error is significant.

Original Title: RAFFLES: reasoning-based attribution of faults
Publication Date: Mar 24, 2026
Source: Capital One (www.capitalone.com), a U.S.-based financial technology corporation applying artificial intelligence and machine learning to banking services, data analytics, and financial process automation.

From Source to Analysis

How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item itself was selected as an event important for understanding AI development. Then a processing framework was set: what needs clarification, what context to add, and where to place emphasis. This allowed us to turn a single announcement or update into a coherent and meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role: analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Claude Sonnet 4.6 (Anthropic): Analyzing the Original Publication and Writing the Text. The neural network studies the original material and generates a coherent text.
2. Gemini 2.5 Pro (Google DeepMind): Translation into English.
3. Gemini 2.5 Flash (Google DeepMind): Text Review and Editing. Correction of errors, inaccuracies, and ambiguous phrasing.
4. DeepSeek-V3.2 (DeepSeek): Preparing the Illustration Description. Generating a textual prompt for the visual model.
5. FLUX.2 Pro (Black Forest Labs): Creating the Illustration. Generating an image based on the prepared prompt.