Published on March 14, 2026

RAFFLES: How to Teach AI to Identify Its Reasoning Errors

Researchers have proposed a new approach to evaluating AI responses: instead of issuing a simple «yes/no» verdict, it tries to identify the reasons behind errors.

Event Source: Capital One

When a language model gives an incorrect answer, the first question developers ask is why. Not «what went wrong», but specifically why: which part of the reasoning broke, at what point did the model take a wrong turn. In practice, this turns out to be a surprisingly difficult task, and it's precisely the one the RAFFLES system aims to solve.

Evaluating AI: A Complex Task Beyond Simple Scoring

The standard approach to evaluating a model's quality looks something like this: take the answer, compare it to a reference, and assign a score. This works as long as we're dealing with simple, unambiguous tasks. But when a model is solving something multi-step – analyzing a document, building a line of reasoning, drawing a conclusion – this approach falls short: a single score shows that the answer is wrong, but not where exactly the error occurred.

Simply put: knowing the answer is wrong is useful. Knowing at which step the reasoning went astray is significantly more useful.
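To make the limitation concrete, here is a minimal sketch of reference-based scoring. The function name, the normalization rule, and the invoice example are illustrative assumptions, not something taken from the RAFFLES paper.

```python
def exact_match_score(answer: str, reference: str) -> int:
    """Return 1 if the normalized answer equals the normalized reference, else 0."""
    def normalize(text: str) -> str:
        # Lowercase and collapse whitespace so trivial formatting differences do not count.
        return " ".join(text.lower().split())
    return int(normalize(answer) == normalize(reference))


# Hypothetical multi-step trace that ends in a wrong answer.
trace = [
    "Invoice total is 120 EUR.",
    "VAT at 20% is 12 EUR.",       # arithmetic slip: 20% of 120 is 24
    "Total with VAT is 132 EUR.",
]

score = exact_match_score("132 EUR", "144 EUR")
print(score)  # 0
```

The score correctly reports a failure, but nothing in it points to the second step of the trace as the place where the reasoning broke.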

RAFFLES is an evaluation architecture that approaches the problem differently. Instead of just delivering a verdict, it tries to attribute the error – that is, to determine exactly where and why something went wrong. The evaluation process itself is built on reasoning and iterative refinement.

Reasoning in AI Evaluation: Understanding Step-by-Step Logic

The idea is that the evaluator – in this case, also a language model – doesn't just look at the final result but breaks down the answer step-by-step. It's as if it asks itself questions like: «Was this conclusion drawn correctly? Where did this statement come from? Does this align with the original text?»

This is similar to how a teacher grades an assignment: they care not only about the final number but also about the solution process. An error at the beginning of the reasoning can lead to a plausible-sounding but incorrect conclusion – and conversely, a correct answer might be reached by chance through a flawed chain of steps.

RAFFLES tries to catch precisely this: not just an error in the output, but the breaking point in the logic.
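The idea of locating the breaking point can be illustrated with a toy example. In the sketch below, a deterministic arithmetic check stands in for the language-model evaluator described above; the `Step` type, the `recompute` callbacks, and the invoice data are all hypothetical and only show the shape of step-level checking.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    claim: str                                 # the step as the model stated it
    value: float                               # the number the step asserts
    recompute: Callable[[List[float]], float]  # toy checker: derives the value from earlier steps


def first_faulty_step(steps: List[Step], tol: float = 1e-9) -> Optional[int]:
    """Return the zero-based index of the first step that fails its check, or None."""
    verified: List[float] = []
    for index, step in enumerate(steps):
        expected = step.recompute(verified)
        if abs(expected - step.value) > tol:
            return index
        verified.append(step.value)
    return None


trace = [
    Step("Invoice total is 120 EUR", 120.0, lambda prev: 120.0),
    Step("VAT at 20% is 12 EUR", 12.0, lambda prev: prev[0] * 0.20),
    Step("Total with VAT is 132 EUR", 132.0, lambda prev: prev[0] + prev[1]),
]

print(first_faulty_step(trace))  # 1
```

Note that the last step is consistent with its inputs (120 + 12 = 132). Blaming the final answer would miss the actual fault, which sits one step earlier.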

Iterative Refinement: Enhancing AI Evaluation Accuracy

The second key element of the approach is iteration. The evaluation doesn't happen in a single pass but over several stages. The evaluator model forms a preliminary conclusion, then revisits, re-examines, and refines it.

This is important for the same reason people write drafts: the first judgment isn't always the most accurate. This is especially true for complex, multi-part answers where the sequence of details matters.

As a result, the evaluation goes beyond a mechanical comparison of the answer to a reference: it produces a more balanced, better-substantiated conclusion that names the concrete reasons for any discrepancy.
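A generic propose, critique, revise loop gives a rough picture of such an iterative evaluation. This is a sketch under stated assumptions, not the algorithm from the paper: `propose` and `critique` are hypothetical stand-ins for language-model calls, and the toy versions below are hard-coded purely so the loop can run.

```python
from typing import Callable, List, Optional, Tuple

# propose(trace, feedback) -> index of the step currently blamed
Proposer = Callable[[List[str], List[str]], int]
# critique(trace, blamed) -> None if the attribution holds, otherwise an objection
Critic = Callable[[List[str], int], Optional[str]]


def refine_attribution(
    trace: List[str],
    propose: Proposer,
    critique: Critic,
    max_rounds: int = 3,
) -> Tuple[int, List[str]]:
    """Revise the blamed step until the critic accepts it or the rounds run out."""
    feedback: List[str] = []
    blamed = propose(trace, feedback)
    for _ in range(max_rounds):
        objection = critique(trace, blamed)
        if objection is None:
            return blamed, feedback
        feedback.append(objection)
        blamed = propose(trace, feedback)
    # Rounds exhausted: the last proposal is returned without a final check.
    return blamed, feedback


# Toy assumption: steps 0 and 2 follow correctly from their inputs.
LOCALLY_VALID = {0, 2}


def toy_propose(trace: List[str], feedback: List[str]) -> int:
    # Start by blaming the last step, then move one step earlier per objection.
    return max(0, len(trace) - 1 - len(feedback))


def toy_critique(trace: List[str], blamed: int) -> Optional[str]:
    if blamed in LOCALLY_VALID:
        return f"step {blamed} follows from its inputs; look earlier"
    return None


trace = ["total = 120", "vat = 12", "gross = 132"]
blamed, feedback = refine_attribution(trace, toy_propose, toy_critique)
print(blamed)    # 1
print(feedback)  # ['step 2 follows from its inputs; look earlier']
```

The first guess blames the final step; one round of critique moves the attribution to the step where the arithmetic actually went wrong, and the collected objections double as a record of why the verdict changed.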

Why Error Attribution is Crucial for AI in Practice

If you work with language models in any applied context – be it automated document processing, customer support, or something else – sooner or later you'll need to understand how well the model is performing. And what's important here isn't just the percentage of correct answers, but an understanding of error patterns: Does the model systematically misinterpret the prompt? Does it lose context in long texts? Does it draw false conclusions from correct premises?

Without tools that can attribute errors, this understanding remains intuitive. RAFFLES offers a way to make it more systematic.
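Once each failed case carries an attribution, spotting patterns becomes a counting exercise. The records and category names below are invented for illustration; a real taxonomy would come from whatever the evaluator actually reports.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical attribution records, one per failed case.
attributions: List[Dict[str, object]] = [
    {"case": "doc-001", "step": 0, "category": "misread_prompt"},
    {"case": "doc-002", "step": 4, "category": "lost_context"},
    {"case": "doc-003", "step": 2, "category": "false_inference"},
    {"case": "doc-004", "step": 5, "category": "lost_context"},
    {"case": "doc-005", "step": 3, "category": "false_inference"},
    {"case": "doc-006", "step": 6, "category": "lost_context"},
]


def error_profile(records: List[Dict[str, object]]) -> List[Tuple[str, int, float]]:
    """Return (category, count, share of all failures), most frequent first."""
    counts = Counter(str(record["category"]) for record in records)
    total = sum(counts.values())
    return [(category, n, n / total) for category, n in counts.most_common()]


for category, n, share in error_profile(attributions):
    print(f"{category}: {n} ({share:.0%})")
```

In this made-up sample, half of the failures come from lost context, which is the kind of systematic signal that a bare accuracy percentage would hide.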

The work was presented at the EACL conference – one of the key scientific venues in the field of natural language processing. This indicates that the approach has undergone academic peer review and wasn't just published in a blog post.

RAFFLES: Unanswered Questions and Future Directions in AI Evaluation

RAFFLES is an architectural approach described in a research paper, not a ready-made product that you can download and apply to any task. How well it generalizes to different types of tasks and different models is a question that will require further investigation.

Furthermore, when one model is used to evaluate another, a legitimate question arises about the reliability of the evaluator itself. If it has its own blind spots or systematic biases, this will inevitably affect the result. This is a general problem with the «model evaluates model» approach, and RAFFLES is no exception.

Nevertheless, the principle itself – evaluation through reasoning with error attribution – sounds like a step toward more meaningful diagnostics for language models. This is especially relevant now, as models are increasingly used in tasks where the cost of an error is significant.

Link to Original: https://www.capitalone.com/site/tech/publications/raffles-reasoning-based-attribution-of-faults/

Original Title: RAFFLES: reasoning-based attribution of faults

Publication Date: Mar 24, 2026

Capital One (www.capitalone.com): a U.S.-based financial technology corporation applying artificial intelligence and machine learning to banking services, data analytics, and financial process automation.
