Published on March 19, 2026

How to Evaluate AI Progress: DeepMind's Framework for AGI

How to Measure Our Proximity to True AI: Google DeepMind Proposes a New Framework

Google DeepMind has introduced a cognitive framework for assessing progress toward artificial general intelligence (AGI) and launched a Kaggle hackathon to develop relevant benchmarks.

Research
Event Source: Google DeepMind, 4–6 min read

One of the biggest questions in artificial intelligence sounds simple yet remains difficult to answer: how do we know we are getting closer to true AI – the kind that can think as flexibly as a human? Google DeepMind has decided to tackle this question systematically by introducing its own conceptual framework for measuring progress toward artificial general intelligence (AGI).

AGI: What It Is and How It Differs from Today's AI

AGI Isn't Just a Smart Program

Before we discuss measurements, it's worth clarifying what AGI is. In short, it's a hypothetical AI capable of solving any intellectual task as well as – or better than – a human. Not just playing chess or writing text, but truly any task, including those it has never encountered before.

Today's systems, even the most powerful language models, can do a lot, but they operate within fairly rigid limits. They excel at the kinds of tasks they were trained on but often falter where a human would easily adapt. The gap between “smart AI” and “true general intelligence” is therefore still huge, which is precisely why the question of how to measure progress along the way is becoming more relevant.

Measuring AGI: Why It Is a Hard Problem

Measuring Something That Doesn't Yet Exist Is No Easy Task

The problem is that we still don't have a universally accepted way to assess how close any given system is to AGI. Existing tests and benchmarks – that is, sets of tasks used to compare models – typically check for something specific: how well a model translates text, solves math problems, or writes code. But none of them provides a holistic picture.

This is where DeepMind is taking a step forward. The company is proposing a cognitive framework – a set of principles and categories that describe intelligence not by narrow skills, but by more fundamental cognitive abilities. Simply put, they want to measure not “what the model can do,” but “how it thinks and how flexibly.”

DeepMind's Principles for Evaluating Intelligence

What Exactly Is DeepMind Proposing?

The approach is based on the idea that intelligence can be broken down into several key cognitive dimensions. This isn't just a list of skills – it's an attempt to describe the very structure of thought. Under the proposed system, the evaluation looks not only at whether the AI completed the task, but also at how it did so: whether it relied on generalization, abstraction, reasoning, learning by analogy, and so on.

This approach allows progress to be tracked not as “leaps” from one high-profile result to another, but as a gradual movement across multiple dimensions simultaneously. This is closer to how scientists assess the development of intelligence in humans or animals – through a set of cognitive abilities rather than a single test.
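To make the idea of tracking progress across several dimensions more concrete, here is a minimal Python sketch. It is an illustration only, not DeepMind's method: the dimension names are borrowed from the examples mentioned above, while `CognitiveProfile`, `progress`, the model names, and every score are hypothetical placeholders.

```python
from dataclasses import dataclass

# Dimension names follow the examples given in the article; the actual
# framework may define a different set. This tuple is an assumption.
DIMENSIONS = ("generalization", "abstraction", "reasoning", "analogy")


@dataclass(frozen=True)
class CognitiveProfile:
    """Hypothetical per-dimension scores in [0, 1] for one model version."""
    model: str
    scores: dict[str, float]

    def weakest(self) -> str:
        # The lowest-scoring dimension shows where the model lags most.
        return min(DIMENSIONS, key=lambda d: self.scores[d])


def progress(old: CognitiveProfile, new: CognitiveProfile) -> dict[str, float]:
    """Per-dimension change between two profiles, rounded for display."""
    return {d: round(new.scores[d] - old.scores[d], 3) for d in DIMENSIONS}


# Made-up placeholder numbers, not measurements of any real system.
v1 = CognitiveProfile("model-v1", {"generalization": 0.40, "abstraction": 0.55,
                                   "reasoning": 0.60, "analogy": 0.35})
v2 = CognitiveProfile("model-v2", {"generalization": 0.45, "abstraction": 0.55,
                                   "reasoning": 0.75, "analogy": 0.40})

delta = progress(v1, v2)
print(delta)         # change along each dimension
print(v2.weakest())  # dimension with the lowest score in the newer profile
```

Reporting a vector of changes rather than a single number shows uneven progress directly: in this toy data, one dimension improves noticeably while another does not move at all.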

How the Kaggle Hackathon Will Help Develop the Methodology

A Hackathon to Test the Theory in Practice

Along with the publication of the framework, DeepMind has launched a hackathon on the Kaggle platform. It's a competition for developers and researchers, where participants are asked to create specific evaluation tasks – benchmarks that align with the logic of the proposed conceptual system.

This is an interesting move. Instead of coming up with all the necessary tests on its own, DeepMind is effectively opening up the task to the wider community. A hackathon is a way to quickly gather a large number of ideas, select the best ones, and turn them into functional evaluation tools. In essence, the company is saying, “Here's the concept – help us fill it with concrete measurements.”

Kaggle is a popular competition platform among machine learning specialists. Its registered community numbers in the millions of developers and researchers worldwide, so the initiative's reach is considerable.

Why a New AI Evaluation System Matters for the Whole Industry

Why This Matters to Everyone, Not Just DeepMind

At first glance, this might seem like an internal project of a major tech company. But in reality, the issue of AI evaluation standards affects everyone who works with or depends on these systems.

Without common criteria for progress, it's hard to compare different systems, hard to distinguish real achievements from marketing hype, and extremely difficult to explain to the public what is actually happening. Right now, each lab largely evaluates itself using the benchmarks where its own models perform best. This is not an ideal situation.

If DeepMind succeeds in proposing a sufficiently convincing framework – and getting the broader community involved in its development – it could be a step toward fairer and more comparable evaluations across the entire industry.

Prospects and Challenges of the New AGI Concept

What Still Remains an Open Question

Of course, such initiatives are rarely accepted unanimously. The very concept of AGI remains contested: researchers mean different things by it, and there is still no single definition. This means that any framework for its “measurement” will rest on specific assumptions, which can themselves be challenged.

Furthermore, there is a risk that the new tests will ultimately prove to be just as narrow as the previous ones – just more elegantly packaged. The history of AI benchmarks is full of examples where models quickly “saturated” a test without demonstrating any real generalized intelligence.
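As a rough illustration of what “saturation” means in practice, the sketch below flags a benchmark once the best few scores all sit close to the maximum, at which point it no longer separates strong systems from each other. The function name, the 0.05 margin, the `top_k` default, and the sample scores are all assumptions chosen for the example, not an established criterion.

```python
def is_saturated(scores: list[float], ceiling: float = 1.0,
                 margin: float = 0.05, top_k: int = 3) -> bool:
    """Return True if the top_k best scores all lie within `margin` of the ceiling.

    `margin` and `top_k` are illustrative defaults, not a standard definition.
    """
    if len(scores) < top_k:
        # Too few results to call the benchmark saturated.
        return False
    best = sorted(scores, reverse=True)[:top_k]
    return all(ceiling - s <= margin for s in best)


# Hypothetical leaderboard scores (fractions of tasks solved).
crowded_top = [0.97, 0.99, 0.96, 0.70]
spread_out = [0.99, 0.80, 0.60]

print(is_saturated(crowded_top))  # three best scores are all near the ceiling
print(is_saturated(spread_out))   # only one score is near the ceiling
```

A check like this says nothing about whether the models have general intelligence; it only signals that the test has stopped being informative.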

But the very fact that one of the world's leading AI labs has decided to approach the issue systematically and openly is already significant. We'll see what comes out of the hackathon and how other industry players react to the proposed coordinate system.

Original Title: Measuring progress toward AGI: A cognitive framework
Publication Date: Mar 17, 2026
Source: Google DeepMind (deepmind.google), an international research lab of Google focused on fundamental and applied AI development.

From Source to Analysis

How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item itself was selected as an event important for understanding AI development. Then a processing framework was set: what needs clarification, what context to add, and where to place emphasis. This allowed us to turn a single announcement or update into a coherent and meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role — analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Claude Sonnet 4.6 (Anthropic): Analyzing the Original Publication and Writing the Text. The neural network studies the original material and generates a coherent text.
2. Gemini 2.5 Pro (Google DeepMind): Translation into English.
3. Gemini 2.5 Flash (Google DeepMind): Text Review and Editing. Correction of errors, inaccuracies, and ambiguous phrasing.
4. DeepSeek-V3.2 (DeepSeek): Preparing the Illustration Description. Generating a textual prompt for the visual model.
5. FLUX.2 Pro (Black Forest Labs): Creating the Illustration. Generating an image based on the prepared prompt.
