Published February 12, 2026

MolmoSpaces – An Open Platform for Teaching Robots to Interact With the Real World

The Allen Institute for AI has unveiled MolmoSpaces – a toolkit for developing AI capable of controlling robots and functioning in physical space.

Event Source: Ai2

The Allen Institute for AI has released MolmoSpaces – an open platform for what the industry calls embodied AI. Simply put, this is AI that doesn't just generate text or images but can control physical devices: robots, manipulators, and drones.

What Is Embodied AI and How It Differs from Standard AI Models

Ordinary language models work in the digital space. They process text, answer questions, and write code. But if you need a model to control a robot – say, pick up a cup from a table or cross a room – a completely different skillset is required.

A robot must understand images from a camera, estimate distance to objects, plan movements, and adjust them in real time. This isn't just a question of increasing the number of model parameters, but of architecture, data, and training methods.

Previously, highly specialized systems were used for such tasks: one model recognized objects, another planned the trajectory, and a third handled low-level motor control. MolmoSpaces offers a different approach: a single multimodal model that can both see and act.
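The difference between the two approaches can be sketched in code. This is a toy illustration, not the MolmoSpaces API: every class and function name below is hypothetical, and the "models" are stubs that only show how the interfaces differ.

```python
# Toy contrast: a pipeline of specialized models vs. one multimodal policy.
# All names are hypothetical; the "models" are stubs.
from dataclasses import dataclass


@dataclass
class Observation:
    image: list        # stand-in for camera pixels
    instruction: str   # e.g. "pick up the cup"


# --- Classic approach: separate models, hand-wired together ---
def detect_objects(image):
    """Model 1: perception (stub)."""
    return ["cup", "table"]


def plan_trajectory(objects, instruction):
    """Model 2: planning (stub) -- picks the object the instruction mentions."""
    target = next(o for o in objects if o in instruction)
    return [f"move_to:{target}", f"grasp:{target}"]


def execute(trajectory):
    """Model 3: motor control (stub) -- returns the final command."""
    return trajectory[-1]


def pipeline_step(obs: Observation):
    objects = detect_objects(obs.image)
    trajectory = plan_trajectory(objects, obs.instruction)
    return execute(trajectory)


# --- Embodied-AI approach: one model maps observation + instruction to action ---
class MultimodalPolicy:
    def act(self, obs: Observation) -> str:
        # A real network would consume pixels and text jointly;
        # this stub only demonstrates the single-model interface.
        return "grasp:cup" if "cup" in obs.instruction else "idle"


obs = Observation(image=[], instruction="pick up the cup")
print(pipeline_step(obs))            # grasp:cup
print(MultimodalPolicy().act(obs))   # grasp:cup
```

The point of the sketch is the interface, not the stubs: in the unified approach there is no hand-wired boundary between perception, planning, and control, so errors are not compounded across three separately trained models.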

Key Features and Components of the MolmoSpaces Platform

The platform includes several components. First, there is the Molmo model itself: it already knows how to work with images and text, and now it has been adapted for controlling robots.

Second is the training dataset. For a model to learn to act in the physical world, it needs examples: video from robot cameras, recordings of movement trajectories, and action annotations. The Allen Institute has collected such data and made it publicly available.
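The three ingredients named above (camera video, movement trajectories, action annotations) suggest a per-timestep record structure. The sketch below is a guess at what one training episode might contain; it is not the actual MolmoSpaces dataset schema.

```python
# Hypothetical shape of one training episode in an embodied-AI dataset.
# This is an illustration, NOT the MolmoSpaces schema.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    frame: bytes                   # one camera frame (e.g. encoded JPEG)
    joint_positions: List[float]   # robot state at this timestep
    action: List[float]            # command the robot executed next


@dataclass
class Episode:
    instruction: str               # natural-language annotation
    steps: List[Step] = field(default_factory=list)


ep = Episode(instruction="open the drawer")
ep.steps.append(Step(frame=b"", joint_positions=[0.0] * 7, action=[0.1] * 7))
print(ep.instruction, len(ep.steps))  # open the drawer 1
```

Whatever the real format is, the pairing matters: each observation must be aligned with the action the robot actually took, so the model can learn the mapping from what it sees to what it should do.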

Third is the testing infrastructure. Developers can check their models in simulators and then transfer them to real robots. This lowers the barrier to entry: you don't need to buy expensive equipment right away to start experiments.

The Role of Open Source Development in Embodied Artificial Intelligence

Embodied AI is an expensive field. It requires robots, sensors, and significant computing power. Most research is conducted by large companies, and their results are rarely published in full. This slows down progress: every team is forced to solve the same basic problems from scratch.

MolmoSpaces is betting on a different development model. All components – the model, data, and code – are available for use and modification. This allows researchers and startups to experiment without starting from square one.

For the industry, this could mean accelerated progress. If more teams can work on embodied AI, more solutions will appear for warehouse logistics, home automation, and medical robotics. So far, these areas are developing slowly precisely because of high barriers to entry.

Future Outlook for MolmoSpaces and Robot Learning Systems

The project has just launched, and it is too early to judge how effectively the model handles complex tasks. Embodied AI is not just a modeling problem but an engineering one: even a high-quality model can fail if the robot is poorly calibrated or if the training data was collected under conditions that differ from reality.

But the very fact that an open platform has appeared is already an important step. Previously, most tools for embodied AI were either closed or too narrowly specialized. MolmoSpaces aims to create an ecosystem where different teams can work on a common task.

If this approach pays off, in a few years we will see robots that truly understand the surrounding world not through rigid algorithms, but thanks to learning from examples. For now, this is more of a research base than a ready-made solution, but that is exactly how serious technological changes usually begin.

Original Title: MolmoSpaces, an open ecosystem for embodied AI
Ai2 (allenai.org): a U.S.-based research institute developing language models and AI systems for science and education.

From Source to Analysis

How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item itself was selected as an event important for understanding AI development. Then a processing framework was set: what needs clarification, what context to add, and where to place emphasis. This allowed us to turn a single announcement or update into a coherent and meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role — analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Claude Sonnet 4.5 (Anthropic): Analyzing the Original Publication and Writing the Text. The neural network studies the original material and generates a coherent text.

2. Gemini 3 Pro (Google DeepMind): Translation into English.

3. Gemini 3 Flash Preview (Google DeepMind): Text Review and Editing. Correction of errors, inaccuracies, and ambiguous phrasing.

4. DeepSeek-V3.2 (DeepSeek): Preparing the Illustration Description. Generating a textual prompt for the visual model.

5. FLUX.2 Pro (Black Forest Labs): Creating the Illustration. Generating an image based on the prepared prompt.
