Published on March 12, 2026

MolmoBot: The Robot That Never Existed in the Real World – Yet Already Knows How to Work in It

Researchers from Ai2 have taught a robot to manipulate objects in the real world without showing it a single real-life scene during the training process.

Research. Event Source: Ai2. 4–6 min read

Training robots is an expensive business. The cost lies less in buying the hardware than in what happens before the robot learns to do anything. People have to operate it manually, demonstrating the desired behavior over and over again. That takes hundreds of recording hours, dozens of sites, and a coordinated infrastructure. Open X-Embodiment – one of the largest open datasets of its kind – was compiled by 21 organizations and contains over a million real-world trajectories. DROID – another well-known dataset – consists of 350 hours of teleoperation collected across 13 institutions. Data collection on this scale is a monumental task, and it remains the primary bottleneck for most labs.

That is why the idea of training a robot entirely in simulation – without a single real-world demonstration – looks both enticing and risky. Enticing, because simulation is cheap, scalable, and reproducible. Risky, because the real world differs from the virtual one, and this "sim-to-real gap" has traditionally been seen as one of the main hurdles.

Virtual Experience – Real Results

Ai2 (the Allen Institute for AI) set out to test whether this gap could be bridged not through more realistic simulation, but through sheer diversity. The idea: if the model sees enough varied virtual scenes – different objects, lighting, camera angles, textures, and physical conditions – it will learn to generalize and carry that experience over into reality.
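
To make the diversity idea concrete, here is a minimal sketch of per-scene randomization (a technique often called domain randomization). It is an illustration under stated assumptions, not Ai2's pipeline: the `SceneConfig` fields, the asset lists, the value ranges, and the `sample_scene` and `sample_dataset` helpers are all hypothetical and are not taken from MolmoBot or MolmoSpaces.

```python
import random
from dataclasses import dataclass

# Hypothetical asset pools; a real pipeline would draw from far larger libraries.
OBJECTS = ["mug", "bowl", "block", "bottle"]
TEXTURES = ["wood", "marble", "metal", "fabric"]


@dataclass(frozen=True)
class SceneConfig:
    """One randomized virtual scene (all fields are illustrative)."""
    target_object: str
    table_texture: str
    light_intensity: float   # arbitrary units
    camera_yaw_deg: float    # camera angle around the table
    friction: float          # surface friction coefficient


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw each scene property independently so scenes vary along every axis."""
    return SceneConfig(
        target_object=rng.choice(OBJECTS),
        table_texture=rng.choice(TEXTURES),
        light_intensity=rng.uniform(0.2, 2.0),
        camera_yaw_deg=rng.uniform(-45.0, 45.0),
        friction=rng.uniform(0.3, 1.2),
    )


def sample_dataset(n_scenes: int, seed: int = 0) -> list[SceneConfig]:
    """Generate a reproducible batch of varied scenes from a fixed seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n_scenes)]


if __name__ == "__main__":
    for scene in sample_dataset(3):
        print(scene)
```

Fixing the seed mirrors the reproducibility argument for simulation: the same seed yields the same scenes, while raising `n_scenes` widens the range of conditions the model is exposed to.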

On March 11, 2026, Ai2 introduced MolmoBot – a suite of models for controlling robotic manipulators trained exclusively on synthetic data. No real-world teleoperation. No fine-tuning on real scenes. Just simulation – and then straight to a real robot.

The results proved unexpectedly compelling. On tasks like "pick up an object and place it in the right spot", the top model in the suite outperformed π0.5 – a system from Physical Intelligence trained on vast amounts of real-world data. Notably, MolmoBot had never seen these objects or scenes before – not in simulation, and certainly not in reality.

What MolmoBot Can Do 🤖

The suite covers several types of tasks:

  • picking up objects and moving them across a table;
  • interacting with articulated parts: drawers, cabinets, microwaves;
  • opening doors – including the approach, grabbing the handle, and moving through the full range of motion.

The robot can be directed with words, for example "pick up", "put down", or "close", or by pointing to a spot. All of this works across two different platforms: the Franka FR3 stationary manipulator and the Rainbow Robotics RB-Y1 mobile robot.

Simply put, this isn't a niche system built for one task and one robot. It is an attempt to create something more universal and keep it open-source.

Why This Matters More Than It Seems

Most current systems that use simulation treat it as a supplement to real-world data: simulation helps, but real-world demonstrations remain the core. MolmoBot drops the real-world demonstrations entirely.

For the industry, this shifts the very nature of the "bottleneck." Currently, the main constraint is data collection: you need people, robots, space, and time. If simulation works as the sole source of training data, the critical factor is no longer collection, but the design of virtual environments. And that is a task that can be scaled using computation and open tools – without an army of operators.

This is especially vital for academic labs. Many simply cannot afford the teleoperation infrastructure or a partnership on the scale of Open X-Embodiment. MolmoBot, along with the open MolmoSpaces ecosystem – a set of tools for generating synthetic data – potentially makes manipulation robotics more accessible.

A Fair Assessment

It is important to understand that MolmoBot does not claim to be the ultimate solution to the "robot problem." It is a hypothesis test: can simulation-only training work effectively for manipulation? The answer – at least for the tasks tested – seems to be yes.

However, many open questions remain. How will the system behave in more complex, chaotic environments? How will it handle tasks requiring fine tactile feedback, which simulations replicate inaccurately? Where exactly does it break, and what is needed to fix it?

The authors themselves state they want to see where the model fails. This is exactly why they have released not just the models, but the entire tech stack: data, generation pipelines, training code, and the technical report. This is unusual for robotics, where most heavy-duty systems remain behind closed doors.

In short: MolmoBot is an argument that synthetic data can become the foundation, rather than just a supplement, in robot training. For now, it is just one experiment, albeit a convincing one. But the direction it sets looks like one of the most realistic paths toward making robots accessible to more than just giant corporations.

Original Title: MolmoBot: Training robot manipulation entirely in simulation
Publication Date: Mar 11, 2026
Source: Ai2 (allenai.org), a U.S.-based research institute developing language models and AI systems for science and education.

From Source to Analysis

How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item itself was selected as an event important for understanding AI development. Then a processing framework was set: what needs clarification, what context to add, and where to place emphasis. This allowed us to turn a single announcement or update into a coherent and meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role — analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Claude Sonnet 4.6 (Anthropic): Analyzing the Original Publication and Writing the Text. The neural network studies the original material and generates a coherent text.
2. Gemini 3 Pro (Google DeepMind): Translation into English.
3. Gemini 3 Flash Preview (Google DeepMind): Text Review and Editing. Correction of errors, inaccuracies, and ambiguous phrasing.
4. DeepSeek-V3.2 (DeepSeek): Preparing the Illustration Description. Generating a textual prompt for the visual model.
5. FLUX.2 Pro (Black Forest Labs): Creating the Illustration. Generating an image based on the prepared prompt.
