Published on March 21, 2026

How Robots Learn Precise Movements: Online Reinforcement Learning from Physical Intelligence

Physical Intelligence has introduced an approach for teaching robots precise manipulations using online reinforcement learning directly during interaction with the environment.

Research. Event source: Physical Intelligence. 3–5 min read

Teaching a robot to pick up an object from a table seems simple, but in practice it is quite complex. This is especially true when the task involves not just grasping but precise manipulation: inserting a connector, aligning parts, or gently applying pressure in the right spot. These are the kinds of tasks that Physical Intelligence (pi) is working on, and the company recently published the results of a new approach called RLT.

What's Wrong with the Traditional Approach

Most robots today are trained on the "show and repeat" principle: an operator demonstrates the desired movement, and the robot memorizes and reproduces it. This works fairly well for simple, predictable actions. Precise manipulations are another matter. They depend not only on the trajectory of the movement but also on subtle feedback: how exactly a part fits into a slot, with what force, and at what angle.

Simply put, imitation alone is not enough. The robot needs to try, make mistakes, and learn from them – in real time, in a real environment.

The Idea: Learning Directly in the Process

This is exactly what RLT offers – an approach based on online reinforcement learning. In short, the robot doesn't just reproduce learned movements but receives a score for each action and gradually improves its behavior based on what worked.

"Online" here means that training doesn't happen in advance on a large dataset of recorded examples, but directly during interaction with a real object. The robot tries, gets a signal, adjusts, and tries again. It's similar to how a person learns to tie shoelaces: no description can replace the practice of handling a real loop.
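The try, get a signal, adjust, try again cycle can be sketched in a few lines. This is a deliberately simple stand-in (random-search hill climbing on a reward), not the algorithm from the RLT publication. The one-dimensional task, the `reward` function, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def reward(action: float, target: float) -> float:
    """Score one attempt: the closer to the (hidden) target, the better."""
    return -abs(action - target)


def online_trial_and_error(target: float, start: float, trials: int = 200,
                           noise: float = 0.5, seed: int = 0):
    """Improve a single action parameter using reward feedback only.

    The learner never reads `target` directly; it only sees the score
    returned by `reward`, which plays the role of the environment.
    """
    rng = random.Random(seed)
    best_action = start
    best_reward = reward(best_action, target)
    history = [best_reward]
    for _ in range(trials):
        candidate = best_action + rng.gauss(0.0, noise)  # try a variation
        r = reward(candidate, target)                    # get a signal
        if r > best_reward:                              # keep what worked
            best_action, best_reward = candidate, r
        history.append(best_reward)
    return best_action, history


if __name__ == "__main__":
    action, history = online_trial_and_error(target=3.0, start=-2.0)
    print(f"first reward {history[0]:.3f}, last reward {history[-1]:.3f}")
```

Because a candidate is kept only when it scores higher, the recorded reward never decreases from one trial to the next. Real systems have to cope with noisy rewards and high-dimensional actions, which this toy ignores.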

The word "efficient" in the title of the publication is also important. Reinforcement learning is traditionally considered "expensive" in terms of time and resources: a robot needs a huge number of attempts to learn something. The pi team worked to make this process reasonably fast, so that it does not require thousands of hours of physical experiments.

What Tasks Was It Tested On

RLT was tested on tasks that specifically require precision: inserting plugs and connectors, assembling components with tight tolerances, and manipulating small parts. This was no random choice – such tasks are considered some of the most difficult for robots because even a slight deviation can lead to failure.
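To see why a slight deviation matters so much, consider a rough back-of-the-envelope model. It assumes the positioning error is zero-mean Gaussian and that an insertion succeeds whenever the error stays within the clearance. Both are modeling assumptions made here for illustration, and the clearance and error values are hypothetical, not figures from the publication.

```python
import math


def success_probability(clearance_mm: float, error_std_mm: float) -> float:
    """P(|error| < clearance) for a zero-mean Gaussian positioning error."""
    if clearance_mm <= 0 or error_std_mm <= 0:
        raise ValueError("clearance and error std must be positive")
    return math.erf(clearance_mm / (error_std_mm * math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical clearances in millimetres, with 1 mm of positioning noise.
    for clearance in (2.0, 0.5, 0.2):
        p = success_probability(clearance, error_std_mm=1.0)
        print(f"clearance {clearance} mm -> success probability {p:.3f}")
```

Under these assumptions, the same positioning noise that is almost harmless with a loose fit makes most attempts fail once the tolerance becomes tight.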

The results showed that the approach allows the robot to significantly improve its precision, especially in situations where pre-trained models began to fail. In other words, where "rote-learned" behavior stops working, online learning helps the robot adapt.

Why This Matters Beyond the Lab

Robots are increasingly entering manufacturing, logistics, and domestic environments. One of the main barriers to this expansion is precision when working with physical objects. Tightening a bolt, connecting a cable, or assembling a miniature component: all of this requires not just "roughly there" but "exactly so."

Approaches that allow a robot to learn precise actions quickly and directly in its work environment – without needing to be reprogrammed for every new task – could potentially change where and how robots are used.

Currently, most industrial robots are designed for one specific operation and struggle to adapt to changes. If online reinforcement learning can be made reliable and scalable, it could be a step toward robots capable of adapting to new conditions on the spot.

Open Questions

The research demonstrates a promising approach, but a number of questions remain unanswered for now. How well does RLT perform outside of the specific tasks it was trained on? How does the system behave if the conditions differ significantly from those in which it was trained? How quickly can the robot switch to a new type of task?

This is normal for a research publication: it shows that an idea works, outlines its capabilities, and leaves room for next steps. Physical Intelligence is clearly continuing to move toward universal robotic systems, and RLT is one piece of that larger puzzle.

Original Title: Precise Manipulation with Efficient Online RL
Publication Date: Mar 19, 2026
Physical Intelligence (www.pi.website): A U.S.-based research and technology company exploring physical intelligence and hybrid AI systems that combine computation with physical processes.

From Source to Analysis

How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item itself was selected as an event important for understanding AI development. Then a processing framework was set: what needs clarification, what context to add, and where to place emphasis. This allowed us to turn a single announcement or update into a coherent and meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role: analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Claude Sonnet 4.6 (Anthropic): Analyzing the Original Publication and Writing the Text. The neural network studies the original material and generates a coherent text.
2. Gemini 2.5 Pro (Google DeepMind): Translation into English.
3. Gemini 2.5 Flash (Google DeepMind): Text Review and Editing. Correction of errors, inaccuracies, and ambiguous phrasing.
4. DeepSeek-V3.2 (DeepSeek): Preparing the Illustration Description. Generating a textual prompt for the visual model.
5. FLUX.2 Pro (Black Forest Labs): Creating the Illustration. Generating an image based on the prepared prompt.
