Published February 13, 2026

MiniMax Forge: Open Platform for Training AI Agents on Clusters


Chinese company MiniMax has released Forge, an open platform designed for training agents using reinforcement learning on large-scale GPU clusters.

Category: Infrastructure · Source: MiniMax · Reading time: 4–5 minutes

Chinese company MiniMax, known for its developments in generative AI, has released Forge – an open platform for training intelligent agents. Simply put, it's a tool that helps teach models not just to generate text, but to perform tasks, including reasoning, planning actions, and interacting with their environment.

Forge is built around the idea of reinforcement learning – an approach where a model learns by trial and error, receiving feedback for its actions. This is the same principle used to train AlphaGo or ChatGPT in dialogue mode. The key difference here is the emphasis on making this process scalable: running it on hundreds or thousands of graphics processing units (GPUs) simultaneously.


Why Another Platform?

Training agents is not the same as training a language model in the traditional sense. An agent must not only understand text but also make decisions: which function to call, what request to send, or how to interpret the result. This requires a different approach to training.

Existing solutions are either tailored for small-scale experiments or require significant modifications to run on large clusters. According to its developers, Forge was specifically created to allow agents to be trained on thousands of GPUs without the need to rewrite code or reinvent the wheel for task distribution.

The platform supports popular reinforcement learning algorithms and allows for the integration of custom methods. Its code is open, giving researchers and developers the ability to adapt the system to their specific needs.


What's Inside: Algorithm and Architecture

Along with the platform, MiniMax released a training algorithm of the same name. It is based on a method similar to PPO (Proximal Policy Optimization) – one of the standard approaches in reinforcement learning – but with enhancements that, according to the team, make it more stable and efficient when working with language models.
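For readers unfamiliar with PPO, its clipped surrogate objective can be sketched in a few lines of plain Python. This is generic textbook PPO, not MiniMax's Forge variant:

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Generic PPO clipped surrogate loss (scalar, to be minimized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)                        # pi_new / pi_old
        clipped = max(min(ratio, 1 + clip_eps), 1 - clip_eps)
        total += min(ratio * adv, clipped * adv)         # pessimistic bound
    return -total / len(advantages)                      # maximize surrogate

print(ppo_clip_loss([0.0], [0.0], [1.0]))  # -1.0: unchanged policy, advantage 1
```

The clipping is what gives PPO its stability: when the new policy drifts too far from the one that collected the data, the ratio is capped and the gradient incentive to drift further disappears.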

The core idea is to divide the process into several stages: data collection (the model tries different actions), results evaluation (how well each action performed), and model weight updates. All of this happens in parallel across multiple devices, which can speed up the process tenfold.
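A toy version of this collect/evaluate/update cycle can be shown on a two-armed bandit with a softmax policy. This is a generic REINFORCE-style sketch to illustrate the three stages, not Forge's actual code:

```python
import math
import random

random.seed(0)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def env_reward(action):
    # Toy environment: action 1 pays more on average than action 0.
    return 1.0 if action == 1 else 0.2

logits = [0.0, 0.0]  # tabular "policy weights"
lr = 0.5

for step in range(200):
    probs = softmax(logits)
    # 1. Data collection: the model tries actions under the current policy.
    actions = random.choices([0, 1], weights=probs, k=16)
    # 2. Evaluation: score each action; the batch mean serves as a baseline.
    rewards = [env_reward(a) for a in actions]
    baseline = sum(rewards) / len(rewards)
    # 3. Update: accumulate the policy gradient over the batch, then apply it.
    grad = [0.0, 0.0]
    for a, r in zip(actions, rewards):
        adv = r - baseline
        for i in (0, 1):
            indicator = 1.0 if i == a else 0.0
            grad[i] += adv * (indicator - probs[i]) / len(actions)
    logits = [l + lr * g for l, g in zip(logits, grad)]

print(softmax(logits))  # the learned policy now prefers the better action
```

In Forge's setting, stage 1 runs across many devices at once, which is where the claimed speedup comes from; the loop's structure, however, stays the same.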

Forge supports various types of tasks, from simple text-based ones to complex scenarios where the agent interacts with external systems, databases, or APIs. Developers can define their own reward functions – in other words, describe what constitutes success and what is considered an error.


Open Source and Availability

Forge's code has been made publicly available. This means anyone can download the platform, run it on their own servers, and start experimenting. MiniMax has also provided documentation and usage examples, which lowers the barrier to entry.

Openness is a key point. In the field of agent training, there are no established standards yet, and many teams develop their own solutions from scratch. Forge could become a common foundation that allows them to save time and focus on the algorithms themselves, rather than on the infrastructure.

Moreover, the platform is not tied to specific MiniMax models. It can be used with any language models that support the required interaction format.


Who Is This For?

First and foremost, it's for research teams and companies developing agents for real-world tasks: process automation, document handling, and user interaction through complex scenarios.

Forge can also be useful for those studying reinforcement learning as it applies to language models. This is an active area of research, and having ready-made infrastructure simplifies conducting experiments.

The platform may also be helpful for teams looking to train models for specific tasks that require not just text generation, but executing a sequence of actions with result verification.


What's Next?

The release of Forge is another step towards agents becoming a practical tool rather than an experimental technology. For now, training such systems remains a complex and resource-intensive process, and not all teams can afford to allocate thousands of GPUs for experiments.

An open platform lowers this barrier. But questions remain: How well will Forge perform with different types of tasks? How will it handle tasks where feedback is not obvious or is delayed over time? And most importantly, will the community truly adopt it as a common foundation, or will each team continue to build its own solutions anyway?

Time and practical use will provide the answers to these questions. For now, developers have another tool that's worth trying.

Original Title: Forge: Scalable Agent RL Framework and Algorithm
Publication Date: Feb 12, 2026
MiniMax (www.minimax.io) is a Chinese AI company developing large language and multimodal models for dialogue and content generation.


How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item itself was selected as an event important for understanding AI development. Then a processing framework was set: what needs clarification, what context to add, and where to place emphasis. This allowed us to turn a single announcement or update into a coherent and meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role — analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Analyzing the Original Publication and Writing the Text · Claude Sonnet 4.5 (Anthropic): the neural network studies the original material and generates a coherent text.

2. Translation into English · Gemini 2.5 Pro (Google DeepMind).

3. Text Review and Editing · Gemini 2.5 Flash (Google DeepMind): correction of errors, inaccuracies, and ambiguous phrasing.

4. Preparing the Illustration Description · DeepSeek-V3.2 (DeepSeek): generating a textual prompt for the visual model.

5. Creating the Illustration · FLUX.2 Pro (Black Forest Labs): generating an image based on the prepared prompt.
