Published on March 24, 2026

AMD support for RL training on GPUs

AMD Opens Access to Powerful RL Training on Its GPUs: What This Means for Developers

AMD has adapted the Miles framework for large-scale reinforcement learning on Instinct GPUs – now it works without NVIDIA hardware.

Infrastructure / Technical context (4-5 min read)

Event Source: LMSYS ORG

Reinforcement Learning (RL) is one of the key methods that make modern language models smarter and more useful after their initial training. This 'fine-tuning' stage is what ensures a model doesn't just generate text, but does so intelligently: following instructions, avoiding incorrect answers, and solving problems step-by-step. Simply put, RL is what turns a 'knowledgeable' model into a 'useful' one.

Until recently, the infrastructure for this type of training was almost entirely tailored for NVIDIA GPUs. The Miles framework – one of the most advanced tools for large-scale RL training – was no exception. The LMSYS team, in collaboration with AMD, has changed this: Miles now officially supports AMD Instinct series GPUs running on the ROCm platform.

What is Miles framework and its importance for RL

What is Miles and Why is it Important

Miles is a system for post-training, that is, for refining already pretrained language models with reinforcement learning. This is the approach used to create 'reasoning' models, which work through a task step by step before giving an answer.

The main feature of Miles is its ability to work in a distributed mode: training can run simultaneously on multiple GPUs spread across several servers. This is critically important when working with large models that simply do not fit on a single accelerator.

Until now, this level of scaling was available primarily on NVIDIA hardware. AMD's support changes this situation.

Miles performance on AMD Instinct GPUs

Technically – Almost No Losses

Adapting to ROCm required significant engineering work. AMD's platform is structured differently from NVIDIA's CUDA, and not all code can be ported automatically. The team had to resolve compatibility issues in low-level operations, debug communication between GPUs on different nodes, and make sure performance did not regress.
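As a small illustration of what backend-aware code can look like, here is a minimal sketch that assumes a PyTorch-based stack (an assumption; the article does not describe Miles's internals). PyTorch's ROCm builds expose the HIP version as `torch.version.hip` and reuse the `torch.cuda` API, so a check like this is one common way to tell the two platforms apart. The helper name `detect_gpu_backend` is hypothetical.

```python
def detect_gpu_backend() -> str:
    """Report which GPU toolchain the installed PyTorch build targets.

    Hypothetical helper for illustration only; not part of Miles or ROCm.
    Returns "rocm", "cuda", "cpu" (PyTorch without GPU support),
    or "none" (PyTorch is not installed).
    """
    try:
        import torch
    except ImportError:
        return "none"

    # ROCm builds set torch.version.hip to a version string; CUDA builds leave it None.
    if getattr(torch.version, "hip", None):
        return "rocm"
    # CUDA builds set torch.version.cuda to a version string.
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu"


if __name__ == "__main__":
    print(f"PyTorch build targets: {detect_gpu_backend()}")
```

A check like this only tells you which build is installed. The harder porting work described above sits in kernels and multi-node communication, which no one-line check can cover.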

The result is encouraging: Miles on AMD Instinct demonstrates performance comparable to that on NVIDIA hardware for large-scale RL training. This isn't a case of 'it works, but it's slower'; it is full-fledged support.

To understand the scale: tests were conducted on models like DeepSeek-R1 – one of the most resource-intensive open models available today. These are the very models that actively use RL in training and require the coordinated work of dozens of GPUs simultaneously.
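A rough back-of-the-envelope sketch shows why a model of this size cannot live on one accelerator. The figures are assumptions that do not come from the article: about 671 billion parameters for DeepSeek-R1, 2 bytes per parameter (BF16), and 192 GB of memory per GPU (the published capacity of an MI300X-class accelerator). It counts weights only.

```python
import math


def min_gpus_for_weights(num_params: float, bytes_per_param: int, gpu_mem_gb: float) -> int:
    """Smallest number of GPUs whose combined memory holds the weights alone."""
    weight_bytes = num_params * bytes_per_param
    gpu_bytes = gpu_mem_gb * 1e9  # decimal gigabytes
    return math.ceil(weight_bytes / gpu_bytes)


# Assumed figures (not from the article); adjust for your model and hardware.
PARAMS = 671e9        # approximate total parameter count of DeepSeek-R1
BYTES_PER_PARAM = 2   # BF16 storage
GPU_MEM_GB = 192      # memory per accelerator

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
gpus = min_gpus_for_weights(PARAMS, BYTES_PER_PARAM, GPU_MEM_GB)
print(f"weights: {weights_gb:.0f} GB, at least {gpus} GPUs for weights alone")
```

Under these assumptions the weights alone take about 1,342 GB, or at least 7 such GPUs. RL training also has to hold gradients, optimizer state, activations, and the inference engine used to generate rollouts, so the real GPU count is considerably higher.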

Impact of AMD's RL training support

Why AMD Needs This – and Why Everyone Else Does Too

AMD has been investing consistently in its ecosystem for AI computing. ROCm 7.1 brought official support for the MI350X and MI355X, and ROCm 7.2.0 significantly improved inference performance for large models. In parallel, AMD open-sourced the ROCprof Trace Decoder, a previously closed-source tool for in-depth analysis of GPU performance.

Support for Miles follows the same logic. Previously, a developer who wanted to train a model with RL at scale was in practice tied to NVIDIA hardware; now they have a real alternative.

This is important not just for large companies. Research groups, universities, and small teams often use the hardware that is available, not necessarily the hardware they want. Expanding compatibility means that the barrier to entry for serious RL training is lowered.

Open source strategy for AMD AI ecosystem

Openness as a Strategy

It's also significant that all of this is happening within an open ecosystem. ROCm is an open platform, Miles is being developed by the LMSYS team as a research project, and AMD itself is actively publishing test results and sharing code. For example, the ATOM engine, optimized for inference on the MI355X, was made publicly available on GitHub.

This approach – open source code, open benchmarks, open tools – is gradually changing the perception of AMD in the community. For a long time, NVIDIA was seen as the only viable choice for serious AI tasks, largely due to the maturity of its ecosystem. Now, that gap is closing.

Practical changes with AMD RL support

What This Changes in Practice

In short: language model developers now have another genuinely working option for large-scale reinforcement learning – and this option is not dependent on NVIDIA.

This doesn't mean that everyone will immediately switch to AMD. NVIDIA's ecosystem is still deeper, with more tools and significantly more community experience. But the existence of a working alternative is valuable in itself: it creates competition, stimulates development, and gives freedom of choice to those who need it.

Miles on ROCm is not an announcement of a future possibility; it is a working tool available today. And that is, perhaps, the most important thing.

#event #analysis #neural networks #machine learning #engineering #infrastructure #open technologies #open language models #model training optimization

Link to Original: https://lmsys.org/blog/2026-03-17-rocm-miles-rl-amd

Original Title: ROCm Support for Miles: Large-Scale RL Post-Training on AMD Instinct™ GPUs

Publication Date: Mar 17, 2026

LMSYS ORG (lmsys.org): a U.S.-based non-profit research organization studying scalable language models and distributed training systems.
