Published on April 3, 2026

Gemma 4 on AMD: Day-and-Date Support on Release

Google has released the Gemma 4 family of open models, and AMD has provided immediate support on release day across its entire hardware spectrum, from data centers to laptops.

Source: AMD

When Google releases a new family of open models, the question "but what can we run this on?" arises almost immediately. With Gemma 4, AMD aimed to address this in advance: support for the entire new model lineup was available on release day. This covers not just server hardware but also consumer GPUs and laptop processors.

What is Gemma 4 and Why is it Interesting?

Gemma 4 is a family of four open models from Google, varying in size and architecture. The most compact model has approximately 2 billion active parameters, while the largest has 31 billion. Some models use a classic "dense" architecture, while others use a "Mixture of Experts" approach: in simple terms, the model activates only the portion of its "knowledge" relevant to the task at hand, which saves computational resources.
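The Mixture-of-Experts idea can be illustrated with a toy sketch. This is purely illustrative: the routing in a real model is learned, and the expert functions and gate scores here are invented.

```python
# Toy illustration of Mixture-of-Experts (top-1) routing:
# only the top-scoring expert runs, so most parameters stay idle per token.

def expert_a(x):          # hypothetical "math" expert
    return x * 2

def expert_b(x):          # hypothetical "text" expert
    return x + 100

EXPERTS = [expert_a, expert_b]

def route(x, gate_scores):
    """Run only the expert with the highest gate score."""
    best = max(range(len(EXPERTS)), key=lambda i: gate_scores[i])
    return EXPERTS[best](x)

# Gate scores would come from a learned router; here they are hard-coded.
print(route(10, gate_scores=[0.9, 0.1]))   # expert_a runs: 20
print(route(10, gate_scores=[0.2, 0.8]))   # expert_b runs: 110
```

Because only one expert's computation runs per input, the cost per token scales with the active parameters, not the total parameter count.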

The models are multimodal: they work with text, images, and some variants also handle audio. The context window reaches 256,000 tokens – an impressive amount, roughly equivalent to several thick novels. Claimed strengths include understanding 140 languages, handling code, recognizing text and objects in images, and voice input.

Compared to the previous generation, Gemma 3, the architecture has been redesigned to improve efficiency and quality when handling long contexts. The modules for image and audio processing have also been updated. Collectively, this makes Gemma 4 an interesting option for so-called "agentic scenarios," where the model doesn't just answer questions but independently executes chains of actions.

From Data Center to Laptop – Everything is Covered

AMD has announced support for Gemma 4 across three tiers of its product line:

  • Instinct GPUs – server accelerators for data centers and corporate infrastructure;
  • Radeon GPUs – graphics cards for workstations and home PCs;
  • Ryzen AI – processors for AI laptops, including those with a dedicated Neural Processing Unit (NPU).

Support is implemented through several popular tools: LM Studio for easy local execution, as well as a number of open-source projects aimed at developers.

Running in the Cloud and on Servers

For server-side scenarios, Gemma 4 can be deployed using two main frameworks: vLLM and SGLang. Both are optimized for high performance when serving many concurrent requests, which is crucial for production environments.

vLLM supports several generations of Instinct and Radeon GPUs. SGLang is tailored for top-tier server accelerators from the MI300X, MI325X, and MI35X series. Notably, the entire Gemma 4 lineup – including the MoE architecture models – fits on a single MI300X accelerator with its 192 GB of memory, even with the full context window. For higher-load scenarios, multiple accelerators can be used in parallel.
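A rough back-of-envelope check makes the single-accelerator claim plausible. The model size and HBM capacity come from the article; the bytes-per-parameter figure assumes 16-bit weights and ignores activation and framework overhead.

```python
# Does a 31B-parameter model fit in 192 GB of HBM at 16-bit precision?
params = 31e9            # largest Gemma 4 model (from the article)
bytes_per_param = 2      # bf16/fp16: 2 bytes per weight (assumption)
weights_gb = params * bytes_per_param / 1e9

hbm_gb = 192             # MI300X memory capacity (from the article)
print(f"weights ≈ {weights_gb:.0f} GB, "
      f"leaving ≈ {hbm_gb - weights_gb:.0f} GB for the KV cache "
      f"of a long context")
```

Roughly 62 GB of weights leaves well over 100 GB for the KV cache, which is why even the full 256K-token context fits on one card.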

Running on Personal Hardware – Easier Than You Think

For those who want to run Gemma 4 locally – on their personal computer or laptop – AMD offers two paths.

The first is through LM Studio. This is an application with a graphical user interface that allows you to download and run the model in just a few clicks. It works with Ryzen AI and Ryzen AI Max processors, as well as Radeon and Radeon PRO cards. For full acceleration, up-to-date AMD Software: Adrenalin Edition drivers are required.

The second path is through Lemonade Server. This is a more flexible option for those who want to interact with the model via an API compatible with the OpenAI format. Lemonade supports acceleration on both the GPU via ROCm and the NPU in Ryzen AI processors.
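Interacting with such a server looks like any OpenAI-format API call. The sketch below only builds the request payload without sending it; the model identifier and the endpoint URL in the comment are assumptions, not documented Lemonade values.

```python
import json

# OpenAI-compatible chat-completion payload; the model name is hypothetical.
def build_chat_request(prompt, model="gemma-4"):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = build_chat_request("Summarize Gemma 4 in one sentence.")
# With a server running locally, you would POST this JSON to an endpoint
# like http://localhost:8000/v1/chat/completions (URL is an assumption).
print(json.dumps(payload, indent=2))
```

Because the format matches OpenAI's, existing client libraries and tooling can point at the local server with only a base-URL change.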

The NPU: A Story in Itself

The Neural Processing Unit (NPU) in Ryzen AI processors is a specialized chip within the processor, designed specifically for neural network tasks. It consumes significantly less power than a GPU, which is critical for a laptop's battery life.

Support for Gemma 4 on the NPU will arrive with the next Ryzen AI Software update. Initially, two compact models will be available: Gemma 4 E2B and E4B. For developers, this support will be exposed through interfaces like OnnxRuntime, simplifying integration into their own applications.
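The prefer-the-NPU-then-fall-back pattern behind ONNX Runtime execution providers can be sketched in plain Python. The provider names mirror ONNX Runtime's conventions (`VitisAIExecutionProvider` is the Ryzen AI NPU provider), but availability checking here is simulated rather than queried from the real library.

```python
# Simulated execution-provider selection, in the spirit of ONNX Runtime:
# prefer the NPU, fall back to the GPU, then the CPU.

PREFERENCE = [
    "VitisAIExecutionProvider",   # Ryzen AI NPU
    "ROCMExecutionProvider",      # AMD GPU via ROCm
    "CPUExecutionProvider",       # always available as a last resort
]

def pick_provider(available):
    """Return the first preferred provider that the machine reports."""
    for provider in PREFERENCE:
        if provider in available:
            return provider
    raise RuntimeError("no execution provider available")

# On a machine without an NPU driver, selection falls back to the GPU:
print(pick_provider({"ROCMExecutionProvider", "CPUExecutionProvider"}))
```

In real code, the available set would come from `onnxruntime.get_available_providers()`; the fallback order is what lets the same application run on laptops with and without an NPU.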

Why This Matters for Users

"Day-one" support is not just a marketing gimmick. Previously, users and developers often had to wait weeks or even months for a new model to appear in a user-friendly tool or to work on specific hardware. In this case, AMD synced up with Google's release in advance.

For the average user, this means they can try out the new model immediately – via LM Studio, without waiting for patches or updates. For developers, it means they can start building their own projects on Gemma 4 right away, without worrying about the supporting infrastructure lagging behind.

The open weights of Gemma 4, combined with broad hardware support, make it a viable option for those who want to run powerful language models locally – without cloud dependency and without needing a server rack on hand.

Original Title: Day 0 Support for Gemma 4 on AMD Processors and GPUs


How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item itself was selected as an event important for understanding AI development. Then a processing framework was set: what needs clarification, what context to add, and where to place emphasis. This allowed us to turn a single announcement or update into a coherent and meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role — analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Claude Sonnet 4.6 (Anthropic) – Analyzing the Original Publication and Writing the Text: the neural network studies the original material and generates a coherent text.
2. Gemini 2.5 Pro (Google DeepMind) – Translation into English.
3. Gemini 2.5 Flash (Google DeepMind) – Text Review and Editing: correction of errors, inaccuracies, and ambiguous phrasing.
4. DeepSeek-V3.2 (DeepSeek) – Preparing the Illustration Description: generating a textual prompt for the visual model.
5. FLUX.2 Pro (Black Forest Labs) – Creating the Illustration: generating an image based on the prepared prompt.
