When Google releases a new family of open models, the question "but what can we run this on?" arises almost immediately. With Gemma 4, AMD aimed to address this in advance: support for the entire new model lineup was available on release day. This doesn't just cover server hardware but also consumer GPUs and laptop processors.
Gemma 4 is a family of four open models from Google, varying in size and architecture. The most compact model has approximately 2 billion active parameters, while the largest has 31 billion. Some models are built using a classic "dense" architecture, while others use a "Mixture of Experts" approach. In simple terms, the model activates only the necessary portion of its "knowledge" depending on the task, which helps save computational resources.
The models are multimodal: they work with text, images, and some variants also handle audio. The context window reaches 256,000 tokens – an impressive amount, roughly equivalent to several thick novels. Claimed strengths include understanding 140 languages, handling code, recognizing text and objects in images, and accepting voice input.
Compared to the previous generation, Gemma 3, the architecture has been redesigned to improve efficiency and quality when handling long contexts. The modules for image and audio processing have also been updated. Collectively, this makes Gemma 4 an interesting option for so-called "agentic scenarios", where the model doesn't just answer questions but independently executes chains of actions.
From Data Center to Laptop – Everything is Covered
AMD has announced support for Gemma 4 across three tiers of its product line:
- Instinct GPUs – server accelerators for data centers and corporate infrastructure;
- Radeon GPUs – graphics cards for workstations and home PCs;
- Ryzen AI – processors for AI laptops, including those with a dedicated Neural Processing Unit (NPU).
Support is implemented through several popular tools: LM Studio for easy local execution, as well as a number of open-source projects aimed at developers.
Running in the Cloud and on Servers
For server-side scenarios, Gemma 4 can be deployed using two main frameworks: vLLM and SGLang. Both are optimized for high performance when serving many concurrent requests, which is crucial for production environments.
vLLM supports several generations of Instinct and Radeon GPUs. SGLang is tailored for top-tier server accelerators from the MI300X, MI325X, and MI35X series. Notably, the entire Gemma 4 lineup – including the MoE architecture models – fits on a single MI300X accelerator with its 192 GB of memory, even with the full context window. For higher-load scenarios, multiple accelerators can be used in parallel.
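The claim that the 31B model fits on a single 192 GB accelerator can be sanity-checked with back-of-envelope arithmetic. The sketch below is not an official AMD figure: it assumes bf16 weights (2 bytes per parameter) and ignores quantization, and the helper name is illustrative.

```python
# Rough memory estimate: do 31B parameters in bf16 fit in 192 GB?
def weight_memory_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Approximate weight footprint in GB (1 GB taken as 10**9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

largest = weight_memory_gb(31)  # bf16 weights for the largest model
print(f"~{largest:.0f} GB of weights, leaving ~{192 - largest:.0f} GB "
      "for the KV cache and activations")
# ~62 GB of weights, leaving ~130 GB for the rest of the footprint
```

The remaining headroom is what allows the full 256K context window's KV cache to coexist with the weights on one device; heavier request loads would spill onto additional accelerators.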
Running on Personal Hardware – Easier Than You Think
For those who want to run Gemma 4 locally – on their personal computer or laptop – AMD offers two paths.
The first is through LM Studio. This is an application with a graphical user interface that allows you to download and run the model in just a few clicks. It works with Ryzen AI and Ryzen AI Max processors, as well as Radeon and Radeon PRO cards. For full acceleration, up-to-date AMD Software: Adrenalin Edition drivers are required.
The second path is through Lemonade Server. This is a more flexible option for those who want to interact with the model via an API compatible with the OpenAI format. Lemonade supports acceleration on both the GPU via ROCm and the NPU in Ryzen AI processors.
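Because the API follows the OpenAI format, any OpenAI-compatible client can talk to it. The sketch below only builds the request body; the base URL and model name are placeholders, not values confirmed by the article – check your Lemonade Server configuration for the real ones.

```python
import json

# Assumed local endpoint; the actual host, port, and path may differ.
BASE_URL = "http://localhost:8000/api/v1"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build the JSON body for a POST to {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

body = build_chat_request("gemma-4-e4b", "Summarize this paragraph ...")
print(json.dumps(body, indent=2))
```

The same payload works unchanged whether Lemonade routes inference to the GPU via ROCm or to the NPU, which is the point of standardizing on the OpenAI wire format.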
The NPU: A Story in Itself
The Neural Processing Unit (NPU) in Ryzen AI processors is a specialized chip within the processor, designed specifically for neural network tasks. It consumes significantly less power than a GPU, which is critical for a laptop's battery life.
Support for Gemma 4 on the NPU will arrive with the next Ryzen AI software update. Initially, two compact models will be available: Gemma 4 E2B and E4B. For developers, this support will be implemented through interfaces like OnnxRuntime, simplifying integration into their own applications.
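In ONNX Runtime terms, targeting the NPU versus the GPU comes down to which execution provider an application requests. The pure-Python sketch below shows that selection logic; the provider names follow common ONNX Runtime conventions, but the exact provider exposed for the Ryzen AI NPU is an assumption, not something stated in the article.

```python
# Pick the best available ONNX Runtime execution provider, preferring
# the NPU, then the ROCm GPU, then the CPU. Provider names are assumed.
def pick_provider(available: list[str],
                  preferred: tuple[str, ...] = (
                      "VitisAIExecutionProvider",  # assumed NPU provider
                      "ROCMExecutionProvider",     # GPU via ROCm
                      "CPUExecutionProvider",      # universal fallback
                  )) -> str:
    """Return the first preferred provider present in `available`."""
    for name in preferred:
        if name in available:
            return name
    return "CPUExecutionProvider"

# With onnxruntime installed, `available` would come from
# onnxruntime.get_available_providers().
print(pick_provider(["CPUExecutionProvider", "ROCMExecutionProvider"]))
# prints "ROCMExecutionProvider"
```

An application written this way degrades gracefully: the same code runs on a machine without an NPU, just on a slower (and more power-hungry) device.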
Why This Matters for Users
"Day-one" support is not just a marketing gimmick. Previously, users and developers often had to wait weeks or even months for a new model to appear in a user-friendly tool or to work on specific hardware. In this case, AMD synced up with Google's release in advance.
For the average user, this means they can try out the new model immediately – via LM Studio, without waiting for patches or updates. For developers, it means they can start building their own projects on Gemma 4 right away, without worrying about the supporting infrastructure lagging behind.
The open weights of Gemma 4, combined with broad hardware support, make it a viable option for those who want to run powerful language models locally – without cloud dependency and without needing a server rack on hand.