Published February 7, 2026

Perplexity Shows How to Train Trillion-Parameter Models on AWS

The Perplexity team has adapted a framework for training ultra-large neural networks to run on Amazon's cloud infrastructure. This allowed them to drop the rigid dependency on proprietary NVIDIA interconnect hardware and use standard networking instead.

Technical context: Infrastructure
Source: Perplexity AI · Reading time: 4–6 minutes

The Perplexity team has published an article detailing how they managed to adapt training technology for trillion-parameter models to run on the AWS cloud platform. Long story short: they took an existing approach that was hardwired to NVIDIA hardware and rewrote it to work efficiently on Amazon's standard network infrastructure.


The Trillion-Parameter Problem

Modern large language models just keep growing. Where a model with 100–200 billion parameters was considered massive a couple of years ago, we are now talking about a trillion or more. The problem is that such models physically do not fit into the memory of a single GPU, not even the most powerful one.

Therefore, the model has to be "spread out" across multiple devices. But once the GPU count reaches hundreds or thousands, another complication arises: the devices need to constantly exchange data with one another. If that link is slow, the entire training process turns into an endless waiting game.
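Some quick arithmetic shows why a single GPU is out of the question. Assuming a typical mixed-precision Adam setup (bf16 weights and gradients, fp32 master weights and optimizer moments, roughly 16 bytes per parameter before counting activations; these figures are illustrative, not Perplexity's stated configuration):

```python
# Back-of-the-envelope memory math for a 1-trillion-parameter model.
# Assumed layout (typical mixed-precision Adam training):
#   bf16 weights (2 B) + fp32 master weights (4 B)
#   + fp32 Adam moments (8 B) + bf16 gradients (2 B) = 16 B per parameter
PARAMS = 1_000_000_000_000
BYTES_PER_PARAM = 2 + 4 + 8 + 2

total_tb = PARAMS * BYTES_PER_PARAM / 1e12
gpu_memory_gb = 80  # e.g. an 80 GB accelerator
min_gpus = PARAMS * BYTES_PER_PARAM / (gpu_memory_gb * 1e9)

print(f"Training state: {total_tb:.0f} TB")              # → Training state: 16 TB
print(f"Minimum GPUs just to hold it: {min_gpus:.0f}")   # → Minimum GPUs just to hold it: 200
```

Two hundred GPUs is the floor just to store the training state; real clusters are larger still, which is exactly why inter-GPU communication speed becomes the bottleneck.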


How This Is Usually Solved

NVIDIA offers a technology called NVLink for such tasks. It is a specialized high-speed interconnect that links GPUs within a single server or, in newer NVSwitch-based systems, between servers. It is fast, but there is a catch: it is a proprietary solution that requires specific hardware and has poor compatibility with other platforms.

There is an open-source framework from NVIDIA called Megatron-LM, which can train huge models by distributing them across many GPUs. However, it was originally designed around NVLink. If you do not have access to that technology, you are, roughly speaking, out of the game.


What Perplexity Did

The Perplexity team decided to break this dependency. They rewrote parts of Megatron-LM so that the framework could operate over AWS EFA (Elastic Fabric Adapter), Amazon's networking technology for high-speed communication between servers in the cloud. EFA is exposed through the standard libfabric interface, so software written against it is not tied to a particular GPU interconnect vendor.

Now, trillion-parameter models can be trained on standard AWS cloud instances without requiring specific equipment from NVIDIA. This makes the process more flexible: you can rent capacity from Amazon, train the model, and not worry about the infrastructure being locked into a single vendor.
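In practice, GPU collectives reach EFA through the aws-ofi-nccl plugin and libfabric. A minimal sketch of the environment one commonly sets before launching such a job; note that the exact variable names and values depend on your libfabric, plugin, and driver versions and on the instance type, so treat these as illustrative defaults rather than the canonical configuration (and nothing here is taken from Perplexity's writeup):

```python
import os

def efa_nccl_env(debug: bool = False) -> dict:
    """Environment variables commonly used to route NCCL collectives
    over AWS EFA via the aws-ofi-nccl plugin. Illustrative defaults:
    the right knobs vary with libfabric / plugin versions."""
    env = {
        "FI_PROVIDER": "efa",           # ask libfabric for the EFA provider
        "FI_EFA_USE_DEVICE_RDMA": "1",  # GPUDirect RDMA over EFA where supported
    }
    if debug:
        env["NCCL_DEBUG"] = "INFO"      # log which transport NCCL actually picked
    return env

# Apply before initializing the NCCL process group:
os.environ.update(efa_nccl_env(debug=True))
```

With `NCCL_DEBUG=INFO`, the startup log tells you whether NCCL actually selected the EFA path or silently fell back to plain TCP, which is the first thing to check when throughput looks wrong.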


Why This Matters 🤔

First, it lowers the barrier to entry. Where training ultra-large models previously required either purchasing expensive NVLink-capable servers or renting them from a narrow circle of providers, it is now possible to use widely available cloud infrastructure.

Second, it is a matter of portability. When a framework works with only one technology, you effectively become its hostage. If a better offer from another cloud provider appears tomorrow, moving the training process there would be difficult or even impossible. Perplexity's solution makes development less dependent on a specific supplier.

Third, it opens up new opportunities for researchers and smaller teams who may not have the budget for exclusive hardware but do have access to major cloud platforms.


Under the Hood

Without diving too deep into the technical weeds: the primary work involved replacing the communication layer. Megatron-LM relies on NCCL (the NVIDIA Collective Communications Library) for data exchange between GPUs. NCCL is optimized for NVLink and NVIDIA's interconnect stack, and it can perform poorly over other kinds of network links.

The Perplexity team adapted the framework to use AWS EFA efficiently. According to them, this required rethinking some data distribution and synchronization algorithms, but they eventually achieved performance sufficient for training models at a trillion-parameter scale.
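To make the idea of "replacing the communication layer" concrete, here is a toy sketch (not Perplexity's code, and every name in it is invented for illustration) of a trainer-facing collective interface with two interchangeable backends: a naive one-shot reduction standing in for a fast-switch setup, and a bandwidth-optimal ring allreduce of the kind commodity networks favor, simulated in-process:

```python
from typing import List, Protocol

class CommBackend(Protocol):
    """The narrow surface a trainer needs: a sum-allreduce across ranks."""
    def allreduce(self, shards: List[List[float]]) -> List[List[float]]: ...

class NaiveBackend:
    """Stand-in for a switch-style backend: gather everything, reduce once."""
    def allreduce(self, shards):
        total = [sum(vals) for vals in zip(*shards)]
        return [total[:] for _ in shards]

class RingBackend:
    """Stand-in for a network-friendly backend: ring allreduce
    (reduce-scatter + allgather), simulated in-process."""
    def allreduce(self, shards):
        n = len(shards)
        k = len(shards[0])
        assert k % n == 0, "toy version: vector length divisible by rank count"
        chunk = k // n
        data = [s[:] for s in shards]

        def seg(c):
            return slice(c * chunk, (c + 1) * chunk)

        # Reduce-scatter: after n-1 steps, rank r holds the full sum
        # of chunk (r + 1) % n.
        for step in range(n - 1):
            sends = [(r, (r - step) % n, data[r][seg((r - step) % n)])
                     for r in range(n)]
            for r, c, payload in sends:
                dst = (r + 1) % n
                for i, v in enumerate(payload):
                    data[dst][c * chunk + i] += v

        # Allgather: circulate the fully reduced chunks around the ring.
        for step in range(n - 1):
            sends = [(r, (r + 1 - step) % n, data[r][seg((r + 1 - step) % n)])
                     for r in range(n)]
            for r, c, payload in sends:
                dst = (r + 1) % n
                for i, v in enumerate(payload):
                    data[dst][c * chunk + i] = v
        return data

# Both backends produce identical results; trainer code written against
# CommBackend never changes -- only the transport underneath does.
grads = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
assert NaiveBackend().allreduce(grads) == RingBackend().allreduce(grads)
```

The design point is that the framework above this interface stays oblivious to whether the bytes travel over NVLink, InfiniBand, or EFA; only the backend and its communication schedule change.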


Limitations and Questions

It is important to understand that this is not a magic bullet. Perplexity does not claim that their approach is faster or more efficient than training over NVLink. Rather, it is a compromise: you gain flexibility and hardware independence, but you may sacrifice some raw performance.
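To get a rough feel for that trade-off, here is some arithmetic with illustrative, not measured, bandwidth figures (an NVLink-class link moves on the order of hundreds of GB/s per GPU, while one GPU's share of a node's EFA bandwidth is typically tens of GB/s; neither number comes from Perplexity's article):

```python
def ring_allreduce_seconds(buf_bytes: float, n_gpus: int, bw_bytes_per_s: float) -> float:
    """Lower-bound transfer time for a bandwidth-bound ring allreduce:
    each GPU moves ~2*(n-1)/n of the buffer through its link
    (latency and overlap with compute are ignored)."""
    return 2 * (n_gpus - 1) / n_gpus * buf_bytes / bw_bytes_per_s

bucket = 2e9        # a 2 GB gradient bucket (illustrative)
nvlink_bw = 900e9   # ~900 GB/s per GPU, NVLink-class (illustrative)
efa_bw = 50e9       # ~50 GB/s per-GPU share of node EFA bandwidth (illustrative)

t_fast = ring_allreduce_seconds(bucket, 8, nvlink_bw)
t_slow = ring_allreduce_seconds(bucket, 8, efa_bw)
print(f"NVLink-class: {t_fast*1e3:.1f} ms, EFA-class: {t_slow*1e3:.1f} ms")
# → NVLink-class: 3.9 ms, EFA-class: 70.0 ms
```

The gap scales with the bandwidth ratio, which is why adapting the data distribution and synchronization schedule, rather than just swapping the transport, was the hard part of the work.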

There also remains the question of how easily this approach scales to other cloud platforms. AWS EFA is still a proprietary solution from one specific provider. If someone wants to repeat a similar trick on Google Cloud or Azure, additional adaptation for their network protocols will be required.

Finally, Perplexity's article is more of a description of a concept and an architectural approach than a ready-to-use open-source tool. It is still unclear whether the company plans to release the code to the public or if it will remain an internal development.


What This Means for the Industry

Perplexity's work shows that dependency on closed technologies is not a death sentence. Even in resource-intensive tasks like training trillion-parameter models, there are paths toward greater openness and cross-platform compatibility.

This is especially relevant now, as the cost of training neural networks continues to rise and competition between cloud giants intensifies. The ability to choose a platform without being tied to specific hardware could be a deciding factor for many developers.

We will see if other companies follow this example and how widely such an approach takes root in the industry in the coming years.

Original Title: Enabling Trillion-Parameter Models on AWS EFA
Publication Date: Feb 6, 2026
Perplexity AI research.perplexity.ai A U.S.-based company developing an AI-powered search engine with source-based answers.

From Source to Analysis

How This Text Was Created

This material is not a direct retelling of the original publication. First, the news item was selected as an event important for understanding AI development. Then a processing framework was defined: what needed clarification, what context to add, and where to place emphasis. This turned a single announcement into a coherent, meaningful analysis.

Neural Networks Involved in the Process

We openly show which models were used at different stages of processing. Each performed its own role — analyzing the source, rewriting, fact-checking, and visual interpretation. This approach maintains transparency and clearly demonstrates how technologies participated in creating the material.

1. Claude Sonnet 4.5 (Anthropic): Analyzing the Original Publication and Writing the Text. The neural network studies the original material and generates a coherent text.

2. Gemini 3 Pro (Google DeepMind): Translation into English.

3. Gemini 3 Flash Preview (Google DeepMind): Text Review and Editing. Correction of errors, inaccuracies, and ambiguous phrasing.

4. DeepSeek-V3.2 (DeepSeek): Preparing the Illustration Description. Generating a textual prompt for the visual model.

5. FLUX.2 Pro (Black Forest Labs): Creating the Illustration. Generating an image based on the prepared prompt.

