Topic hub

AI training

AI: Events

How to Train Large Language Models Without Constantly Babysitting the Terminal

Technical context • Infrastructure

AMD demonstrates how to set up LLM training on GPU clusters so that failures are handled automatically, eliminating the need for manual intervention.

AMD • www.amd.com • Mar 4, 2026

AI: Events

How to Train an Image Generation Model in 24 Hours: The Photoroom Team's Experience

Development

The Photoroom team shares how they managed to train their own image generation model in just 24 hours and what the results were.

Hugging Face • huggingface.co • Mar 4, 2026

AI: Events

Instant Neural Network Updates: How Doc-to-LoRA and Text-to-LoRA Are Changing the Game

Research

Sakana AI has proposed a method to instantly update the knowledge of language models without costly retraining – by generating adapters directly from text.

Sakana AI • sakana.ai • Mar 2, 2026

AI: Events

What Is a Mixture of Experts and Why Is Everyone Talking About It?

Development

We explain how the Mixture of Experts architecture works: an approach that increases a model's capacity without a proportional increase in the compute spent on each token.

Hugging Face • huggingface.co • Feb 26, 2026
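
To make the Mixture of Experts idea concrete, here is a minimal sketch of top-k routing in plain Python. It is an illustration under stated assumptions, not the method from the linked article: the function names (`route_top_k`, `moe_forward`), the toy experts, and the router scores are all hypothetical, and real models use learned routers and feed-forward sub-networks. The point is only that each token is processed by a few selected experts rather than by all of them.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are the softmax probabilities of the selected experts,
    renormalized so that they sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the k selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Four toy "experts" (assumption: stand-ins for feed-forward sub-networks).
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]

# Hypothetical router scores for a single token.
logits = [0.1, 2.0, 2.0, -1.0]

selected = route_top_k(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
print(selected, output)
```

With k=2, only two of the four experts run for this token, so the compute per token stays roughly constant as more experts are added.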

AI: Events

How to Make Small Language Models Think Better: AMD's Experience with Synthetic Data

Development

AMD has introduced LuminaSFT, an approach that uses synthetic data to fine-tune small language models and achieve surprisingly high performance.

AMD • www.amd.com • Feb 26, 2026

AI: Events

JAX-AITER: How AMD Is Simplifying Fast AI Model Development on Its GPUs

Development

AMD has released JAX-AITER, a library of pre-built, optimized computational blocks for developing large AI models on AMD GPUs using the JAX framework.

AMD • www.amd.com • Feb 26, 2026

AI: Events

Zero Bubbles and Flexible Pipelines: How AMD Accelerates Large Language Model Training

Technical context • Infrastructure

AMD has unveiled Primus, an implementation of pipeline parallelism for large model training that eliminates pipeline bubbles (periods when GPUs sit idle) and flexibly adapts to various tasks.

AMD • www.amd.com • Feb 24, 2026

AI: Events

Tencent Hunyuan Reveals How to Pinpoint Bottlenecks in Language Model Training

Development

Researchers from Tencent have developed a tool that helps to precisely identify where failures occur during reinforcement learning model training.

Tencent • hunyuan.tencent.com • Feb 14, 2026

AI: Events

Olmix: Allen AI's Approach to Data Mixing Across All Stages of Language Model Training

Development

Allen AI has introduced Olmix, an open-source framework for data mixing in the language model training process, including pre-training, instruction tuning, and alignment.

Ai2 • allenai.org • Feb 13, 2026
