Topic hub: scaling

AI: Events

How to Generate 2K Video Fast: The Two-Stage SANA-Video Approach

Research

An MIT team has developed a two-stage processing scheme that generates 2K video at the same speed as standard 720p generation.

MIT HAN Lab • hanlab.mit.edu • Feb 12, 2026

How to Cut Language Model Training Time by 25% Without Quality Loss

Research

Specialists at AI21 Labs have demonstrated that a simple data-packing optimization significantly speeds up LLM training without altering the neural network architecture.

AI21 Labs • www.ai21.com • Feb 12, 2026

Unsloth Speeds Up MoE Model Training 12x and Boosts Context Window

Technical context • Development

Unsloth's new kernels and mathematical optimizations cut memory requirements by 35%, speed up training 12x, and enable context windows six times longer than before.

Unsloth • unsloth.ai • Feb 11, 2026

Oracle Cools AI Servers Using Water That Never Needs Changing

Infrastructure

Oracle's AI data centers use a closed-loop cooling system in which water circulates without evaporating or needing refills: the loop is filled just once.

Oracle • www.oracle.com • Feb 11, 2026

AMD Shows How to Train Large Models Without the Fear of Losing Progress to a Single Crash

Infrastructure

The new pairing of TorchFT and TorchTitan allows model training on AMD GPUs to continue even after cluster node failures – without a full process restart.

AMD • www.amd.com • Feb 10, 2026

Perplexity Shows How to Train Trillion-Parameter Models on AWS

Technical context • Infrastructure

The Perplexity team has adapted a framework for training ultra-large neural networks to run on Amazon's cloud infrastructure. This allowed them to eliminate the rigid dependency on proprietary NVIDIA hardware and use standard networking solutions instead.

Perplexity AI • research.perplexity.ai • Feb 7, 2026

RDMA for Language Models: When Servers Learn to Talk Directly to Each Other

Technical context • Infrastructure

The Perplexity AI team has demonstrated how direct server-to-server data transfer (RDMA) helps language models run faster and more efficiently by eliminating bottlenecks in the network infrastructure.

Perplexity AI • research.perplexity.ai • Feb 7, 2026

Zyphra Finds a Way to Make Neural Network Attention Mechanisms Faster and More Efficient

Technical context • Infrastructure

Zyphra's new OVQ-attention layer aims to reduce memory and computational overhead when working with long contexts while maintaining high sequence processing quality.

Zyphra • www.zyphra.com • Feb 6, 2026

How to Scale vLLM and Avoid Out-of-Memory Errors

Technical context • Infrastructure

The AI21 Labs team shared their experience optimizing vLLM, a popular tool for deploying language models that often runs into critical out-of-memory errors when scaled.

AI21 Labs • www.ai21.com • Feb 6, 2026
