Topic #large language model optimization

AI: Events

SGLang at NVIDIA GTC 2026: A Behind-the-Scenes Look at a Top AI Conference

Technical context • Infrastructure

SGLang was prominently featured at NVIDIA GTC 2026 in multiple formats, from a mention in the keynote to a 200-person meetup and a hands-on lab.

LMSYS ORG • lmsys.org • Apr 1, 2026

AI: Events

Aurora: How AI Learned to Predict Its Responses and Continuously Improve

Technical context • Infrastructure

Together AI has introduced Aurora, an open-source framework that turns language model acceleration into a self-learning system that improves on the fly.

Together.ai • www.together.ai • Apr 1, 2026

AI: Events

When Documents Are Too Long: How Small Models Can Outperform Large Ones

Research

Researchers have demonstrated that small language models can outperform GPT-4o when processing long texts by breaking down tasks and distributing the work among multiple agents.

Together.ai • www.together.ai • Mar 27, 2026

AI: Events

Smart Selectivity: How a Hybrid Neural Network Remembers Only What's Important

Technical context • Research

A new approach to neural network architecture dramatically reduces memory consumption for text processing without sacrificing comprehension quality.

Zyphra • www.zyphra.com • Mar 26, 2026

AI: Events

Fault Tolerance in Large Language Models: How DeepSeek Learns to Handle Failures

Technical context • Infrastructure

SGLang developers have introduced a partial fault tolerance mechanism for MoE models, so the failure of a single node no longer brings down the entire system.

LMSYS ORG • lmsys.org • Mar 26, 2026

AI: Events

How to Adapt a Large AI Model for Dozens of Languages and Cultures: The Sakana AI Approach

Research

Japanese lab Sakana AI has developed a technology to adapt large, general-purpose language models for specific languages and cultures.

Sakana AI • sakana.ai • Mar 24, 2026

AI: Events

TorchSpec: Accelerating Large Language Models Without Sacrificing Quality

Technical context • Development

The PyTorch team has introduced TorchSpec, a tool designed to simplify training for speculative decoding, a technique that speeds up large language model inference.

PyTorch • pytorch.org • Mar 21, 2026

AI: Events

MR3: A Model That Evaluates AI Responses in Dozens of Languages Without Predefined Rules

Technical context • Research

Researchers have introduced the MR3 model, which evaluates the quality of language model responses across multiple languages – without rigid criteria or evaluation templates.

Capital One • www.capitalone.com • Mar 16, 2026

AI: Events

SGLang Supports New NVIDIA Model from Day One: Implications for AI Agents

Infrastructure

SGLang added support for the NVIDIA Nemotron 3 Super model on the day of its release, simplifying the creation of multi-agent systems based on efficient language models.

LMSYS ORG • lmsys.org • Mar 12, 2026