Topic hub: Infrastructure

AI: Events

25x Inference Speedup: What's Happening with AI Performance on New NVIDIA Hardware

Infrastructure

The new NVIDIA GB300 NVL72 rack-scale system, paired with the SGLang framework, has demonstrated a 25x inference speedup when running language models.

LMSYS ORG (lmsys.org), Mar 4, 2026

AI: Events

How AMD Is Teaching Neural Networks to Work Together: Ray and ROCm 7 for Large-Scale ML Tasks

Technical context • Infrastructure

AMD has explained how to run distributed ML workloads on GPUs using Ray and ROCm 7 – from model training to building agent-based systems.

AMD (www.amd.com), Mar 4, 2026

AI: Events

Mistral Document AI in Microsoft Foundry: Implications for Document Processing

Products

Mistral Document AI is now integrated into Microsoft Foundry. This solution aims to automate the processing of complex documents, supporting multiple languages and formats.

Microsoft (www.microsoft.com), Mar 4, 2026

AI: Events

Qualcomm Unveils AI200 Rack: A Turnkey Solution for Large AI Models

Infrastructure

Qualcomm has introduced an integrated infrastructure package for running large AI models: a server rack, expansion cards, and a management system delivered as a single solution.

Qualcomm (www.qualcomm.com), Mar 2, 2026

NeuroBlog

Free AI: Why Algorithms Feed Us for Nothing, and Feed on Us Themselves

Artificial intelligence • AI Development

Neural networks are given away for free, but the price of this gift is the invisible currency of our thoughts, words, and habits – one that is making corporations the gods of a new world.

Tanya Sky, Mar 1, 2026

AI: Events

Perplexity Releases Its Own Models for Searching Massive Text Datasets

Products

Perplexity has released two new semantic search models designed to find information quickly and accurately across billions of documents.

Perplexity AI (research.perplexity.ai), Feb 27, 2026

AI: Events

A Trillion Parameters on Consumer Hardware: AMD Shows How to Run a Giant Language Model Locally

Infrastructure

AMD has explained how to run a trillion-parameter language model on a cluster of consumer devices – without the cloud or server farms.

AMD (www.amd.com), Feb 27, 2026

AI: Events

Offline Tuning in PyTorch: Accelerating Neural Networks Before Their First Run

Technical context • Infrastructure

An exploration of how PyTorch's TunableOp can select the best-performing operator implementations offline, before a model's first run, and why this is a valuable practice.

AMD (www.amd.com), Feb 26, 2026

AI: Events

Cache as a Resource: How Alibaba Cloud Teaches AI Not to Calculate the Same Thing Twice

Technical context • Infrastructure

Alibaba Cloud has introduced a cache-aware request routing mechanism for language models that improves cache reuse in distributed inference.

Alibaba Cloud (www.alibabacloud.com), Feb 26, 2026
