AMD has shared how to automate failure diagnostics in large-scale AI model training using an LLM-based agent system.
Hume AI has open-sourced TADA, a speech model that performs frame-by-frame alignment of text and audio, making speech synthesis fast and predictable.
AI: Events
On-the-Fly 'Brain Swap': Tencent Teaches AI Models to Adapt to New Tasks in Real Time
Research
Researchers from Tencent Hunyuan have proposed a new approach to AI model adaptation: generating new parameters in real time without retraining or replacing existing weights.
OpenAI has introduced GPT-5.4, a new model with enhanced capabilities for working with code, tools, and large volumes of text.
Researchers have demonstrated how to fine-tune AI models to replace complex physics simulations, an approach that is faster and cheaper than running the calculations from scratch.
AI: Events
OLMo Hybrid: Transformers and Recurrent Networks Join Forces
Technical context • Research
Allen AI has introduced OLMo Hybrid, an open language model that combines two architectures for more efficient processing of long texts.
AI: Events
DeepSpeed Learns to Train Complex AI Models More Efficiently: What's Changed and Why It Matters
Technical context • Development
DeepSpeed has received two significant updates: support for training multimodal models and a memory-saving mode using low-precision computations.
NXP and Hugging Face explain how to train robotic artificial intelligence on custom data and run it on a low-power embedded device.
AI: Events
How AMD Optimizes Recommendation Model Training: A Simple Guide to a Complex Task
Technical context • Infrastructure
AMD has shared its approach to simplifying the training of recommendation systems, the algorithms that select movies, products, and news for us, on its GPUs.