Mistral has introduced Voxtral TTS – an open-weight text-to-speech model that adapts to a voice in seconds and sounds as natural as a human.
Inception Labs has introduced Mercury 2 – a diffusion language model that runs fast and at low cost, paving the way for a new approach to building AI agents.
AI: Events
How Voice AI Knows When You've Finished Speaking – and Why It's More Important Than You Think
Development
A look at why the “end-of-speech” moment is so hard for voice AI to detect, and how getting it wrong can ruin the entire user experience.
AI: Events
How to Adapt a Large AI Model for Dozens of Languages and Cultures: The Sakana AI Approach
Research
Japanese lab Sakana AI has developed a technology to adapt large, general-purpose language models for specific languages and cultures.
Researchers have proposed a new approach to evaluating voice AI agents that considers not only the accuracy of answers but also the quality of a live conversation.
The Korean company Upstage has released the Solar Pro 3 language model, which handles multi-step agentic tasks twice as effectively as its predecessor.
The popular method for comparing AI transcription services isn't as objective as it seems – we'll explore where it falls short.
Scale AI has launched Voice Showdown, a benchmark for evaluating voice AI models based on real human preferences and live speech.
Two research papers from the Typhoon team have been accepted to the EACL 2026 conference. They focus on evaluating speech models and handling long audio recordings.