We explore the concept of "audio intelligence" and why a machine's ability to understand speech goes beyond mere word transcription.
AssemblyAI has unveiled technology that can identify which participant is speaking in real time, even in crowded meetings.
DynaGuard is a system of AI safety models that evaluates text based on user-defined rules rather than rigid templates.
AI: Events
M4-RAG: When AI Seeks Answers in Images, Not Just Text, and Across Multiple Languages
Research
Researchers have introduced M4-RAG, a large-scale benchmark for evaluating systems that answer questions about images by drawing on external knowledge and operating in multiple languages.
AI: Events
MR3: A Model That Evaluates AI Responses in Dozens of Languages Without Predefined Rules
Technical context • Research
Researchers have introduced the MR3 model, which evaluates the quality of language model responses across multiple languages – without rigid criteria or evaluation templates.
A look at how modern speech recognition and intent analysis systems are transforming contact centers and why it matters to every customer.
Researchers have proposed a new approach to evaluating the quality of AI responses that, instead of returning a simple "yes/no" verdict, attempts to identify the reasons behind errors.
AI: Events
Speech Recognition in Noise: Why Systems Perform Well in Tests but Fail in the Real World
Development
We explore why speech recognition systems perform well in tests but struggle in real-world conditions with background noise.
Lab
A Voice at the Appointment: Why AI Can't Make Out the Doctor
Electrical Engineering & System Sciences
Researchers tested whether AI systems can comprehend real-world medical conversations – and the results delivered a harsh verdict for the entire industry.