Topic hub

Audio Manipulation

Sound is rarely just a backdrop; more often it is an object of close study that demands specific tools and a trained ear. This selection focuses on extracting meaning from the audio stream: from capturing field recordings and restoring archival tapes to complex editing and acoustic analysis. Working with the sonic environment is treated here not as technical routine but as a method of investigating reality, where the precision of processing directly shapes how deeply the material can be perceived.

AI: Events

Typhoon at EACL 2026: Advancing Audio-Language Model Research

Research

Two research papers from the Typhoon team have been accepted to the EACL 2026 conference. They focus on evaluating speech models and handling long audio recordings, addressing key challenges in the field.

Typhoon (opentyphoon.ai), Mar 21, 2026

A Small Model That Hears Better: Turning a Multimodal AI into an Effective Audio Embedder

Research

Researchers have demonstrated how to transform a large multimodal model into a compact audio tool that surpasses competitors while being trained on 25 times less data.

Jina AI (jina.ai), Mar 20, 2026

Yandex AI Studio Teaches Agents to Search Files, Including Video and Audio

Products

Yandex AI Studio has updated its file search tool, enabling AI agents to work with tables, audio, and video to find information in corporate knowledge bases.

Yandex Cloud (yandex.cloud), Mar 19, 2026

How AI Learns to 'Hear' What Matters: Extracting Data from Live Speech in Real Time

Development

We explore how modern speech recognition systems have learned to extract specific data (phone numbers, addresses, and emails) from conversations on the fly.

AssemblyAI (www.assemblyai.com), Mar 19, 2026

Universal-3 Pro by AssemblyAI: One Model, Six Languages, No Switching

Products

AssemblyAI has released the Universal-3 Pro model, which supports six languages and allows switching between them mid-speech without manual adjustments.

AssemblyAI (www.assemblyai.com), Mar 18, 2026

AssemblyAI Launches Real-Time Streaming Speaker Diarization

Products

AssemblyAI has unveiled technology that can identify which participant is speaking in real time, even in crowded meetings.

AssemblyAI (www.assemblyai.com), Mar 17, 2026

Hume AI Open-Sources TADA, a Model for Synchronizing Text and Audio

Development

Hume AI has open-sourced TADA, a speech model that performs frame-by-frame alignment of text and audio, making speech synthesis fast and predictable.

Hume AI (www.hume.ai), Mar 10, 2026

ElevenLabs Launches In-Browser Audiobook Creation Tool

Products

A new feature in ElevenCreative lets you turn text into a finished audiobook without setting foot in a recording studio or hiring professional narrators.

ElevenLabs (elevenlabs.io), Feb 10, 2026

Bulbul V3: An Indian Model for Speech Synthesis in 15 Languages

Products

Indian startup Sarvam AI has unveiled Bulbul V3 – a speech synthesis model supporting 15 languages and capable of voice cloning from a short audio sample.

Sarvam (www.sarvam.ai), Feb 9, 2026
