Indian developers have unveiled an audio model that doesn't just transcribe speech – it understands the context of the conversation and adapts its output format accordingly.
Indian company Sarvam AI has unveiled a system for automatically dubbing videos into regional languages while preserving the original intonations and synchronizing lip movements.
Version 1.2 expands editing and audio capabilities in the Suno Studio generative workstation, providing users with more control over the final mix.
A Brazilian engineer explains how the new DARC model lets you control drum rhythms by beatboxing without losing musical harmony – much like conducting a samba with hand gestures.
Lab
How to Teach a Neural Network to Shred: From Clean Tone to Full Distortion in 5 Seconds
Electrical Engineering & System Sciences
An Engineer's Take on Morphing Guitar Effects with Neural Networks: From the Math of Spherical Interpolation to Real-World Application at -40°C.
Lab
How We Teach Computers to Distinguish Real Voices from Fake Ones: The Multilingual Deepfake Problem
Electrical Engineering & System Sciences
A new study shows how training an AI system on audio recordings in nine languages helps it recognize deepfakes more effectively.
A look at the technical nuts and bolts of making music with neural networks, from algorithms to finished tracks, without romanticizing the process.
Lab
How to Make Artificial Intelligence Speak More Economically: Variable-Rate Speech Codecs
Electrical Engineering & System Sciences
A new speech codec technology adapts its processing rate to the complexity of the signal, saving resources without loss of audio quality.
The SEED diffusion model improves voice recognition in real-world conditions by 19.6% without rebuilding systems or requiring speaker labels.