AMD has published benchmark results for its new Instinct MI355X GPU in neural network inference tasks, covering both single-node and distributed system performance.
Discover how Arabic evolved from a secondary tier in AI to an equal partner, and why this matters for more than just Arabic-speaking users.
Lab
Generalizing Generalization: When Neural Networks Learn to Predict – But Not What We Expected
Computer Science
Let's figure out why a language model's success on one test outside of training doesn't guarantee a win on another – and what this means for real-world AI applications.
Anthropic has released Claude Opus 4.5, an updated flagship model that runs faster and handles complex tasks more effectively than its predecessors.
Учёные создали тест для ИИ на понимание фильмов: модели должны отличать правду от лжи о сюжете, но пока проваливаются даже на «Касабланке».
Новые модели RRM учатся рассуждать перед выставлением оценки, как умный учитель, который объясняет каждую отметку – и работают точнее обычных судей на 15-20%.