The new DRACO benchmark evaluates how accurately, thoroughly, and objectively AI systems handle complex topic exploration across various fields of knowledge.
Microsoft has introduced a method for detecting hidden vulnerabilities in open-source language models, along with a tool for mass scanning.
AI: Events
Why Autonomous AI Needs a Data Platform, Not Just a Large Model
Technical context • Infrastructure
AMD explains why true AI autonomy doesn't start with algorithms, but with a sound data strategy and a unified platform to harness it.
Why generative AI in banking and fintech requires a special approach to data – and where context engineering comes in.
Elastic shared how it uses artificial intelligence to speed up tech support responses, with every answer verified by engineers before being sent to the client.
The L7 Center conducted an independent study of the "Celsus" algorithm on mammography images – the system demonstrated high accuracy in detecting breast pathologies.
Lab
How to Teach a Compressor to Forgive: Why Your Files Won't Unzip Because of a Single Tiny Calculation Error
Mathematics & Statistics
The new PMATIC algorithm solves a sticky problem where the slightest calculation inaccuracy turns a compressed file into digital garbage – all without sacrificing quality.
AI21 Labs explained why creating a model that simply does its job without surprises turned out to be harder than it seems.
Lab
Generalizing Generalization: When Neural Networks Learn to Predict – But Not What We Expected
Computer Science
Let's figure out why a language model's success on one test outside its training distribution doesn't guarantee success on another – and what this means for real-world AI applications.