AMD has showcased its GP-MoLFormer model, which generates molecular structures. Let's delve into how it works and why it's important.
AI: Events
What Affects Text-to-Image Model Quality: PhotoRoom's Research on Important Training Details
Technical context • Research
The PhotoRoom team tested which decisions in diffusion model training actually help and which can be simplified without losing quality.
The Pruna AI team has accelerated image generation in the FLUX.2 [flex] model threefold without compromising quality. We explain how this was achieved and what it means for users.
AI: Events
How a Single Token Broke an Entire Model: The Story of a vLLM Bug
Technical context • Infrastructure
Engineers at AI21 Labs discovered a bizarre bug in vLLM that turned the Jamba model's normal responses into gibberish – and it was all down to a single incorrect token.
We explore the architectural solutions developers of Chinese open-source models are choosing and why decoder-based approaches continue to dominate the ecosystem.
AI: Events
Trinity Large: What's Inside and Why Arcee Released Three Versions of the Same Model
Technical context • Products
We dive into Trinity Large, a new language model from Arcee AI with a sparse architecture and three checkpoints to choose from.
AI: Events
Open Coding Agents: AI Code Assistants That Work With Any Repository
Technical context • Development
The Allen Institute for AI has unveiled Open Coding Agents – open-source models for autonomous coding that adapt to a project's structure.
AMD has introduced Nitro-AR, an autoregressive model that generates images faster than its diffusion counterparts and uses less memory.
MiniMax has discussed its approach to fine-tuning language models that do more than just answer questions – they execute complex tasks by interacting with tools.