Information retrieval is one of those tasks where AI has long since moved beyond assisting and now handles most of the workload. However, most search models have limitations: some work only with text, some understand a limited set of languages, and others search well within a single modality but struggle to connect images and words. Mixedbread decided to combine all of these capabilities into a single model and introduced Wholembed v3.
One Model Instead of Several
Simply put, Wholembed v3 is a unified search model that works with two data formats at once: text and images. It also understands queries and documents in different languages, eliminating the need for separate solutions for each case.
Previously, if you needed to set up a search across a multilingual database containing both images and text, you had to either combine several highly specialized models or make a compromise – for example, sacrificing language coverage for image support. Wholembed v3 is positioned as the ideal answer to this scenario: a single model that handles everything at once.
What Does an "Omnimodal" Model Mean?
The word "omnimodal" in the model's description means that it perceives and matches different types of input data – not just text with text, but also text with images. For example, you can submit a text query and receive relevant images in response, or vice versa – provide an image and find suitable text descriptions.
This is useful in a wide variety of situations: from searching a product catalog to systems where documents contain a mix of text and visual content, such as slides, infographics, or scanned pages.
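The text-to-image matching described above can be sketched as follows. This is a minimal illustration, not Mixedbread's API: it assumes that text and images are mapped to single vectors in one shared space and compared by cosine similarity, which the announcement does not confirm for Wholembed v3. The index entries, file names, and three-dimensional vectors are all made up for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical mixed index. In a real system these vectors would come from
# the embedding model; here they are hand-written toy values.
INDEX = [
    {"id": "img_red_sneakers.jpg", "kind": "image", "vector": [0.9, 0.1, 0.0]},
    {"id": "img_blue_backpack.jpg", "kind": "image", "vector": [0.1, 0.9, 0.1]},
    {"id": "doc_return_policy.txt", "kind": "text", "vector": [0.0, 0.2, 0.9]},
]

def search(query_vector, index, top_k=2, kind=None):
    """Rank index entries by similarity to the query, optionally filtering by kind."""
    candidates = [item for item in index if kind is None or item["kind"] == kind]
    ranked = sorted(
        candidates,
        key=lambda item: cosine(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["id"] for item in ranked[:top_k]]

# Assumption: this stands in for the embedding of a text query
# such as "red running shoes".
query_vector = [0.8, 0.2, 0.1]

results = search(query_vector, INDEX, top_k=2, kind="image")
print(results)
```

Because text and images share one index in this sketch, the same `search` function also covers the reverse direction: pass a vector derived from an image and set `kind="text"`.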
Multilingualism Without Caveats
Wholembed v3 places special emphasis on multilingual support. The model is trained to work with a large number of languages, which makes it possible to build search systems that are not tied to English as the primary language. This matters for users in different countries and for companies operating in international markets: there's no need to translate queries or maintain a separate index for each language.
A Claim for Best-in-Class Results
Mixedbread claims that Wholembed v3 sets a new quality benchmark for search tasks – across languages, modalities, and real-world use cases. This is a rather bold statement, but it aligns with the company's focus: Mixedbread specializes in search and retrieval models, and Wholembed v3 is its new-generation flagship product.
In short: the model's goal is not just to perform well in laboratory conditions, but to deliver results on real-world data, where queries can be in different languages and documents in various formats.
Who Might Need This?
First and foremost, it's for developers and teams building search engines, RAG pipelines (where a model first retrieves relevant snippets from a knowledge base and then generates an answer based on them), or any application that needs to find relevant content based on a query.
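The retrieve-then-generate flow of a RAG pipeline can be sketched roughly like this. Everything here is an assumption made for illustration: `embed` is a word-count stand-in rather than a real embedding model, the knowledge base is invented, and the generation step is reduced to building the prompt that would be handed to a language model.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in for an embedding model: plain word counts (illustration only)."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical knowledge base of short snippets.
KNOWLEDGE_BASE = [
    "Refunds are issued within 14 days of the return request.",
    "Standard shipping takes three to five business days.",
    "Support is available by email on weekdays.",
]

def retrieve(question, snippets, top_k=1):
    """Step 1: find the snippets most similar to the question."""
    query = embed(question)
    ranked = sorted(snippets, key=lambda s: similarity(query, embed(s)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, snippets):
    """Step 2: pack the retrieved snippets into a prompt for a generator model."""
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "How long does shipping take?"
top_snippets = retrieve(question, KNOWLEDGE_BASE, top_k=1)
prompt = build_prompt(question, top_snippets)
print(prompt)
```

In a real pipeline the search model would replace `embed` and `similarity`, and the prompt would be sent to a language model; the retrieval step is where a model like Wholembed v3 would sit.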
Until now, such systems often required selecting several separate models for different tasks. Wholembed v3 offers a different approach: instead of a chain of specialized solutions, it provides a single model that covers all major search scenarios.
Only time and real-world use will show how well this holds up in practice. But the direction itself is clear: to simplify search infrastructure without sacrificing quality.