When a company plans to implement AI-powered search over its internal documents, a challenging question arises almost immediately: how can you verify the system's effectiveness without sharing your data with third-party organizations? Demonstrating on real corporate files is risky, while testing on abstract examples yields results that say little about performance in real-world conditions.
It is in this context that EDiTh emerged: an open-source benchmark from LightOn, designed specifically for evaluating corporate search.
Definition and Importance of AI Benchmarks for Corporate Search
What is a benchmark and why is it needed?
Simply put, a benchmark is a set of tests with known correct answers. By running a system against it, you can see how accurately it performs its tasks and compare different solutions on the same scale.
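The idea above can be sketched in a few lines of Python: a fixed question set with known reference answers, and a single accuracy score that puts any system on the same scale. Everything here (the questions, the two toy "systems," the exact-match scoring) is purely illustrative and has no relation to EDiTh's actual data or API.

```python
# Minimal sketch of how a benchmark scores a system: a fixed set of
# questions with known answers, and an accuracy computed over them.
# All names and data here are illustrative, not EDiTh's actual API.

from typing import Callable

# A benchmark is just questions paired with reference answers.
TEST_SET = [
    ("Who approves travel expenses?", "the department head"),
    ("What is the invoice payment term?", "30 days"),
    ("Which team owns the security policy?", "IT compliance"),
]

def evaluate(system: Callable[[str], str]) -> float:
    """Run every question through the system and score exact matches."""
    correct = sum(
        system(question).strip().lower() == answer.lower()
        for question, answer in TEST_SET
    )
    return correct / len(TEST_SET)

# Two toy "search systems" answering from canned knowledge.
def system_a(question: str) -> str:
    answers = {
        "Who approves travel expenses?": "the department head",
        "What is the invoice payment term?": "30 days",
        "Which team owns the security policy?": "legal",  # wrong answer
    }
    return answers.get(question, "")

def system_b(question: str) -> str:
    return "30 days" if "payment term" in question else "unknown"

# The same scale for both systems makes them directly comparable.
print(f"system A accuracy: {evaluate(system_a):.2f}")  # 0.67
print(f"system B accuracy: {evaluate(system_b):.2f}")  # 0.33
```

Exact-match scoring is the simplest possible grading rule; real benchmarks typically use softer metrics, but the principle of a shared test set with known answers is the same.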
In the world of consumer AI tools, there are many such tests. But corporate search is a different story. Here, we aren't talking about finding an article on the internet, but about answering a specific question from a manager based on a company's internal documentation: contracts, reports, policies, and correspondence. Such data is not typically made public, so a proper standard for evaluating these systems hasn't existed until now.
Challenges of Evaluating Enterprise Search Systems
Why was this difficult to solve before?
Companies looking to select or evaluate a corporate search system found themselves in a bind. They could either test the product on their real documents, which raised security and confidentiality concerns, or they could use public datasets, but then the results wouldn't reflect how the system would perform with their actual work materials.
LightOn's goal was to create a compromise solution: documents that feel like real corporate files but contain nothing real or sensitive. In other words, it's a convincing simulation that allows for honest testing.
Key Features and Structure of the EDiTh Benchmark
What's inside EDiTh?
EDiTh is built on synthetic documents – that is, they are generated rather than taken from real archives. But this isn't just random text. The documents mimic typical corporate formats: internal reports, business letters, policies, and financial summaries. In terms of structure and content, they are very similar to what employees work with every day.
These documents come with a set of questions – the kind that managers or analysts would actually ask. Not “find the word,” but rather “what does the contract say about liability periods?” or “what risks are mentioned in the quarterly report?” It is these kinds of questions that pose a real challenge for search systems.
For each question, there is a correct answer, which allows for an objective assessment of how well the system is performing. This is the essence of a benchmark.
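To make that structure concrete, here is a hypothetical sketch of what one benchmark item might look like (a question, the document that contains the answer, and a reference answer) together with a naive keyword retriever scored by recall@1. The field names, the tiny corpus, and the retriever are all assumptions for exposition, not EDiTh's actual schema or method.

```python
# Illustrative shape of a document/question/answer benchmark item and
# a retrieval score over it. Nothing here reflects EDiTh's real format.

from dataclasses import dataclass

@dataclass
class BenchmarkItem:
    question: str
    gold_doc_id: str       # document that contains the answer
    reference_answer: str  # expected answer, used for grading

# Tiny synthetic corpus mimicking corporate document formats.
CORPUS = {
    "contract-04": "Liability under this contract is limited to 24 months.",
    "report-q3": "The quarterly report flags supply-chain and FX risks.",
    "policy-hr": "Remote work requires written manager approval.",
}

ITEMS = [
    BenchmarkItem("What does the contract say about liability periods?",
                  "contract-04", "limited to 24 months"),
    BenchmarkItem("What risks are mentioned in the quarterly report?",
                  "report-q3", "supply-chain and FX risks"),
]

def retrieve(question: str) -> str:
    """Naive keyword retriever: return the document sharing the most
    words with the question."""
    q_words = set(question.lower().split())
    return max(
        CORPUS,
        key=lambda doc_id: len(q_words & set(CORPUS[doc_id].lower().split())),
    )

def recall_at_1(items: list[BenchmarkItem]) -> float:
    """Fraction of questions for which the top document is the gold one."""
    hits = sum(retrieve(item.question) == item.gold_doc_id for item in items)
    return hits / len(items)

print(f"recall@1: {recall_at_1(ITEMS):.2f}")
```

Because each item carries a gold document and a reference answer, the score is fully automatic: no human judgment is needed to decide whether the system found the right material, which is exactly what makes such a test set an objective yardstick.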
Benefits of Open Source Standards in AI Evaluation
Openness – A Deliberate Choice
EDiTh is distributed as an open-source tool. This is a fundamental choice: a closed benchmark, available only within one company, does not create a common standard. Openness allows any team – developers, researchers, or corporate users – to test their system on it and compare their results with others.
This is important for the industry. When all market players have a common test, the conversation about product quality becomes substantive. Instead of just saying "our system is better," you can show concrete numbers on the exact same set of tasks.
Target Audience and Use Cases for EDiTh
Who Needs This and Why?
In short – anyone who makes decisions about implementing AI in a corporate environment.
For technical teams, it's an evaluation tool: they can check how a particular search model performs on tasks similar to real work scenarios. For business leaders, it's a way to ask a vendor a specific question: “Show me the results on EDiTh,” instead of just taking marketing promises at face value.
It's also particularly useful for managers who want to understand a system's capabilities but are not willing to hand over internal documents to third-party companies for testing. EDiTh removes this barrier: the test is public, and no private data is needed.
Impact of Standardized Benchmarks on the AI Search Market
What Does This Change in a Broader Sense?
Corporate AI search is one of those areas where the gap between promises and reality is still quite wide. There are many products, quality is hard to verify, and everyone has their own evaluation criteria.
The emergence of an open industry benchmark is a step toward making this market more transparent. It's not a revolution, but it is a tangible shift: when everyone uses the same ruler, measurement becomes simpler.
Of course, synthetic documents are not the same as real corporate archives. A system that performs well on EDiTh might behave differently within a specific company with its unique terminology and formats. A benchmark is a guide, not a guarantee.
But having a guide is better than having none at all. And that is precisely what LightOn is now offering the market.