There's a class of tasks that has long been considered particularly challenging for AI: not writing text or solving equations, but simply operating a computer – opening applications, switching between them, and executing multi-step instructions just as a human employee would. It's in this area that Hcompany has introduced its new model, Holo3.
On April 1, 2026, Holo3 achieved a score of 78.85% on the OSWorld-Verified benchmark – the highest score among all systems tested in this standard computer operation evaluation. Simply put, this test is considered the industry's primary measure of how well an AI agent can operate in a real desktop environment.
Powerful, Yet Affordable
One of the less obvious aspects of this story is the model's economics. In its primary version (122B-A10B), Holo3 uses only 10 billion active parameters out of a total of 122 billion. This architectural design allows the model to run significantly more cheaply than large proprietary systems, such as GPT 5.4 or Opus 4.6, to which Hcompany compares it.
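To put those figures in proportion, the short calculation below works out what share of the parameters is active in each version. It is only an illustration. The 122B/10B numbers come from the text above, while reading the name Holo3-35B-A3B as "35 billion total, 3 billion active" is an assumption based on the same naming pattern, not a figure confirmed here.

```python
def active_fraction(active_b: float, total_b: float) -> float:
    """Share of parameters that are active, as a fraction of the total."""
    return active_b / total_b

# Parameter counts in billions as (active, total).
# The 35B/3B pair is an ASSUMPTION inferred from the model name.
models = {
    "Holo3-122B-A10B": (10, 122),
    "Holo3-35B-A3B": (3, 35),
}

for name, (active, total) in models.items():
    share = active_fraction(active, total)
    print(f"{name}: {share:.1%} of parameters active")
```

Under these numbers, fewer than one in ten parameters is active in either version, which is the source of the cost advantage described above.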
There is also a lightweight version, Holo3-35B-A3B, whose weights are openly published under the Apache 2.0 license. This means developers can use the model freely, including in commercial projects. Both versions are available via the company's Inference API, with the smaller one also offered on a free plan.
How the Model “Learns” to Use Interfaces
Behind these results is a specialized training approach that Hcompany calls the agentic flywheel. The idea is not just to train the model on static examples, but to build a continuous feedback loop that hones two key skills: the ability to perceive what's happening on the screen and the ability to make decisions about the next step.
In practice, it works like this: the model is first trained on examples of interface navigation, both human-created and synthetic. These scenarios are then programmatically expanded so that the model doesn't get lost in unexpected situations. The final stage combines careful data filtering with reinforcement learning, which helps get the most out of every training example.
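The data side of that loop (collect, expand, filter) can be pictured with a toy sketch. This is not Hcompany's pipeline. The `Trajectory` type, the `dismiss_popup` perturbation, and the filtering rule are all hypothetical stand-ins chosen for illustration, and the reinforcement learning stage is left out entirely.

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Trajectory:
    task: str
    actions: tuple   # e.g. ("click:Search", "type:laptop")
    success: bool    # did the episode reach its goal?
    source: str      # "human" or "synthetic"

def expand(traj, rng, n_variants=2):
    """Hypothetical expansion: insert an unexpected pop-up the agent must dismiss.
    A real system would re-run the variant in an environment to re-check success;
    this toy simply carries the original success flag over."""
    variants = []
    for _ in range(n_variants):
        pos = rng.randint(0, len(traj.actions))  # inclusive on both ends
        actions = traj.actions[:pos] + ("dismiss_popup",) + traj.actions[pos:]
        variants.append(replace(traj, actions=actions, source="synthetic"))
    return variants

def filter_trajectories(trajs, max_steps=10):
    """Keep successful, reasonably short trajectories and drop exact duplicates."""
    seen, kept = set(), []
    for t in trajs:
        key = (t.task, t.actions)
        if t.success and len(t.actions) <= max_steps and key not in seen:
            seen.add(key)
            kept.append(t)
    return kept

def build_training_set(seed_trajs, seed=0):
    rng = random.Random(seed)
    pool = list(seed_trajs)
    for t in seed_trajs:
        pool.extend(expand(t, rng))
    return filter_trajectories(pool)

seeds = [
    Trajectory("search", ("click:Search", "type:laptop"), True, "human"),
    Trajectory("checkout", ("click:Cart",), False, "synthetic"),
]
training_set = build_training_set(seeds)
print(len(training_set), "trajectories kept")
```

The point of the sketch is the shape of the loop: a small seed set grows through perturbation, and filtering decides what is worth training on.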
The goal of this approach is not just to make the model adept at specific applications, but to develop a generalized understanding of how digital interfaces work as a whole. This is crucial because the corporate software landscape is extremely diverse, and an agent trained only on familiar tools would be useless when encountering something new.
A “Factory” for Synthetic Environments
To transfer these skills to real-world work scenarios, Hcompany developed its own infrastructure, the Synthetic Environment Factory. This system automatically recreates corporate software environments: websites, tools, and workflows. Each environment is built from scratch by other AI agents that program it according to given specifications.
The result is a virtually unlimited set of training situations, from simple tasks in a single application to complex, multi-step scenarios where the agent needs to work with several systems at once.
To assess the model's real-world corporate readiness, the company also created its own set of tests: the H Corporate Benchmarks. The suite includes 486 tasks divided into four categories: e-commerce, business applications, collaboration tools, and scenarios requiring the simultaneous use of multiple programs.
One example of a high-difficulty task is for the agent to extract equipment prices from a PDF file, match them against each employee's remaining budget, and automatically send out personalized approval or rejection emails. This isn't an abstract logic test – it's literally what a real employee might do at a real company.
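The decision logic at the heart of that task is simple once the data is in hand. The sketch below assumes the prices have already been extracted from the PDF into a dictionary and stops at drafting the messages rather than sending them. The `Request` type, the sample names, and the amounts are made up for illustration and do not come from the benchmark.

```python
from dataclasses import dataclass

@dataclass
class Request:
    employee: str
    email: str
    item: str
    remaining_budget: float

def draft_emails(requests, prices):
    """Compare each requested item's price with the employee's remaining
    budget and draft a personalized approval or rejection message."""
    drafts = []
    for req in requests:
        price = prices[req.item]
        approved = price <= req.remaining_budget
        if approved:
            body = (f"Hi {req.employee}, your request for {req.item} "
                    f"({price:.2f}) is approved.")
        else:
            shortfall = price - req.remaining_budget
            body = (f"Hi {req.employee}, your request for {req.item} "
                    f"({price:.2f}) exceeds your remaining budget "
                    f"by {shortfall:.2f}.")
        drafts.append({"to": req.email, "approved": approved, "body": body})
    return drafts

# Hypothetical sample data standing in for the PDF and the budget sheet.
prices = {"laptop": 1200.0, "monitor": 300.0}
requests = [
    Request("Ana", "ana@example.com", "laptop", 1000.0),
    Request("Ben", "ben@example.com", "monitor", 450.0),
]

for draft in draft_emails(requests, prices):
    print(draft["to"], "->", draft["body"])
```

What makes the task hard for an agent is not this comparison but everything around it: opening the PDF, finding the budget data, and operating the mail client through their interfaces.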
What's Next
Hcompany states plainly that Holo3 is a milestone, not the final destination. The next stage on their roadmap is called Adaptive Agency. The idea is that future models will not only be able to work with software they already know but also figure out new, unfamiliar corporate systems on their own – in real-time, without prior training on them.
While Holo3 handles interfaces it has seen before, the next generation is meant to adapt to ones it has never seen. This is a fundamentally different level of autonomy, and it seems to be precisely what Hcompany is aiming for with its concept of the Autonomous Enterprise.
For now, Holo3 is the first publicly verified demonstration that an agentic AI, trained on synthetic corporate environments, can outperform much larger models on tasks that closely resemble real office work. How well this result will translate into the daily practices of specific companies is a question that only real-world application will answer.