Most powerful AI models today operate remotely on servers in data centers that we access via the internet. This works well as long as there is a stable connection and latency isn't a dealbreaker. But what if you need the model right there on the device: in a robot, a drone, a security camera, or augmented reality glasses? That is where a completely different story begins.
Reka has released a new version of its Reka Edge model. Based on the description, it is an attempt to answer that exact question: how to make a serious AI compact and fast enough to function where the cloud is unavailable or its use is simply inefficient.
Why Do We Need "Edge" AI Anyway?
In the tech world, the term "edge" refers to edge computing – processing data directly on the end device or close to it, rather than in the cloud. Simply put: the calculations happen right where the result is needed, without sending data to a remote server.
This is important for several reasons. First, speed: sending a request to the cloud and waiting for a response takes time. For an autonomous car or a robot that needs to react in a split second, such a delay can be critical. Second, privacy: if a device processes images or video locally, the data never leaves the device. Third, cost: cloud computing expenses become significant at scale.
Reka Edge was built specifically for these scenarios, with a focus on "physical AI" tasks: robots, drones, smart cameras, and similar systems that must understand the surrounding world in real time.
What This Model Can Do
In essence, Reka Edge is a multimodal model that can "see": it analyzes images and videos, recognizing and localizing objects within the frame. It can also interact with tools and APIs, meaning it doesn't just describe what it sees but can also take actions based on that information.
According to Reka, the model performs strongly in the following areas:
- Video and Image Understanding. In industry benchmarks, Reka Edge outperformed models of comparable size in tasks requiring the analysis of frame sequences and multi-image scenes.
- Object Detection. The model locates objects and ties them to coordinates in the image, which is especially important for robotics and autopilot systems.
- Tool Use. Reka Edge shows high accuracy in tests for autonomous interaction with interfaces – an essential quality for agentic systems.
- Hallucination Resistance. According to reliability tests, the model is less prone to factual errors or inventing non-existent details compared to its peers.
According to the company, Reka Edge's results came close to those of Gemini 3 Pro, a much larger model.
Small but Nimble 🚀
The model has 7 billion parameters. It's no giant: for comparison, flagship models can run to hundreds of billions of parameters. However, compactness here is a deliberate choice rather than a limitation.
One of the key engineering ideas is that the model processes images significantly more efficiently than its counterparts. For a standard 1024×1024 pixel image, it needs roughly a third as many tokens (the units a model processes internally) as other models of a similar size. Fewer tokens mean less computation per image, which directly affects both speed and operating costs.
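To make the token arithmetic concrete, here is a rough sketch. It assumes a generic patch-based vision encoder in which the image is cut into square patches and each patch becomes one token. The 16-pixel patch size and the flat threefold reduction are illustrative assumptions, not Reka's published design.

```python
import math

def patch_tokens(width: int, height: int, patch: int = 16) -> int:
    """Token count for a plain patch-based encoder: one token per patch.
    The patch size is a hypothetical value chosen for illustration."""
    return math.ceil(width / patch) * math.ceil(height / patch)

def reduced_tokens(baseline: int, factor: float = 3.0) -> int:
    """Token count after an assumed 'factor times fewer' reduction."""
    return math.ceil(baseline / factor)

baseline = patch_tokens(1024, 1024)  # 64 x 64 patches
compact = reduced_tokens(baseline)
print(f"baseline: {baseline} tokens, compact: {compact} tokens")
```

Under these assumptions, every image adds a few thousand fewer tokens to the sequence the language model has to process, which is where the speed and cost savings would come from.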
By the numbers: the model processes over 5 images per second in streaming mode – more than twice as fast as its closest competitors in its class. The time to first response is about half a second. For interactive applications where a user expects an instant reaction, these figures matter.
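A quick way to read the throughput figure is to turn it into a per-image time budget. The sketch below uses the stated lower bound of 5 images per second; the 30 fps camera is a hypothetical example, not something the company specifies.

```python
import math

def frame_budget_ms(images_per_second: float) -> float:
    """Average time available per image at a given throughput."""
    return 1000.0 / images_per_second

def frame_stride(camera_fps: float, model_ips: float) -> int:
    """Analyze every N-th camera frame so the model keeps up with the stream."""
    return max(1, math.ceil(camera_fps / model_ips))

budget = frame_budget_ms(5)    # stated lower bound: 5 images per second
stride = frame_stride(30, 5)   # hypothetical 30 fps camera
print(f"{budget:.0f} ms per image, analyze every {stride}th frame")
```

In other words, at this rate the model would not look at every frame of a typical video feed, but it could sample the scene several times per second.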
Which Devices Can Run It?
Reka Edge is designed for a wide range of hardware: from servers and cloud infrastructure to specific hardware platforms – NVIDIA Jetson (a popular solution for robotics), Apple computers with Apple Silicon chips, standard Windows and Linux PCs, as well as smartphones and wearables based on Qualcomm Snapdragon.
It is also worth mentioning how it handles limited memory. In its standard form, the model takes up about 13 GB. Compression (quantization) reduces this footprint to 5 GB – a cut of roughly 60% – while, according to the company, retaining over 98% of the original quality. This makes it possible to run the model on devices where memory is physically limited.
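The memory figures can be sanity-checked with simple arithmetic. The sketch below assumes the "standard form" stores weights at 16 bits per parameter and uses 4 bits as an example of a quantized width; the article does not say which quantization scheme Reka actually uses.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

def weights_gib(params: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, ignoring any runtime overhead."""
    return params * bits_per_param / 8 / GIB

def reduction(before: float, after: float) -> float:
    """Fraction by which the footprint shrinks."""
    return 1 - after / before

params = 7e9  # 7 billion parameters, as stated in the article
print(f"16-bit weights: {weights_gib(params, 16):.1f} GiB")  # assumed baseline
print(f" 4-bit weights: {weights_gib(params, 4):.1f} GiB")   # assumed example
print(f"13 GB -> 5 GB: {reduction(13, 5):.1%} smaller")
```

The 16-bit estimate lands close to the quoted 13 GB. The quoted 5 GB is larger than pure 4-bit weights would be; that gap could come from parts of the model kept at higher precision or from format overhead, but the source does not say.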
Where Could This Be Useful?
Reka highlights several application areas. First is physical AI: robots, drones, and automotive systems that need to understand their environment instantly. Second is media analytics: automatic video captioning, logo detection, and archive tagging. Third is augmented reality: smartphone and smart-glasses apps that require rapid environmental analysis. Fourth is automation based on visual input: agentic systems that read information from a screen or camera and perform actions in response.
To put it simply: the model is useful anywhere that requires a "see-understand-act" loop where the cloud is unavailable, too slow, or too expensive.
Reading Between the Lines
Reka acknowledges that comparisons with Gemini 3 Pro were conducted via API, while other models were run locally on a single cluster – meaning testing conditions were not identical. This is an important note to keep in mind when interpreting the figures.
Furthermore, the stated application scenarios (robots, drones, smart glasses) remain relatively niche fields. Only real-world use will show how much demand there is for Reka Edge. The company has already opened access to the model through its own service, API, and an open repository, so it can be tested on real tasks right now.
In a broader sense, what is happening reflects a general trend: the industry is moving toward efficiency, not just scaling up. The question of "how to make a model more powerful" is being joined by "how to make it fit for real-world conditions", and Reka Edge is one of the most compelling answers to date.