When code changes quickly – and in startups, it changes very quickly – the security team finds itself in a tough spot. New features are released every week, there are many developers, and manually keeping track of every change is practically impossible. The Cursor security team faced this exact problem and solved it in an unconventional way: instead of hiring more people or slowing down development, they launched autonomous AI agents that find and fix vulnerabilities in the code on their own.
Why Bother with Agents When Conventional Tools Exist?
Traditional security analysis tools operate based on predefined rules. Simply put, they are good at finding what is already known: standard vulnerability patterns, typical configuration errors. But real-world code is not a textbook. The logic can be convoluted, context is crucial, and the same construct can be dangerous in one place and completely harmless in another.
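To make the limitation concrete, here is a minimal sketch of a rule-based check. The rule, the snippet names, and the snippets themselves are hypothetical illustrations, not taken from any real scanner or from Cursor's tooling. The rule flags every call that passes shell=True, so it reports a fixed, harmless command and a command built from user input in exactly the same way.

```python
import re

# Hypothetical rule: flag any call that passes shell=True.
RULE = re.compile(r"shell\s*=\s*True")

# Two illustrative code snippets, held as plain strings (never executed).
SNIPPETS = {
    # Fixed command with no outside input: harmless in practice.
    "constant_command": 'subprocess.run("ls -l /var/log", shell=True)',
    # Command assembled from request data: a real injection risk.
    "user_controlled": 'subprocess.run("ping " + request.args["host"], shell=True)',
}

def scan(snippets):
    """Return, per snippet name, whether the rule matched."""
    return {name: bool(RULE.search(code)) for name, code in snippets.items()}

findings = scan(SNIPPETS)
for name, flagged in findings.items():
    print(name, "flagged" if flagged else "clean")
```

Both snippets are flagged, because the pattern sees only the text of the call and not where its arguments come from. Telling the two cases apart requires the kind of context a reviewer would bring.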
An AI agent, in this sense, is more like a human reviewer: it reads the code, understands the context, can "walk through" a call chain, and can assess whether a given situation truly poses a threat. The difference is that an agent doesn't get tired, isn't distracted, and can work in parallel – that is, it can simultaneously check many different parts of the codebase.
A Fleet of Agents – Not a Metaphor
The Cursor team has built exactly what can be called a fleet: not a single agent, but an entire system of specialized agents, each responsible for its own domain. Some look for vulnerabilities in authorization logic, others check how user data is handled, and still others monitor how the code interacts with external services.
This approach allows for broader and deeper coverage of the codebase than a single, all-purpose tool could provide. The agents don't run just once; they work continuously. As changes are made to the codebase, they are triggered again to check if anything suspicious has appeared.
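One way to picture such a fleet is as a registry of specialized agents, each re-triggered only when a change touches its area. The sketch below assumes that routing is done by file path. The agent names, glob patterns, and file paths are all hypothetical, and the article does not describe how Cursor actually routes changes to agents.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

@dataclass(frozen=True)
class Agent:
    name: str
    patterns: tuple  # glob patterns for the paths this agent watches (assumed)

# Hypothetical fleet mirroring the three domains mentioned in the text.
FLEET = [
    Agent("authorization", ("*/auth/*", "*/permissions/*")),
    Agent("user_data", ("*/models/*", "*/handlers/*")),
    Agent("external_services", ("*/integrations/*", "*/clients/*")),
]

def agents_for_change(changed_paths, fleet=FLEET):
    """Map each affected agent name to the changed files it should re-check."""
    work = {}
    for agent in fleet:
        hits = [
            path for path in changed_paths
            if any(fnmatchcase(path, pattern) for pattern in agent.patterns)
        ]
        if hits:
            work[agent.name] = hits
    return work

# Example change set: two relevant files and one that no agent watches.
changed = ["src/auth/session.py", "src/integrations/payments.py", "docs/README.md"]
plan = agents_for_change(changed)
print(plan)
```

In this example only the authorization and external-services agents are scheduled; the documentation change triggers nothing. A real system would likely route on richer signals than paths, but the idea of continuous, change-driven re-checking is the same.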
Finding Is Half the Battle. What Comes Next Is More Interesting
Many security tools stop right there: they find a problem and report it. From there, it's up to the developer to figure it out. Cursor went a step further: their agents don't just flag a problem; they also suggest a fix – and in some cases, apply it automatically.
This fundamentally changes the team's workload. Instead of a long list of "here's what's broken", developers receive concrete pull requests with ready-made fixes. The human's task is to review, approve, or adjust the changes. This doesn't mean humans are cut out of the loop – quite the opposite. Because the routine work is automated, security specialists can focus on genuinely complex and non-trivial cases.
Why It Works Specifically for Them
Cursor is a code editor with deep AI integration. Their own product is built around AI agents that understand and work with code effectively. Using these same capabilities for internal security is, if you will, a case of "eating your own dog food." The team has a deep understanding of how the agents work internally, what their strengths are, and where they might go wrong.
This is important because agents are not a magic wand. They can generate false positives, miss unconventional cases, or suggest fixes that are technically correct but break the application's logic. That's why human oversight is maintained in the Cursor system: the agent proposes, the human decides.
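The "agent proposes, human decides" rule can be sketched as a small review gate. This is an illustration under stated assumptions, not Cursor's implementation: the FixProposal type, the review and can_merge functions, and the sample finding are all hypothetical. The point it shows is that an agent can create a proposal, but only a recorded human decision can make it mergeable.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    REJECTED = "rejected"

@dataclass
class FixProposal:
    finding: str                      # what the agent believes is wrong
    patch: str                        # the fix the agent suggests
    status: Status = Status.PROPOSED  # every proposal starts unreviewed
    reviewer: Optional[str] = None    # set only by a human decision

def review(proposal, reviewer, approve):
    """Record a human decision; a proposal can be reviewed only once."""
    if proposal.status is not Status.PROPOSED:
        raise ValueError("proposal has already been reviewed")
    proposal.status = Status.APPROVED if approve else Status.REJECTED
    proposal.reviewer = reviewer
    return proposal

def can_merge(proposal):
    """A fix is mergeable only after a named reviewer has approved it."""
    return proposal.status is Status.APPROVED and proposal.reviewer is not None

proposal = FixProposal(
    finding="missing ownership check in a hypothetical /documents endpoint",
    patch="add a check that the requester owns the document",
)
print(can_merge(proposal))   # the agent alone cannot merge
review(proposal, reviewer="alice", approve=True)
print(can_merge(proposal))   # mergeable once a human has approved
```

Keeping the decision as an explicit state transition also leaves a record of who approved what, which helps when a technically correct fix later turns out to break application logic.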
What This Means for the Industry as a Whole
What the Cursor team has done is not some unique breakthrough available only to them. It is, rather, a demonstration of an approach that is becoming increasingly realistic for many companies. Previously, security automation meant writing rules and scripts. Now, it's about deploying agents that can understand code and act autonomously.
For small teams, this is especially relevant: hiring an entire security department is expensive, and vulnerabilities aren't going anywhere. In this scenario, AI agents act not as a replacement for specialists, but as a force multiplier – allowing a small team to do the work that once required a large one.
The question of trust remains open. How much can you rely on an agent that makes changes to production code on its own? Cursor addresses this through strict control and reviews, but as agents become smarter, this balance will shift. Where exactly to draw the line between "agent proposes" and "agent decides" is one of the key questions the industry is just beginning to tackle.