A paradox exists in the world of AI development: the more automation tools are created, the more routine work remains for the developers themselves. Reports, data analysis, repetitive tasks – none of this disappears, even when you're part of one of the industry's most advanced teams.
This is the exact situation an engineer from the GitHub Copilot Applied Science team faced. His job is to improve GitHub Copilot, an AI-powered code-writing tool. Part of this work turned out to be quite mechanical. So he decided, "If I work with agents, why not assign an agent to do what's taking up my time?"
What an "Agent" Means in This Context
Before we continue, it's worth clarifying a term. In the context of AI, an agent is not just a chatbot that answers questions. It's a system that knows how to act: run code, access files, execute a sequence of steps, check the results, and correct them if necessary. Simply put, an agent is an AI you can give a task to, not just a question.
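The act, check, and correct cycle described above can be sketched in a few lines of Python. This is only an illustration and not how GitHub Copilot is implemented: `run_agent`, `propose`, and `check` are hypothetical names, and the "model" is a stub that hands back pre-written attempts instead of calling a real AI service.

```python
from typing import Any, Callable, Optional, Tuple


def run_agent(
    propose: Callable[[Optional[str]], Any],
    check: Callable[[Any], Tuple[bool, Optional[str]]],
    max_steps: int = 5,
) -> Tuple[Any, int]:
    """Propose a result, check it, and retry with feedback until it passes."""
    feedback: Optional[str] = None
    for step in range(1, max_steps + 1):
        candidate = propose(feedback)    # act: produce an attempt
        ok, feedback = check(candidate)  # check: run it against a known case
        if ok:
            return candidate, step
    raise RuntimeError(f"no acceptable result after {max_steps} steps")


# Stub "model" (an assumption for this sketch): the first attempt is wrong,
# the second one is right.
attempts = iter([lambda xs: max(xs), lambda xs: sum(xs)])


def propose(feedback: Optional[str]) -> Callable:
    return next(attempts)


def check(fn: Callable) -> Tuple[bool, Optional[str]]:
    got = fn([1, 2, 3])
    if got == 6:
        return True, None
    return False, f"expected 6, got {got}"


result, steps = run_agent(propose, check)
print(steps)  # number of attempts the loop needed
```

A chatbot would stop after the first `propose` call. The loop around it, which runs the attempt and feeds the failure back, is what makes the system an agent in the sense used here.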
An approach called agent-driven development is gaining popularity in software development today. The idea is that instead of manually writing every line of code, a developer describes the task, and the agent takes on a significant portion of its implementation.
The Agent That Created an Agent
In the case described, the author used AI-assisted coding tools – specifically, GitHub Copilot's agent mode capabilities – to write his own scripts and small automated systems. These systems, in turn, took over some of his everyday work tasks.
It sounds a bit recursive – and it is. An agent helps write another agent that automates the work of the person creating agents. But the experience proved particularly instructive, because the author observed the process from the inside and could clearly see where the AI performed well and where it didn't.
What Practical Work with an AI Agent Teaches You
One of the main takeaways is that how well the task is defined matters more than how the prompt is worded. Many people think the key to working with AI is finding the right phrasing. That matters, but breaking the task down properly matters more. If a task is too vague or too large, the agent starts to "fantasize" (invent details that were never specified) or produces something that is close to what's needed but not quite right.
When a task is well-structured – with clear inputs, an expected outcome, and understandable constraints – the agent performs significantly better. This, by the way, also holds true for human employees. It just becomes particularly obvious with AI because it won't ask clarifying questions the way a person would.
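One way to make "well-structured" concrete is to write the task down as a small record and check it for gaps before handing it to an agent. The sketch below is an assumption for illustration: the `TaskSpec` class, its field names, and the file name in the example are hypothetical and do not come from the author's setup.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskSpec:
    """A task description with the three parts named above."""
    goal: str
    inputs: List[str] = field(default_factory=list)
    expected_outcome: str = ""
    constraints: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Return the names of the parts that were left empty."""
        missing = []
        if not self.inputs:
            missing.append("inputs")
        if not self.expected_outcome.strip():
            missing.append("expected_outcome")
        if not self.constraints:
            missing.append("constraints")
        return missing


# A vague task: only a goal, nothing the agent can verify against.
vague = TaskSpec(goal="Clean up the reports")

# The same task with inputs, an expected outcome, and constraints.
clear = TaskSpec(
    goal="Summarize weekly evaluation results",
    inputs=["results.csv (hypothetical file with one row per test run)"],
    expected_outcome="A table with the pass rate per test suite",
    constraints=["Do not modify the input file", "Use only the standard library"],
)

print(vague.missing_parts())
print(clear.missing_parts())
```

The check does nothing clever. It simply asks the clarifying questions up front that a human colleague would have asked and an agent will not.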
Iteration: Not a Flaw, but a Feature
The second important lesson has to do with expectations. Working with an AI agent is not a case of "write a prompt, get a finished product." It's an iterative process: you try something, review the result, refine the task, and try again.
In a sense, it's like working with a junior developer who is very fast but needs clear instructions and supervision. The author notes that he started treating his interaction with the agent as a collaboration rather than just issuing commands. This changed both his approach and the result.
When the Agent Succeeds and When It Fails
There are tasks that AI agents handle reliably: writing boilerplate code, processing data according to predefined rules, generating documentation, and performing sequences of similar operations. The time saved here is real and tangible.
But there are areas where the agent is still unreliable: tasks with vague success criteria, situations that require a non-obvious decision based on unwritten context, and cases where an error in the middle of a sequence of actions quietly "propagates", so that the result looks plausible but turns out to be incorrect.
This is precisely why oversight of the result remains with the human. Automation doesn't eliminate review – it simply shifts the focus from "doing" to "reviewing and guiding."
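The quiet propagation of a mid-sequence error is easy to reproduce, and so is one common guard against it: validate the output of every step, not just the final result. The following is a minimal sketch under invented assumptions. The pipeline, the deliberately buggy `parse` step, and the sample data are hypothetical and only illustrate the failure mode.

```python
from typing import Any, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Any], Callable[[Any], bool]]


def run_pipeline(value: Any, steps: List[Step]) -> Any:
    """Run each step and stop as soon as one produces an invalid result."""
    for name, transform, is_valid in steps:
        value = transform(value)
        if not is_valid(value):
            raise ValueError(f"step {name!r} produced an invalid result: {value!r}")
    return value


def parse(rows: List[str]) -> List[float]:
    # Buggy on purpose: unparseable entries silently become 0.0.
    out = []
    for row in rows:
        try:
            out.append(float(row))
        except ValueError:
            out.append(0.0)
    return out


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


steps: List[Step] = [
    ("parse", parse, lambda vs: all(v > 0 for v in vs)),  # durations must be positive
    ("mean", mean, lambda m: m > 0),
]

rows = ["12", "15", "n/a"]

# Without checks the bad row is swallowed and the answer still looks plausible.
unchecked = mean(parse(rows))

# With per-step checks the problem surfaces at the step that caused it.
try:
    run_pipeline(rows, steps)
    caught = None
except ValueError as err:
    caught = str(err)

print(unchecked)
print(caught)
```

The unchecked average comes out as a perfectly ordinary number, which is exactly what makes this kind of error hard to spot in review. The per-step check turns it into a failure that names the step responsible.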
What This Means for Those Who Work with Code
The experience described in this case is interesting not as a success story of a specific engineer, but as an illustration of a broader shift in how software development is approached.
Tools like GitHub Copilot have been helping write code line by line for quite some time. But now, the focus is shifting toward more autonomous work: an agent can take on a task, write the code, run tests, fix errors, and return a finished result. The human, in this process, acts more as a task manager and reviewer than an executor.
This changes not only the tools but also the key skills required. The ability to clearly formulate tasks, decompose complex problems, ask the right questions, and critically evaluate the outcome is becoming more important than just knowing the syntax of a specific programming language.
Open Questions
Of course, not everything is smooth sailing, and the author admits this. It remains unclear how well this approach scales to larger, more complex projects. An agent that excels at a small, self-contained task may struggle with a large codebase that has a long history and intricate dependencies.
There is also the question of the quality of the code the agent generates: it may function correctly but be hard to read or difficult to maintain. This might not be an immediate problem, but it could become one a year later when someone else has to understand that code.
Finally, there's the broader question of how the developer's role is changing. If the agent takes on more and more routine coding, what's left for the human? Judging by this experience, what remains is the most difficult part: understanding context, making decisions in uncertain conditions, and being responsible for the final result.
This, perhaps, is the main conclusion: AI agents don't replace thinking. They free us from routine work so that we can do more thinking.