Developers who work in the terminal are used to trusting the command line. It doesn't ask for clarification, explain itself, or offer solutions – it simply executes commands. However, when an AI assistant works alongside the command line, the dynamic changes slightly: the user now has a conversational partner who can suggest a command, explain an error, or offer a solution. The only question is how far this partner can be trusted from the start.
GitHub Copilot CLI has received an update that makes it a bit more cautious – in a good way. The new feature is called Rubber Duck, and its concept is simple: before providing the final answer, the tool seeks a «second opinion» from another language model.
Where Did the Rubber Duck Come From?
Among developers, there's a long-standing practice known as «rubber duck debugging.» The idea is to explain a problem out loud to anyone or anything, even a toy duck on your desk. While explaining, people often find the error themselves because they start to think about the problem differently, in a more structured way.
The Rubber Duck feature in Copilot CLI follows similar logic, but at the level of the language models themselves. When a user asks a question, the first model formulates an answer. Then, this answer is «shown» to a second model – from a different family, with a different approach to reasoning. The second model checks whether everything is correct and, if necessary, corrects or supplements the result.
Simply put: one AI thinks, the second one double-checks. The final answer is formed by taking both points of view into account.
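The draft-then-review flow described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not how Copilot CLI is implemented: the names `answer_with_second_opinion`, `toy_drafter`, and `toy_reviewer` are hypothetical, and the two «models» are plain functions standing in for real language models.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-ins: each "model" is just a function in this sketch.
Drafter = Callable[[str], str]
Reviewer = Callable[[str, str], Optional[str]]  # returns a correction, or None to accept


@dataclass
class Answer:
    text: str
    revised: bool  # True if the reviewer changed the draft


def answer_with_second_opinion(question: str, draft: Drafter, review: Reviewer) -> Answer:
    """Draft with one model, then let a second model accept or correct the draft."""
    first = draft(question)
    correction = review(question, first)
    if correction is None or correction == first:
        return Answer(text=first, revised=False)
    return Answer(text=correction, revised=True)


# Toy models, for illustration only.
def toy_drafter(question: str) -> str:
    return "rm -rf build" if "clean" in question else "ls -la"


def toy_reviewer(question: str, draft: str) -> Optional[str]:
    # Pretend the reviewer prefers the interactive form of a destructive command.
    if draft.startswith("rm -rf "):
        return draft.replace("rm -rf ", "rm -ri ", 1)
    return None


if __name__ == "__main__":
    print(answer_with_second_opinion("clean the build directory", toy_drafter, toy_reviewer))
    print(answer_with_second_opinion("list files", toy_drafter, toy_reviewer))
```

The user only ever sees the final `Answer`; whether the reviewer stepped in is recorded in `revised`, but nothing in the flow requires the user to act on it.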
Why Is a Second Model Needed at All?
Language models, for all their usefulness, are not immune to errors. The same model can give a precise answer to a complex question and yet make a mistake on something seemingly obvious. This isn't a flaw of a specific system – it's a general characteristic of the architecture of large language models.
Different model families are trained differently, on different data, with different priorities. This means that where one model «misses the mark», another may well get it right. By combining them, the probability of error can be reduced – not because any single model has become smarter, but because two different perspectives compensate for each other's weaknesses.
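A rough back-of-the-envelope calculation shows why a second check can help. The sketch below assumes two simplifications that the article does not state: the reviewer's misses are independent of the first model's errors, and the reviewer never spoils a correct draft. The function name and the numbers are illustrative, not measurements of any real model.

```python
def residual_error(p_draft_wrong: float, p_reviewer_misses: float) -> float:
    """Chance that a wrong draft survives review.

    Assumes the two events are independent and that the reviewer never
    breaks a correct draft. Both are simplifications.
    """
    for p in (p_draft_wrong, p_reviewer_misses):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_draft_wrong * p_reviewer_misses


if __name__ == "__main__":
    # Illustrative numbers only.
    for p_draft, p_miss in [(0.10, 0.50), (0.10, 0.20), (0.05, 0.20)]:
        result = residual_error(p_draft, p_miss)
        print(f"draft wrong {p_draft:.0%}, reviewer misses {p_miss:.0%} "
              f"-> residual error {result:.1%}")
```

Under these assumptions, neither model has to improve for the combined error rate to fall; it is enough that the reviewer catches some of the mistakes the first model makes.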
In the context of working with the terminal, this is especially relevant. A wrong command can do more than just produce an unexpected result – it can delete files, disrupt system configuration, or run something unforeseen. Therefore, an extra check here has very tangible practical value.
What This Looks Like in Practice
For the user, everything remains familiar: they ask a question in the terminal and get an answer. Internally, an additional verification cycle takes place, but it requires no action from the user. Rubber Duck works in the background – like a «second look» that you don't have to ask for specifically.
This is an important detail: the feature doesn't turn working with the tool into a dialogue between two models that you need to watch and interpret. It simply makes the final answer better considered.
A Bit About Where This Is Headed
The idea of combining multiple models to generate an answer is not new in itself. In the research community, approaches where multiple models «vote» on an answer or check each other's work have long been discussed. However, such approaches are only gradually making their way into practical tools integrated directly into a developer's workflow.
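The «voting» approach mentioned above can be illustrated with a minimal majority vote over answers from several models. This is a generic sketch, not a description of Rubber Duck, which by the article's account uses a review step instead of a vote; `majority_vote` is a hypothetical helper, and the answers are hard-coded strings in place of real model outputs.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(answers: Sequence[str]) -> Optional[str]:
    """Return the answer given by a strict majority of models, or None if there is none."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count > len(answers) / 2 else None


if __name__ == "__main__":
    # Three hypothetical models answer the same question.
    print(majority_vote(["git stash", "git stash", "git reset --hard"]))
    # With no strict majority, the caller has to fall back to something else.
    print(majority_vote(["git stash", "git reset --hard"]))
```

Returning `None` when there is no strict majority is a design choice in this sketch: for terminal commands, declining to answer is arguably safer than picking between two disagreeing suggestions.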
Rubber Duck is one of the first examples of such a mechanism appearing not as a separate research prototype, but as part of a real product used every day. And that, perhaps, is more interesting than any technical detail: the idea of double-checking with another model is ceasing to be an academic concept and becoming a common feature in the terminal.
The question of how effectively this particular combination of models works in different scenarios remains open. Two heads are better than one – but only if they are truly independent and sufficiently different. How exactly the models for Rubber Duck are selected and how often the second point of view changes the final result – only time will tell.
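Why independence matters can be shown with a small toy model. It assumes, purely for illustration, that with some probability `overlap` the reviewer shares the first model's blind spot and misses the error for certain, and otherwise misses it only at its usual rate. The function name, the `overlap` parameter, and the numbers are all hypothetical.

```python
def residual_error_with_overlap(
    p_draft_wrong: float, p_reviewer_misses: float, overlap: float
) -> float:
    """Chance a wrong draft survives review when the models share blind spots.

    overlap = 0.0: the reviewer misses errors only at its own rate.
    overlap = 1.0: the reviewer misses every error the drafter makes.
    """
    for p in (p_draft_wrong, p_reviewer_misses, overlap):
        if not 0.0 <= p <= 1.0:
            raise ValueError("all inputs must lie in [0, 1]")
    p_miss_given_wrong = overlap + (1.0 - overlap) * p_reviewer_misses
    return p_draft_wrong * p_miss_given_wrong


if __name__ == "__main__":
    # Illustrative numbers only: the draft is wrong 10% of the time and
    # the reviewer misses 20% of errors when it does not share the blind spot.
    for overlap in (0.0, 0.5, 1.0):
        result = residual_error_with_overlap(0.10, 0.20, overlap)
        print(f"overlap {overlap:.0%} -> residual error {result:.1%}")
```

In this toy model, the benefit of the second opinion shrinks as the overlap grows and disappears entirely when the two models fail in exactly the same places.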