Most conversations about the dangers of AI revolve around science fiction scenarios: machine uprisings, loss of control, and the end of humanity. However, there's a much more down-to-earth and real threat – an AI that subtly pushes people toward decisions that benefit anyone but themselves.
This is precisely what a research team at Google DeepMind tackled: they studied how modern AI systems can be used for malicious manipulation – and what needs to be done to prevent it.
What Exactly Is Meant by Manipulation?
Simply put, manipulation occurs when someone (or something) influences your decision not through honest arguments, but by bypassing your critical thinking. Classic examples include emotional pressure, creating a false sense of urgency, and selectively withholding information.
AI opens up new possibilities here – and not in a good way. Systems based on large language models can conduct personalized conversations, adapt to the user's communication style, and build trust. It is precisely these qualities, useful in some contexts, that can turn into a tool of influence in others.
Where the Risks Are Especially High
The researchers identified several areas where the manipulative potential of AI is particularly dangerous.
Finance. Imagine an AI consultant guiding you toward a specific investment decision – not because it's the best for you, but because it's more profitable for the party that launched it. Or a system that subtly creates a sense of urgency: «Act now, or you'll miss your chance.» This isn't a hypothetical threat; similar tactics have long been used in traditional marketing, and AI can scale and personalize them.
Health. People in vulnerable states – with chronic illnesses, anxiety, or at the moment of a serious diagnosis – are particularly susceptible to influence. An AI system that communicates like a caring advisor can subtly steer such individuals toward specific products, services, or decisions, exploiting this very vulnerability.
These are two key risk areas, but the principle itself is universal: the more personalized and «human-like» AI becomes, the greater the potential for abuse.
Why It's More Complicated Than It Seems
There's a subtle point here worth understanding. The line between persuasion and manipulation is not always clear. A good doctor also influences a patient's decisions – but through honest information and in the patient's best interest. A good teacher convinces a student to try a difficult task – and that's not manipulation.
The problem arises when influence is exerted against the person's interests and without their informed consent. An AI system optimized for engagement or conversion, rather than for the user's well-being, is potentially manipulative by its very nature, even if no one intentionally designed it that way.
This is why the problem is difficult to solve with a single instruction or ban. Systemic measures are needed.
What DeepMind Proposes
Following the research, the team formulated approaches to mitigating manipulative risks – both at the model level and at the application level.
In short, this involves several lines of approach:
- Training models to recognize potentially manipulative requests and refuse to fulfill them. This isn't censorship; it's a built-in understanding of where assistance ends and exploitation begins.
- Evaluating and testing systems for resilience against manipulative scenarios before they are released as products. Simply put: checking not only if the model can answer questions, but also how it behaves in situations with a potential conflict of interest.
- Transparency in interaction – the user must understand they are communicating with an AI and have the ability to exit the conversation without feeling pressured.
This is not a final list of rules; rather, it's a framework for future work. The researchers themselves admit that the topic requires further study: technology is changing rapidly, and protective measures must keep pace.
Context to Keep in Mind
Alongside this publication, events are unfolding in the industry that make this topic even more relevant. The company Anthropic recently disclosed that its model, Claude, is already involved in developing its own future versions – with the AI itself writing 70 to 90 percent of the code. One researcher described a situation where he ran six copies of Claude, each managing another 28 copies – totaling 168 parallel instances working on self-improvement.
OpenAI, meanwhile, has released GPT-5.4 – a model that can control a user's computer: read the screen, click buttons, and fill out forms. This is the company's first major model with this capability «out of the box.»
All of this is not a reason to panic, but it is a very compelling argument that research like DeepMind's is needed right now. As AI becomes more autonomous, more personalized, and more «present» in people's lives, the question of the boundaries of its influence ceases to be purely academic.
The Bottom Line
Google DeepMind isn't sounding the alarm or painting an apocalyptic picture. They are doing what, essentially, they are supposed to do – methodically researching risks before they become a widespread problem.
Manipulation via AI is not science fiction or a distant future. The tools that can be used for it already exist. The question is how consciously developers are deploying them and how protected the people who use them are.
For now, the answer to this question is still taking shape – and that in itself is an important step.