«I wrote this article because I've long been intrigued by a question: can you teach a machine to think rationally if it learned from texts written by irrational humans? The result turned out to be predictably paradoxical – artificial intelligence absorbed our biases with frightening accuracy. Now I wonder what the readers will think: will they see a mirror of their own weaknesses in this, or will they insist that they, at least, are definitely more rational than an algorithm?» – Professor Emile Dubois
Picture the scene: you're standing at the checkout counter in a supermarket, you notice milk has gone up by twenty percent, and at that very moment, the radio announces that inflation in the country is just three percent. Who do you believe? Your wallet or the announcer's voice? If you're like most people – and as it turns out, like most language models – you'll believe your wallet. It's not just a human frailty; it's a fundamental feature of how we process information about the future.
A recent study conducted by a group of scientists examining the behavior of large language models as economic agents has discovered something surprising: artificial intelligence forms expectations about the future almost as irrationally as we do. When GPT-4 was asked to play the role of a household head or a company CEO, the model systematically deviated from what economists call “rational belief updating.” It overweighted some pieces of information, underweighted others, and behaved as if it possessed biases, emotions, and a limited capacity to process multiple data streams simultaneously.
How Language Models Learn to Form Economic Expectations
A Collective Hallucination That Learns from Its Mistakes
Money is a collective hallucination, as I like to repeat. But expectations regarding money are an even subtler hallucination. We all live inside a cloud of assumptions about what will happen to prices, wages, profits, and interest rates. And this cloud determines our decisions: whether to buy a new car, take out a loan, or invest in stocks. Central banks spend billions trying to manage this cloud; yet until now, no one knew exactly how people form their expectations.
Now, large language models enter the stage. Trained on terabytes of text – from news and scientific articles to forums and blogs – they have learned to mimic human reasoning with frightening accuracy. Researchers decided to use them as lab subjects: to create a controlled environment where information can be precisely dosed, and changes in expectations observed. This is something impossible to do with real people without huge costs and ethical compromises.
The Kalman Filter in AI Expectation Formation
The Kalman Filter: When Mathematics Meets the Human Psyche
To understand what happens inside the “head” of a language model, scientists used a tool called the “Kalman filter.” This is a mathematical construct developed back in the 1960s for spacecraft navigation tasks. Its essence is simple: you have a prior belief about something (for example, that inflation will be four percent), then you receive a new signal (news that prices rose by five percent), and you need to update your belief by combining the old and the new.
In an ideal world, a rational agent would weigh these two elements optimally: if the new signal is reliable, it gets more weight; if it is noisy and unreliable, it gets less. The sum of all weights must equal one, just like in a well-balanced portfolio. But people – and, as it turned out, language models – are not ideal. They overestimate one thing, underestimate another, and sometimes simply ignore part of the information because they are tired or distracted.
The behavioral Kalman filter is the exact same mathematical scheme, but with the addition of “human” distortions. It allows the weights to sum to something other than one, lets different signals carry different significance, and lets agents behave as they do in reality: irrationally, egocentrically, and with limited attention.
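To make the difference concrete, here is a minimal sketch of both update rules in Python. The numbers and variable names are purely illustrative, not taken from the study: a rational agent splits a fixed total weight between the old belief and the new signal, while the behavioral version lets the weights float freely, so they may add up to more or less than one.

```python
def rational_update(prior: float, signal: float, gain: float) -> float:
    """Textbook Kalman-style update: the weights (1 - gain) and gain sum to one."""
    return (1 - gain) * prior + gain * signal

def behavioral_update(prior: float, ind_signal: float, agg_signal: float,
                      w_prior: float, w_ind: float, w_agg: float) -> float:
    """Behavioral variant: the weights are free parameters and need not sum to one."""
    return w_prior * prior + w_ind * ind_signal + w_agg * agg_signal

# Illustrative numbers: a 3% prior belief, a personal price signal of 7%, an official signal of 4%.
print(rational_update(3.0, 4.0, gain=0.5))                # 3.5, a balanced compromise
print(behavioral_update(3.0, 7.0, 4.0, 0.5, 0.5, 0.2))    # 5.8, the weights sum to 1.2: overreaction
```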
The Experiment: Households vs. CEOs
The researchers conducted two sets of experiments. In the first, the language model played the role of an ordinary household trying to predict future inflation. It was given an initial belief – say, that inflation would be three percent. Then it was provided with two types of signals: individual (for example, “you noticed that food prices in your area rose by seven percent”) and aggregated (for example, “the central bank reported that national inflation is four percent”).
In the second set of experiments, the model acted as a company CEO forming expectations about future sales. It was also given an initial belief, an individual signal (internal firm sales data), and an aggregated signal (general economic forecasts or industry trends).
The researchers varied the magnitude and direction of the signals, created situations where they contradicted each other, and observed how the model updated its expectations. They then extracted the numbers and estimated the parameters of the behavioral Kalman filter using the least squares method – a classic statistical approach.
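For readers curious what that estimation step might look like, here is a hedged sketch in Python. The data values are invented for illustration and the exact specification is an assumption, not the paper's actual dataset; the point is simply that regressing the model's updated expectation on the prior, the two signals, and their interaction recovers the behavioral weights by ordinary least squares.

```python
import numpy as np

# Hypothetical data: each row is one run of the role-play experiment.
# Columns: prior belief, individual signal, aggregate signal, LLM's updated expectation.
data = np.array([
    [3.0, 7.0, 4.0, 5.6],
    [3.0, 1.0, 4.0, 2.4],
    [4.0, 6.0, 3.0, 4.9],
    [2.0, 5.0, 5.0, 4.3],
    [5.0, 2.0, 3.0, 3.8],
])
prior, ind, agg, updated = data.T

# Design matrix: prior belief, individual signal, aggregate signal, and their interaction.
X = np.column_stack([prior, ind, agg, ind * agg])

# Ordinary least squares estimate of the behavioral weights.
weights, *_ = np.linalg.lstsq(X, updated, rcond=None)
w_prior, w_ind, w_agg, w_interaction = weights

print("sum of prior and signal weights:", w_prior + w_ind + w_agg)  # rational benchmark: exactly 1
print("interaction term:", w_interaction)                           # negative => signals dilute each other
```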
Key Findings About AI Economic Behavior
Four Discoveries That Change the Picture
First: The Sum of Weights Simply Does Not Equal One
In the ideal world of a rational agent, the weight of the past belief plus the weight of the new signal should sum to one. This means all information is accounted for correctly, without skewed perception. But language models systematically deviated from this rule. Households overreacted: the sum of weights was greater than one, indicating overconfidence or excessive sensitivity to news. CEOs were closer to the norm but still not perfect, sometimes underreacting and sometimes overreacting.
This is not just a technical detail. It means that language models, like people, do not know how to correctly weigh information. They either sway too heavily in the wind of news or hold too stubbornly to old beliefs.
Second: Personal Experience Beats Statistics
The most striking pattern: both households and CEOs attached significantly more weight to individual signals than to aggregated ones. Sometimes two to three times more. If the model in the household role saw milk prices go up by ten percent, it trusted that figure more than the official report of three percent inflation. If the model in the CEO role received internal data about a drop in sales, it reacted to that more strongly than to macroeconomic growth forecasts.
Why is this important? Because it explains why central banks so often fail to manage expectations. They publish beautiful charts and tables, hold press conferences, and make statements. But the average person believes their wallet more than the words of the central bank chief. And the language model, trained on texts written by humans, has absorbed this very same pattern.
Third: When There Are Many Signals, Each Loses Power
Researchers discovered a negative interaction between simultaneous signals. This means that when the model received both an individual and an aggregated signal at the same time, each received less weight than if it were presented in isolation. The presence of multiple information sources did not reinforce the overall effect – it diluted it.
This resembles cognitive overload: when too much disparate information falls upon you, you cannot process it all effectively. You start to ignore part of the data or reduce trust in each individual source. Language models demonstrate the exact same weakness. This is “attention dilution” – an effect that has direct consequences for communication policy. If you bombard an audience with many different messages simultaneously, you may inadvertently reduce the impact of each one.
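A tiny numerical illustration of this dilution, using assumed weights rather than the paper's estimates: treat the presence of the second signal as a simple on/off indicator and watch the effective weight of the individual signal shrink when both signals arrive together.

```python
# Assumed illustrative weights; the study's actual estimates are not reproduced here.
w_ind = 0.6           # weight of the individual signal when it arrives alone
w_interaction = -0.2  # negative interaction: the other signal "crowds out" attention

# Effective weight of the individual signal, treating the aggregate signal
# simply as present (1) or absent (0).
for aggregate_present in (0, 1):
    effective = w_ind + w_interaction * aggregate_present
    print(f"aggregate signal present={aggregate_present}: effective weight = {effective:.2f}")
# prints 0.60 when the individual signal arrives alone, 0.40 when both signals arrive together
```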
Fourth: Households and CEOs Think Differently
Although the general patterns were similar, substantial differences emerged between households and CEOs. Households were more emotional, more sensitive to personal experience, and more prone to overreaction. CEOs behaved with more restraint, in a more measured way, closer to the rational norm. The interaction effect between signals was also weaker for them.
This makes sense too: the role of a company head requires a more analytical approach, the ability to account for both internal and external factors. A household lives in a world of immediate experience – prices for milk, gasoline, rent. It has neither the time nor the resources to dive into macroeconomic reports.
Can Artificial Intelligence Be Cured of Irrationality?
The researchers asked an obvious question: if language models demonstrate behavioral biases, can they be “fixed”? They took the GPT-4 model and applied a technique called LoRA – Low-Rank Adaptation, which allows a model to be fine-tuned on a new dataset without retraining all of its weights. The model was trained on examples of more rational expectation updates, where the weights were closer to optimal values.
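The article does not spell out the fine-tuning setup, so the snippet below is only a sketch of the general idea behind LoRA, with made-up dimensions: the pretrained weight matrix stays frozen, and a small low-rank correction is the only thing that gets trained.

```python
import numpy as np

d_in, d_out, r = 1024, 1024, 8        # r is the (small) adaptation rank; all values are illustrative

W = np.random.randn(d_out, d_in)      # pretrained weights: frozen during fine-tuning
B = np.zeros((d_out, r))              # trainable factor, initialized to zero...
A = np.random.randn(r, d_in) * 0.01   # ...so the adapted model starts out identical to the original

def adapted_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass through the frozen weights plus the trainable low-rank correction B @ A."""
    return (W + B @ A) @ x

print(adapted_forward(np.random.randn(d_in)).shape)          # (1024,)
print("trainable parameters:", B.size + A.size, "out of", W.size)
```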
The results were encouraging but not perfect. The fine-tuning indeed smoothed out some distortions: the sum of weights approached one, and the preference for individual signals decreased. But the deviations could not be eliminated entirely. The negative interaction effect remained significant, although it weakened. This suggests that some patterns are deeply rooted in the model's architecture or implicitly in the structure of the training data.
In other words, you can make artificial intelligence slightly more rational, but you cannot completely purge it of the very same weaknesses inherent in the people who wrote the texts on which it was trained.
What AI Biases Reveal About Human Economic Behavior
A Mirror That Reflects Our Fears
What does all this mean for us – for people living in a world where artificial intelligence is becoming an increasingly important player in the economy? First, it means that language models are not perfectly rational agents. They will not replace human biases with purer logic. They reproduce our own distortions because they are trained on our culture, our history, and our texts.
Second, this sheds light on why economic policy so often fails to work as intended. If even artificial intelligence, devoid of emotions and biological limitations, still prefers personal experience to official statistics, what can be said about real people? Central banks can publish any number of reports about low inflation, but if people have less money in their wallets, they won't believe these reports. And no amount of press conferences will change that.
Third, the information dilution effect reminds us of the danger of overload. We live in an era of information abundance, where thousands of signals fall upon us every day – news, forecasts, expert opinions, commentary on social media. This study shows that more information is not always better. Sometimes an excess of data simply lowers our trust in each individual source and makes it harder to reach well-considered decisions.
Why Personal Experience Trumps Statistics in Economic Decisions
History Repeats Itself, Only the Platforms Change
I have been studying money and trust for several decades, and I have always been struck by how persistent certain human patterns are. In the seventeenth century, the Dutch believed tulip bulbs would rise in price forever because they saw their neighbors getting rich from it. In the twentieth century, Americans believed real estate prices would never fall because they saw the value of their own homes rising. In the twenty-first century, people believe in Bitcoin because they see success stories on the internet.
All of these are variations of the same phenomenon: trust in personal experience and stories close to us is stronger than trust in abstract statistics. Language models have absorbed this pattern from billions of texts written by humans over centuries. They didn't invent it – they simply reproduce it with mathematical precision.
The study of the behavioral Kalman filter for language models is not just a technical exercise. It is a mirror that shows us how we actually think about economics, money, and the future. And this mirror is not particularly flattering. We are irrational, egocentric, and subject to cognitive limitations. But we are also adaptive, capable of learning and changing – albeit not completely.
Future Research on AI Economic Expectations
What Next?
Of course, this research has limitations. Language models are not people. Their “brains” work completely differently, and their reactions may reflect training artifacts rather than true cognitive processes. The behavioral Kalman filter is a simplification that does not capture the full complexity of human thinking.
But this is a start. Future studies might compare the behavior of language models with real expectations survey data to check how accurately they mimic human biases. One could study how different phrasings and contexts influence the formation of expectations. One could apply these methods to other areas of economic decisions – pricing, investments, career choice.
The most interesting part is that language models allow for experiments that are impossible with real people. One can create counterfactual scenarios, test thousands of information policy variants, and observe long-term effects in compressed timeframes. This opens a new era in experimental economics – an era where the lab subjects are not students working for a small fee, but algorithms capable of mimicking millions of different agents.
Money has always been a collective hallucination. Now we have artificial hallucinators that can help us understand why we believe what we believe, and why we are so often mistaken in our expectations regarding the future. It won't make us completely rational – but perhaps it will make us slightly more self-aware.
And that is already quite something.