Imagine a creature that never sleeps. It doesn't eat in the usual sense, drink water, or breathe air, but it consumes electricity with an appetite that makes entire cities pale in comparison. This isn't a character from science fiction. This is modern artificial intelligence. And it is hungry.
When we talk about AI, we most often discuss what it can do: generate text, recognize faces, make diagnoses, compose music. But we almost never discuss what it costs. Not in money. In kilowatt-hours.
A Champion's Breakfast: Numbers That Are a Bit Unsettling
Training a single large language model consumes as much electricity as several hundred American households use in a year. According to various estimates, training GPT-4 required tens of millions of dollars in computing resources alone – an indirect way of saying that we're talking about colossal amounts of energy. Companies don't disclose the exact figures, but independent researchers paint a picture that gives you pause.
Every request to a large language model – say, asking “how to cook laksa” or “explain quantum entanglement to me” – uses, by commonly cited estimates, about ten times as much energy as a regular Google search. Multiply that by billions of queries a day. Now multiply that by the number of models running in parallel worldwide. The result is a number best expressed through a metaphor: it's as if a huge bonfire were constantly burning somewhere in the clouds, and we just keep feeding it wood because we want more light.
The data centers that support AI infrastructure already consume 1 to 2 percent of the world's electricity. It might sound modest, but that's more than the aviation industry's ground infrastructure. And this is before the real boom in agent systems and multimodal models has even begun.
Moore's Law Is Dead. But the Appetite Remains
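For a long time, the tech industry lived by an unspoken agreement with physics: roughly every two years, the number of transistors on a chip would double, and processors would become about twice as powerful for the same energy consumption. The first half of that bargain is known as Moore's Law; the second, strictly speaking, is its lesser-known companion, Dennard scaling. Together they were a kind of promise – that progress would get cheaper and more efficient with each generation.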
But somewhere around the 2010s, this agreement began to come apart at the seams. Transistors became so small that quantum effects – the kind usually confined to physics textbooks – started to interfere with their operation. Shrinking them further became physically challenging. Moore's Law didn't die overnight; rather, it quietly slipped out of the room while no one was looking.
And here's the paradox: just as hardware stopped getting cheaper per unit of computation, AI models began to grow exponentially. GPT-2, with its 1.5 billion parameters in 2019, seemed enormous. GPT-3 in 2020 had 175 billion. Subsequent models have surpassed the trillion-parameter mark. Each parameter is a tiny numerical weight that must be stored, updated, and used in every calculation. It's like deciding to memorize not a friend's phone number, but the entire Singapore phone directory – and not just one copy, but thousands at once.
Physics can't keep up with ambition. And it's starting to show.
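To make the weight of those parameter counts concrete, here is a minimal back-of-the-envelope sketch. It uses the two parameter counts quoted above and assumes each parameter is stored at a standard numeric precision (4, 2, or 1 bytes); the function name and the table layout are illustrative choices, and the result covers only the stored weights, not the additional memory needed during training.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# Parameter counts quoted in the text.
MODELS = {"GPT-2": 1.5e9, "GPT-3": 175e9}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, num_params in MODELS.items():
    for dtype in BYTES_PER_PARAM:
        gb = weight_memory_gb(num_params, dtype)
        print(f"{name:6s} {dtype:8s} {gb:8.1f} GB")
```

At two bytes per parameter, 175 billion parameters come to about 350 GB for the weights alone, far more than a single consumer device can hold.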
The Water That No One Counts
Energy consumption is only half the story. There's also water.
Data centers get hot. Very hot. To keep from overheating, they use cooling systems – and a significant number of them are water-based. By some estimates, training one large model can require hundreds of thousands of liters of water. To put that in perspective, at about two liters a day a person drinks roughly 60,000 liters over an eighty-year life, so this is several lifetimes' worth of drinking water.
In Singapore, where water is a matter of strategic planning and engineering pride, this rings particularly true. We've learned to desalinate seawater, collect rain, and recycle wastewater into drinking water – NEWater has become a symbol of our resilience. And yet, the thought that invisible computational processes somewhere in the world are consuming water with the same intensity as a small residential block is unsettling.
AI doesn't just eat electricity. It drinks water. And it seems to know no bounds.
Scaling as a Religion
For a long time, an almost religious belief prevailed among AI researchers and engineers: bigger means better. More data, more parameters, more computing power – and the model would become smarter, more accurate, more useful. This idea found scientific backing in papers on so-called “scaling laws,” which showed that a model's performance predictably improves as its size increases.
It was beautiful. It was inspiring. It was like learning as a child that pi has an infinite number of digits and suddenly finding the world bigger and more mysterious than you had thought.
But then the first signs began to appear that the growth curve was starting to flatten. Not everywhere and not all at once, but researchers started to notice: adding another order of magnitude in computation yielded diminishing returns in capabilities. A model doesn't get twice as smart when you double its size. It gets a little smarter – and a lot hungrier.
It's like learning to play a musical instrument. In the first few years, progress is noticeable almost weekly. Then, it slows. Eventually, you practice for hours, and the difference between yesterday and today is almost imperceptible. This doesn't mean progress has stopped. But its cost has risen sharply.
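The flattening curve can be sketched with a toy power law, the general shape that scaling-law papers describe. Everything numeric here is an assumption for illustration: the constants `A` and `ALPHA` are made up, not measured values for any real model, and "loss" stands in loosely for how far a model is from perfect.

```python
# Hypothetical power-law curve: loss = A * compute ** (-ALPHA).
A = 10.0      # assumed loss at compute = 1 (arbitrary units)
ALPHA = 0.05  # assumed exponent; real values depend on the model family


def loss(compute: float) -> float:
    """Model error at a given compute budget (lower is better)."""
    return A * compute ** (-ALPHA)


def gain_from_10x(compute: float) -> float:
    """Absolute drop in loss when the compute budget is multiplied by ten."""
    return loss(compute) - loss(10 * compute)


for exponent in range(5):
    c = 10.0 ** exponent
    print(
        f"compute 10^{exponent}: loss {loss(c):.3f}, "
        f"gain from the next 10x {gain_from_10x(c):.3f}"
    )
```

Under these assumed constants, every tenfold increase in compute trims the loss by about 11 percent of its current value, so each step costs ten times more than the last while delivering a smaller absolute improvement.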
When the Data Runs Out
The scaling crisis has another dimension, one that is discussed less often but is perhaps the most important of all. Data.
Large language models are trained on vast amounts of text – essentially, a significant portion of everything humanity has written in recent decades. Books, articles, forums, comments, scientific papers, recipes, diaries. The internet, in its textual dimension, has been for AI what a lifetime of reading and listening is for us.
But the internet isn't infinite. More precisely, high-quality, well-curated, human-generated text is finite. Researchers are already talking about approaching a so-called “data peak” – the moment when there's simply no new quality text left for training. Some of it has already been used multiple times. Some of it was created by the language models themselves – and now this text is fed back into the training sets, creating a kind of information echo.
Imagine an artist who spent their whole life learning by looking at great paintings. Then the paintings disappeared, and all that remained was what the artist had painted. They start learning from their own work. The circle has closed. There are no new impressions. The style begins to degrade, becoming more formulaic, more predictable.
This is what worries researchers when they talk about “model collapse” from training on synthetic data. It's not an overnight catastrophe – it's a slow narrowing of the horizon.
Small but Smart: A Different Path
But it would be unfair to end on that note. In parallel with the race for size, something interesting is happening – an almost quiet revolution in efficiency.
So-called Small Language Models (SLMs) are emerging. They are trained on narrower but meticulously selected data. They run on devices that fit in your pocket. They can consume hundreds of times less energy – yet on specific tasks they perform as well as, and sometimes better than, their giant counterparts.
It's like comparing a two-hundred-volume encyclopedia to a well-written guidebook for a single city. The encyclopedia knows more. But if you need to find the best hawker centre in the Tanjong Pagar area, the guidebook will be far more useful.
New architectures, known as mixture-of-experts models, are appearing that don't use all their neurons at once, but only those needed for a specific task – like the brain, which doesn't activate its full potential just to decide which coffee to order. Distillation methods allow large models to be “compressed” into smaller ones, retaining most of their abilities. Quantization – storing each weight at lower numerical precision – reduces memory and computational requirements.
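Quantization is the easiest of these ideas to show in a few lines. The sketch below is one simple variant (symmetric 8-bit quantization with a single shared scale), chosen for illustration rather than as the method any particular model uses; the example weights are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 0.0, -0.07]  # made-up example weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("integers stored:", quantized)
print("largest rounding error:", max_error)
```

Each weight now fits in one byte instead of the four used by a 32-bit float, at the price of a rounding error no larger than half of one scale step.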
The industry, it seems, is starting to grow up. The race for size is giving way to the search for elegance.
Green AI: A Dream or a Necessity?
Major tech companies are not silent on this issue. They publish sustainability reports, promise carbon neutrality, and invest in renewable energy. Microsoft, Google, Amazon – all are building or expanding data centers near sources of “green” energy: hydroelectric plants, wind farms, solar fields.
But there's a subtlety here that's easy to miss. When a tech company says, “we run on 100% renewable energy,” it often means it's buying certificates for an equivalent amount of green energy produced somewhere in the world. This isn't the same as being physically powered only by sun and wind. The electrical grid is a shared pool, and as long as coal or gas is flowing into it, there's no such thing as perfectly clean consumption.
This doesn't mean the efforts are meaningless. It means the picture is more complex than corporate presentations suggest.
Furthermore, there is the so-called “rebound effect.” When a technology becomes more efficient and cheaper, people start using it more. Efficiency lowers the cost of a single query – which means the number of queries increases. Total consumption grows, even if each individual operation has become more economical. This happened with cars, planes, and consumer electronics. There's no reason to think AI will be any different.
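The arithmetic of the rebound effect fits in a few lines. The figures below are purely hypothetical, picked only to show how total consumption can rise while each query gets cheaper.

```python
def total_energy_wh(energy_per_query_wh: float, queries_per_day: int) -> float:
    """Daily energy use: cost of one query times the number of queries."""
    return energy_per_query_wh * queries_per_day


# Hypothetical "before": 1 Wh per query, one million queries a day.
before = total_energy_wh(1.0, 1_000_000)

# Hypothetical "after": each query is twice as efficient,
# but cheaper queries attract three times as much usage.
after = total_energy_wh(0.5, 3_000_000)

print(f"before: {before:,.0f} Wh/day")
print(f"after:  {after:,.0f} Wh/day")
print(f"change in total: {after / before:.2f}x")
```

In this made-up scenario the energy per query halves, yet the daily total rises by half, because usage grew faster than efficiency improved.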
What This Means for Us – the Everyday Users
We've gotten used to thinking of AI as something ethereal. The cloud. A neural network. An algorithm. Words that have no physical weight – unlike, say, a car or a factory.
But behind every answer from a language model, there is hardware. Wires. Cooling systems. Turbines somewhere on the other side of the continent. Every time we ask an AI to write an email for us, come up with a recipe, or explain a complex concept, we make a small, almost imperceptible demand on the physical world. And the physical world responds with consumption.
This is not a call to abandon technology. It's an invitation to consider that invisible doesn't mean weightless.
In Singapore, we learned long ago to think about infrastructure differently from most. Every liter of water here is the result of an engineering solution. Every kilowatt-hour is someone's responsibility. A small island can't afford the luxury of not counting its resources. And perhaps it is this perspective – the perspective of an island in a vast ocean – that the tech industry needs today.
A Limit Not Crossed Twice
The most interesting thing about the scaling limit is that it doesn't look like a wall. It looks like a gentle slope that keeps getting steeper. You keep walking, expending more and more energy – but your altitude barely changes.
Some researchers believe the solution lies in fundamentally new architectures. Not the transformers that underpin most current language models, but something else. Perhaps neuromorphic chips, which mimic the workings of the biological brain and consume orders of magnitude less power. Perhaps hybrid systems, where a neural network handles only part of the task, with more traditional algorithms doing the rest.
Perhaps – and this is the boldest idea – we need to stop striving for universality. The human brain is not a universal computer. It is specialized, selective, and lazy in the best sense of the word – it doesn't waste resources on things that aren't immediately necessary. And yet, it masters tasks that stump the most powerful models.
Maybe the future of AI isn't one enormous mind that knows everything. Maybe it's an ecosystem of small, specialized intelligences, each doing its own job well and not wasting energy on the rest.
Like living creatures, in fact. Which also, at some point, learned not to chase size, but to chase precision.
AI is hungry. But perhaps it's finally learning to be picky about what it puts on its plate.