Imagine you are a meteorologist in a medieval town. Every morning, the townspeople ask: «Will it rain?» You can say «yes» or «no» – and sometimes be wrong. But there is another way: say «there is a 70% probability it won’t rain, but I am 90% sure that if it does, between 2 and 15 millimeters will fall». This isn’t just a forecast anymore – it’s an uncertainty interval. And these are exactly the kinds of intervals built by conformal prediction – an elegant mathematical technique that doesn’t try to guess the exact number, but honestly says: «The true value lies somewhere in here, and I guarantee this with the level of confidence you need.»
Sounds wonderful, right? But the devil, as always, is in the details 😈
When the World Changes Faster Than Your Formulas
Classic conformal prediction works brilliantly if the data behaves itself – arriving in random order and obeying the exact same probability distribution. In reality, however, the world prefers chaos. A stock price suddenly collapses due to a billionaire’s tweet. An epidemic shifts electricity consumption patterns. The climate becomes increasingly unpredictable.
In such conditions, traditional methods start to panic. They see that the data has «shifted» and react in the only way they know how – they widen the intervals. Drastically. So much so that your forecast turns from «the temperature will be between 18 and 22 degrees» into «somewhere between minus infinity and plus infinity». Technically correct, but absolutely useless.
Why does this happen? Because existing online conformal prediction methods are tuned for an adversarial scenario – they assume the worst. They think: «What if tomorrow the data stops obeying any laws whatsoever? Better play it safe and make the interval huge.» It’s as if you always carried an umbrella, a raincoat, sunglasses, and skis simultaneously – just in case the weather decides to show you everything at once.
Optimism as a Mathematical Strategy
Here enters a new approach – Conformal Optimistic Prediction, or COP. Its philosophy is simple yet paradoxical: let’s be optimistic, but not reckless.
COP says: «Look, I see patterns in the data. Yes, they aren’t perfect, yes, they might break at any moment – but while they are working, why build intervals the size of a football field?» And it adds an insurance policy: «But if the patterns turn out to be an illusion, I still guarantee correct coverage.»
Sounds like an ad for a financial product that’s too good to be true? Let’s break down how it actually works.
Anatomy of a Smart Interval
To understand COP, we first need to understand how prediction intervals work in online mode generally.
Imagine a stock trader who adjusts their strategy every second. A new stock price comes in – they check: did it fall within their predicted interval? If not – the interval was too narrow, time to widen it. If yes – maybe we can narrow it a bit to make the forecast sharper.
Mathematically, this looks like updating a threshold according to the rule:
new_threshold = old_threshold + step × (error − target_level)
Here, «error» is either 0 (if we hit the interval) or 1 (if we missed), and «target level» is how many misses we are willing to tolerate. If we need 90% coverage, our target error level is 10%.
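The update rule above can be sketched in a few lines. This is a minimal, illustrative version (the names `alpha`, `eta`, and `threshold` are mine, not the paper’s notation):

```python
# A minimal sketch of the reactive online threshold update.
# alpha: target miss rate (0.1 for 90% coverage); eta: step size.

def update_threshold(threshold, covered, alpha=0.1, eta=0.05):
    """One step of: new_threshold = old_threshold + step * (error - target_level)."""
    err = 0.0 if covered else 1.0  # 0 if the point fell inside the interval
    return threshold + eta * (err - alpha)

# A miss (err = 1) pushes the threshold up, widening future intervals;
# a hit (err = 0) nudges it down by eta * alpha, tightening them.
t = 1.0
t = update_threshold(t, covered=False)  # miss: 1.0 + 0.05 * (1 - 0.1) = 1.045
t = update_threshold(t, covered=True)   # hit: 1.045 + 0.05 * (0 - 0.1) = 1.040
```

Note the asymmetry: a miss moves the threshold nine times farther than a hit (at 90% coverage), which is exactly what keeps the long-run miss rate pinned near 10%.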
This scheme is genius in its simplicity. But it has a fundamental problem: it is reactive. It is always one step behind reality. Like a driver who only corrects the steering wheel after the car has already started drifting off the road.
Peeking into the Future – Not Cheating, But Mastery
COP proposes adding a proactive component. It’s as if our driver didn’t just react to the skid, but also looked at the curve of the road ahead, felt the force of the crosswind, and corrected the trajectory in advance.
How does the algorithm «see the future»? It’s not psychic – it simply uses the data structure. If the last 100 points show a certain distribution of model errors, there is a chance the next point will come from that same distribution. Not a guarantee – but a reasonable bet.
Formally, COP does the following:
- First, it updates the threshold the classic way – this is the base protection.
- Then, it estimates where this threshold stands relative to the expected distribution of errors.
- Finally, it adjusts it in the direction that seems more reasonable given the observed patterns.
The key moment: if the estimate of the distribution turns out to be dead wrong – step one still ensures correct coverage. But if the estimate is even partially correct – the intervals become significantly narrower.
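The three steps can be sketched roughly as follows. This is a plausible reading of the scheme, not the authors’ exact algorithm: the optimistic correction here simply pulls the threshold toward the empirical (1 − α) quantile of a sliding window of recent scores, and all names and step sizes are illustrative:

```python
import numpy as np

def cop_step(threshold, score, recent_scores, alpha=0.1, eta=0.05, eta_opt=0.05):
    """One hypothetical COP-style update: a reactive base step plus an
    optimistic nudge toward the estimated error quantile."""
    # Step 1: base reactive update -- this alone preserves coverage.
    err = 0.0 if score <= threshold else 1.0
    threshold = threshold + eta * (err - alpha)

    # Step 2: estimate where the threshold stands relative to recent errors.
    est_quantile = float(np.quantile(recent_scores, 1 - alpha))

    # Step 3: adjust optimistically toward that estimate.
    threshold = threshold + eta_opt * (est_quantile - threshold)
    return threshold

# Toy run: with i.i.d. standard normal scores, the threshold should
# settle near the 0.9 quantile of N(0, 1), roughly 1.28.
rng = np.random.default_rng(0)
threshold, window = 0.0, []
for s in rng.normal(size=500):
    window.append(s)
    if len(window) > 100:
        window.pop(0)
    threshold = cop_step(threshold, s, window)
```

If the window-based estimate is garbage, step 3 just adds noise while step 1 keeps coverage honest; if it is roughly right, step 3 lets the threshold track the true quantile much faster than the reactive rule alone.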
Connection to Optimistic Gradient Descent
Here, mathematics meets the philosophy of optimism. In machine learning, there is a family of methods called optimistic online gradient descent. Their idea: if a data sequence has a predictable structure, one can take bolder steps and converge faster.
COP is built in a similar way. The interval threshold update can be interpreted as gradient descent for a special loss function – the quantile loss. And the optimistic component acts as a hint regarding where to move next.
Imagine a person walking through fog on uneven terrain. The classic method: take a small step, feel the ground, take another step. The optimistic method: use your memory of what the ground was like a second ago, and step a bit more confidently in the expected direction – but maintain caution in case everything changes abruptly.
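The gradient-descent view is easy to verify by hand. The threshold update is exactly a subgradient step on the pinball (quantile) loss at level 1 − α; the sketch below checks that the two rules coincide (variable names are mine):

```python
def pinball_grad(q, s, alpha=0.1):
    """Subgradient of the quantile (pinball) loss at level 1 - alpha
    with respect to the threshold q, for an observed score s."""
    return -(1 - alpha) if s > q else alpha

# Gradient descent q <- q - eta * grad reproduces the reactive rule
#   q <- q + eta * (error - alpha):
#   miss (s > q):  q + eta * (1 - alpha)
#   hit  (s <= q): q - eta * alpha
eta, alpha, q = 0.05, 0.1, 1.0
q_miss = q - eta * pinball_grad(q, s=2.0, alpha=alpha)  # 1.0 + 0.05 * 0.9 = 1.045
q_hit  = q - eta * pinball_grad(q, s=0.5, alpha=alpha)  # 1.0 - 0.05 * 0.1 = 0.995
```

Since the update is gradient descent on a quantile loss, the optimistic machinery from online learning plugs in directly: the distribution estimate plays the role of the «hint» gradient.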
The Three Pillars of Theoretical Guarantees
But beautiful analogies are one thing, and mathematical proofs are quite another. COP rests on three strict theorems:
Guarantee One: Finite sample, any distribution
Even if you only have 100 data points, even if the data comes from the weirdest distribution imaginable – COP guarantees that the empirical coverage deviates from the target level by at most on the order of 1/T, where T is the number of observations.
It’s like a manufacturer's warranty: «Our fridge might not be perfect, but the temperature deviation won't exceed one degree – tested in every kitchen in the world».
Guarantee Two: Arbitrary learning rates
You can change the update step dynamically – increase it when things are calm, decrease it during turbulence. COP remains correct. The main thing is not to do it too often or too abruptly.
Guarantee Three: Convergence with independent data
If data really does come from a single distribution independently (the classic i.i.d. case), then with the right choice of learning rate, the COP threshold converges to the true quantile of that distribution. That is, the method isn’t just correct – it is also optimal in the long run.
Trial by Reality
Theory is theory, but does it work in practice? The authors tested COP on nine different scenarios – from artificially created tricky sequences to real data on Amazon and Google stocks, electricity consumption, and climate observations.
Scenario One: Sharp Changes
Imagine a time series that suddenly jumps from one level to another – as if the price of Bitcoin decided to double overnight. Classic methods in such situations either give huge intervals just in case, or violate coverage guarantees.
COP handled it: coverage held at the 90% level, as required, while the average width of the intervals turned out to be 15–30% smaller than that of competitors.
Scenario Two: Distribution Drift
A more insidious situation is when data doesn’t jump sharply but shifts slowly. Like global warming: every year the temperature is slightly higher, but not enough to be immediately obvious.
Here, COP’s advantage became even more noticeable. The method knows how to «feel» smooth changes in data structure and adapt intervals accordingly without falling into a panic.
Scenario Three: Volatility Change
An especially interesting case is when the mean value remains stable, but the scatter of data increases or decreases. Like stocks during periods of calm versus crisis.
COP performed exceptionally well right here. While competitors churned out intervals that were either always too wide or violated coverage during high volatility, COP adjusted dynamically: wide intervals during rough times, narrow ones when the market slept.
Real Data: When Theory Meets Wall Street
On Amazon and Google stocks, COP demonstrated stable coverage around 89.5–90.5% (reminder: the goal was 90%) with intervals that were 20–40% narrower than most competitors.
Even more impressive results came from electricity consumption data. Here, there is a strong seasonal component (consumption is higher by day, lower by night, winter differs from summer), and COP learned to use this predictability. The intervals ended up being so precise they could be used for real load planning in power grids.
Resilience to Errors – The Main Feature
The most important experiment is the «fool test». What if we give COP a completely random estimate of the error distribution? For example, instead of analyzing real data, we just flip a coin?
The result is surprising: coverage remained correct! 90%, just as required. The intervals, of course, became wider – no magic here: if the information is useless, optimism won’t help. But the main point is that the method didn’t break; the guarantees survived.
This is a fundamental difference from many «smart» methods that promise miracles under the right assumptions but turn into a pumpkin if those assumptions are violated. COP is built differently: optimism gives a bonus when it is justified, but doesn’t create a catastrophe when it is mistaken.
Computational Efficiency: Science Must Be Fast
There is one more practical question: isn’t all this too computationally expensive?
No. One iteration of COP takes about 0.01 milliseconds on a standard processor. This means the method easily works in real-time even for high-frequency data – for example, tick streams from an exchange where updates arrive hundreds of times per second.
The main computational task is estimating the error distribution function. But even here there are simple and fast solutions: one can use a sliding window of recent observations, an empirical CDF, or even a rough parametric approximation (like Gaussian). Experiments showed that even a very rough estimate yields improvement compared to a fully conservative approach.
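A sliding-window empirical CDF is about as simple as such an estimator gets. Here is one minimal way to implement it (illustrative, not the authors’ code; the class name and window size are my choices):

```python
from collections import deque

class SlidingCDF:
    """Empirical CDF over a sliding window of recent scores --
    one cheap way to get the distribution estimate COP needs."""

    def __init__(self, maxlen=100):
        # deque with maxlen automatically drops the oldest score.
        self.window = deque(maxlen=maxlen)

    def update(self, score):
        self.window.append(score)

    def cdf(self, x):
        """Fraction of recent scores not exceeding x."""
        if not self.window:
            return 0.5  # no information yet: stay neutral
        return sum(s <= x for s in self.window) / len(self.window)

    def quantile(self, level):
        """Empirical quantile of the window at the given level in [0, 1]."""
        ordered = sorted(self.window)
        return ordered[int(level * (len(ordered) - 1))]
```

Each update is O(1) and each quantile query is O(n log n) in the window size, which for a window of 100 points is negligible next to the model’s own inference cost.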
Where Can This Be Applied Right Now
Finance and Trading
Building confidence intervals for asset prices considering changing volatility. Especially valuable for algorithmic trading and risk management. Instead of hoarding huge reserves for unforeseen spikes, one can act more precisely – and earn more.
Energy
Forecasting load on power grids with guaranteed uncertainty intervals. This is critical for balancing production and consumption, especially with the growing share of renewable energy, which is unpredictable in itself (the wind doesn’t blow on schedule).
Medicine and Epidemiology
Forecasting hospitalizations, healthcare resource needs, disease spread. Here, uncertainty can cost lives, and it is important not just to make a forecast, but to honestly say how reliable it is.
Climate Research
Estimating intervals for temperatures, precipitation, sea levels. Climate models always contain uncertainty, and COP helps quantify it correctly, taking long-term trends into account.
Industry and IoT
Predictive maintenance of equipment: when will the machine break, how much running time is left before critical wear? Narrow intervals mean fewer false alarms and more efficient planning.
The Philosophy of Optimism in Data
COP is more than just another algorithm. It is a different philosophy of working with uncertainty.
The traditional approach says: «The world is unpredictable, so let’s always prepare for the worst». COP answers: «Yes, the world is complex, but it has structure. Let’s use it when we can, but not rely on it blindly.»
It’s like the difference between a paranoid person and a reasonably prudent one. The paranoid person always wears a bulletproof vest – even at home, even at night. The prudent person assesses the situation and takes adequate measures: a vest in a dangerous neighborhood, but not on the couch in front of the TV.
In a sense, COP is the mathematical formalization of common sense. Use knowledge when you have it. But insure yourself when you don’t. And most importantly – never confuse the two.
What Next?
The COP method opens up several interesting directions for future research.
First, can we estimate the error distribution even better? Perhaps use neural networks or other advanced machine learning methods for this? Experiments showed that even simple estimates work well, but the ceiling of possibilities is still far off.
Second, how do we adapt COP for multidimensional predictions? When we need to build not a 1D interval, but a multidimensional region – for example, jointly forecasting temperature, humidity, and pressure.
Third, can COP be combined with other approaches to processing non-stationary data – wavelets, adaptive filtering, change detection methods?
And finally, how do we scale the method to massive data volumes – millions and billions of points? COP’s computational efficiency is encouraging, but requires further study.
Final Thoughts
Data doesn’t lie. But it knows how to whisper in a language that one must learn to hear. And sometimes this language contains hints about the future – not magic predictions, but statistical patterns.
COP has learned to use these whispers without turning them into immutable truths. It is optimistic, but not naive. Confident, but not arrogant. And that is exactly why it works where other methods either panic or become too careless.
In a world where uncertainty is not the exception but the norm, the ability to assess it correctly becomes a critically important skill. Not eliminating uncertainty – that’s impossible. Not hiding from it behind infinitely wide intervals – that’s useless. But learning to work with it, extracting maximum information while remaining honest with oneself and others – that is the real goal.
COP shows that mathematics can be not only rigorous but also wise. And that sometimes, optimism backed by the right guarantees is not a weakness, but a strength.
See you at the crossroads of data and uncertainty.