Imagine you are a meteorologist in a medieval town. Every morning, the townspeople ask: «Will it rain?» You can say «yes» or «no» – and sometimes be wrong. But there is another way: say «there is a 70% probability it won’t rain, but I am 90% sure that if it does, between 2 and 15 millimeters will fall». This isn’t just a forecast anymore – it’s an uncertainty interval. And these are exactly the kinds of intervals built by conformal prediction – an elegant mathematical technique that doesn’t try to guess the exact number, but honestly says: «The true value lies somewhere in here, and I guarantee this with the level of confidence you need.»
Sounds wonderful, right? But the devil, as always, is in the details 😈
When the World Changes Faster Than Your Formulas
Classic conformal prediction works brilliantly if the data behaves itself – arriving in random order and obeying the exact same probability distribution. In reality, however, the world prefers chaos. A stock price suddenly collapses due to a billionaire’s tweet. An epidemic shifts electricity consumption patterns. The climate becomes increasingly unpredictable.
In such conditions, traditional methods start to panic. They see that the data has «shifted» and react in the only way they know how – they widen the intervals. Drastically. So much so that your forecast turns from «the temperature will be between 18 and 22 degrees» into «somewhere between minus infinity and plus infinity». Technically correct, but absolutely useless.
Why does this happen? Because existing online conformal prediction methods are tuned for an adversarial scenario – they assume the worst. They think: «What if tomorrow the data stops obeying any laws whatsoever? Better play it safe and make the interval huge.» It’s as if you always carried an umbrella, a raincoat, sunglasses, and skis simultaneously – just in case the weather decides to show you everything at once.
Optimism as a Mathematical Strategy
Here enters a new approach – Conformal Optimistic Prediction, or COP. Its philosophy is simple yet paradoxical: let’s be optimistic, but not reckless.
COP says: «Look, I see patterns in the data. Yes, they aren’t perfect, yes, they might break at any moment – but while they are working, why build intervals the size of a football field?» And it adds an insurance policy: «But if the patterns turn out to be an illusion, I still guarantee correct coverage.»
Sounds like an ad for a financial product that’s too good to be true? Let’s break down how it actually works.
Anatomy of a Smart Interval
To understand COP, we first need to understand how prediction intervals work in online mode generally.
Imagine a stock trader who adjusts their strategy every second. A new stock price comes in – they check: did it fall within their predicted interval? If not – the interval was too narrow, time to widen it. If yes – maybe we can narrow it a bit to make the forecast sharper.
Mathematically, this looks like updating a threshold according to the rule:
new_threshold = old_threshold + step × (error − target_level)
Here, «error» is either 0 (if we hit the interval) or 1 (if we missed), and «target level» is how many misses we are willing to tolerate. If we need 90% coverage, our target error level is 10%.
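The update rule above can be sketched in a few lines. This is a minimal, illustrative version (the names `alpha`, `eta`, and `threshold` are mine, not the paper’s notation):

```python
# A minimal sketch of the reactive online threshold update.
# alpha: target miss rate (0.1 for 90% coverage); eta: step size.

def update_threshold(threshold, covered, alpha=0.1, eta=0.05):
    """One step of: new_threshold = old_threshold + step * (error - target_level)."""
    err = 0.0 if covered else 1.0  # 0 if the point fell inside the interval
    return threshold + eta * (err - alpha)

# A miss (err = 1) pushes the threshold up, widening future intervals;
# a hit (err = 0) nudges it down by eta * alpha, tightening them.
t = 1.0
t = update_threshold(t, covered=False)  # miss: 1.0 + 0.05 * (1 - 0.1) = 1.045
t = update_threshold(t, covered=True)   # hit: 1.045 + 0.05 * (0 - 0.1) = 1.040
```

Note the asymmetry: a miss moves the threshold nine times farther than a hit (at 90% coverage), which is exactly what keeps the long-run miss rate pinned near 10%.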
This scheme is genius in its simplicity. But it has a fundamental problem: it is reactive. It is always one step behind reality. Like a driver who only corrects the steering wheel after the car has already started drifting off the road.
Peeking into the Future – Not Cheating, But Mastery
COP proposes adding a proactive component. It’s as if our driver didn’t just react to the skid, but also looked at the curve of the road ahead, felt the force of the crosswind, and corrected the trajectory in advance.
How does the algorithm «see the future»? It’s not psychic – it simply uses the data structure. If the last 100 points show a certain distribution of model errors, there is a chance the next point will come from that same distribution. Not a guarantee – but a reasonable bet.
Formally, COP does the following:
- First, it updates the threshold the classic way – this is the base protection.
- Then, it estimates where this threshold stands relative to the expected distribution of errors.
- Finally, it adjusts it in the direction that seems more reasonable given the observed patterns.
The key moment: if the estimate of the distribution turns out to be dead wrong – step one still ensures correct coverage. But if the estimate is even partially correct – the intervals become significantly narrower.
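The three steps can be sketched roughly as follows. This is a plausible reading of the scheme, not the authors’ exact algorithm: the optimistic correction here simply pulls the threshold toward the empirical (1 − α) quantile of a sliding window of recent scores, and all names and step sizes are illustrative:

```python
import numpy as np

def cop_step(threshold, score, recent_scores, alpha=0.1, eta=0.05, eta_opt=0.05):
    """One hypothetical COP-style update: a reactive base step plus an
    optimistic nudge toward the estimated error quantile."""
    # Step 1: base reactive update -- this alone preserves coverage.
    err = 0.0 if score <= threshold else 1.0
    threshold = threshold + eta * (err - alpha)

    # Step 2: estimate where the threshold stands relative to recent errors.
    est_quantile = float(np.quantile(recent_scores, 1 - alpha))

    # Step 3: adjust optimistically toward that estimate.
    threshold = threshold + eta_opt * (est_quantile - threshold)
    return threshold

# Toy run: with i.i.d. standard normal scores, the threshold should
# settle near the 0.9 quantile of N(0, 1), roughly 1.28.
rng = np.random.default_rng(0)
threshold, window = 0.0, []
for s in rng.normal(size=500):
    window.append(s)
    if len(window) > 100:
        window.pop(0)
    threshold = cop_step(threshold, s, window)
```

If the window-based estimate is garbage, step 3 just adds noise while step 1 keeps coverage honest; if it is roughly right, step 3 lets the threshold track the true quantile much faster than the reactive rule alone.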
Connection to Optimistic Gradient Descent
Here, mathematics meets the philosophy of optimism. In machine learning, there is a family of methods called optimistic online gradient descent. Their idea: if a data sequence has a predictable structure, one can take bolder steps and converge faster.
COP is built in a similar way. The interval threshold update can be interpreted as gradient descent for a special loss function – the quantile loss. And the optimistic component acts as a hint regarding where to move next.
Imagine a person walking through fog on uneven terrain. The classic method: take a small step, feel the ground, take another step. The optimistic method: use your memory of what the ground was like a second ago, and step a bit more confidently in the expected direction – but maintain caution in case everything changes abruptly.
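The gradient-descent view is easy to verify by hand. The threshold update is exactly a subgradient step on the pinball (quantile) loss at level 1 − α; the sketch below checks that the two rules coincide (variable names are mine):

```python
def pinball_grad(q, s, alpha=0.1):
    """Subgradient of the quantile (pinball) loss at level 1 - alpha
    with respect to the threshold q, for an observed score s."""
    return -(1 - alpha) if s > q else alpha

# Gradient descent q <- q - eta * grad reproduces the reactive rule
#   q <- q + eta * (error - alpha):
#   miss (s > q):  q + eta * (1 - alpha)
#   hit  (s <= q): q - eta * alpha
eta, alpha, q = 0.05, 0.1, 1.0
q_miss = q - eta * pinball_grad(q, s=2.0, alpha=alpha)  # 1.0 + 0.05 * 0.9 = 1.045
q_hit  = q - eta * pinball_grad(q, s=0.5, alpha=alpha)  # 1.0 - 0.05 * 0.1 = 0.995
```

Since the update is gradient descent on a quantile loss, the optimistic machinery from online learning plugs in directly: the distribution estimate plays the role of the «hint» gradient.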
The Three Pillars of Theoretical Guarantees
But beautiful analogies are one thing, and mathematical proofs are quite another. COP rests on three strict theorems:
Guarantee One: Finite sample, any distribution
Even if you only have 100 data points, even if the data comes from the weirdest distribution imaginable – COP guarantees that the empirical coverage deviates from the target level by at most on the order of 1/T, where T is the number of observations.
It’s like a manufacturer's warranty: «Our fridge might not be perfect, but the temperature deviation won't exceed one degree – tested in every kitchen in the world».
Guarantee Two: Arbitrary learning rates
You can change the update step dynamically – increase it when things are calm, decrease it during turbulence. COP remains correct. The main thing is not to do it too often or too abruptly.
Guarantee Three: Convergence with independent data
If data really does come from a single distribution independently (the classic i.i.d. case), then with the right choice of learning rate, the COP threshold converges to the true quantile of that distribution. That is, the method isn’t just correct – it is also optimal in the long run.
Trial by Reality
Theory is theory, but does it work in practice? The authors tested COP on nine different scenarios – from artificially created tricky sequences to real data on Amazon and Google stocks, electricity consumption, and climate observations.
Scenario One: Sharp Changes
Imagine a time series that suddenly jumps from one level to another – as if the price of Bitcoin decided to double overnight. Classic methods in such situations either give huge intervals just in case, or violate coverage guarantees.
COP handled it: coverage held at the 90% level, as required, while the average width of the intervals turned out to be 15–30% smaller than that of competitors.
Scenario Two: Distribution Drift
A more insidious situation is when data doesn’t jump sharply but shifts slowly. Like global warming: every year the temperature is slightly higher, but not enough to be immediately obvious.
Here, COP’s advantage became even more noticeable. The method knows how to «feel» smooth changes in data structure and adapt intervals accordingly without falling into a panic.
Scenario Three: Volatility Change
An especially interesting case is when the mean value remains stable, but the scatter of data increases or decreases. Like stocks during periods of calm versus crisis.
COP performed exceptionally well right here. While competitors churned out intervals that were either always too wide or violated coverage during high volatility, COP adjusted dynamically: wide intervals during rough times, narrow ones when the market slept.
Real Data: When Theory Meets Wall Street
On Amazon and Google stocks, COP demonstrated stable coverage around 89.5–90.5% (reminder: the goal was 90%) with intervals that were 20–40% narrower than most competitors.
Even more impressive results came from electricity consumption data. Here, there is a strong seasonal component (consumption is higher by day, lower by night, winter differs from summer), and COP learned to use this predictability. The intervals ended up being so precise they could be used for real load planning in power grids.
Resilience to Errors – The Main Feature
The most important experiment is the «fool test». What if we give COP a completely random estimate of the error distribution? For example, instead of analyzing real data, we just flip a coin?
The result is surprising: coverage remained correct! 90%, just as required. The intervals, of course, became wider – no magic here: if the information is useless, optimism won’t help. But the main point is that the method didn’t break; the guarantees survived.
This is a fundamental difference from many «smart» methods that promise miracles under the right assumptions but turn into a pumpkin if those assumptions are violated. COP is built differently: optimism gives a bonus when it is justified, but doesn’t create a catastrophe when it is mistaken.
Computational Efficiency: Science Must Be Fast
There is one more practical question: isn’t all this too computationally expensive?
No. One iteration of COP takes about 0.01 milliseconds on a standard processor. This means the method easily works in real-time even for high-frequency data – for example, tick streams from an exchange where updates arrive hundreds of times per second.
The main computational task is estimating the error distribution function. But even here there are simple and fast solutions: one can use a sliding window of recent observations, an empirical CDF, or even a rough parametric approximation (like Gaussian). Experiments showed that even a very rough estimate yields improvement compared to a fully conservative approach.
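A sliding-window empirical CDF is about as simple as such an estimator gets. Here is one minimal way to implement it (illustrative, not the authors’ code; the class name and window size are my choices):

```python
from collections import deque

class SlidingCDF:
    """Empirical CDF over a sliding window of recent scores --
    one cheap way to get the distribution estimate COP needs."""

    def __init__(self, maxlen=100):
        # deque with maxlen automatically drops the oldest score.
        self.window = deque(maxlen=maxlen)

    def update(self, score):
        self.window.append(score)

    def cdf(self, x):
        """Fraction of recent scores not exceeding x."""
        if not self.window:
            return 0.5  # no information yet: stay neutral
        return sum(s <= x for s in self.window) / len(self.window)

    def quantile(self, level):
        """Empirical quantile of the window at the given level in [0, 1]."""
        ordered = sorted(self.window)
        return ordered[int(level * (len(ordered) - 1))]
```

Each update is O(1) and each quantile query is O(n log n) in the window size, which for a window of 100 points is negligible next to the model’s own inference cost.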
Where Can This Be Applied Right Now
Finance and Trading
Building confidence intervals for asset prices considering changing volatility. Especially valuable for algorithmic trading and risk management. Instead of hoarding huge reserves for unforeseen spikes, one can act more precisely – and earn more.
Energy
Forecasting load on power grids with guaranteed uncertainty intervals. This is critical for balancing production and consumption, especially with the growing share of renewable energy, which is unpredictable in itself (the wind doesn’t blow on schedule).
Medicine and Epidemiology
Forecasting hospitalizations, healthcare resource needs, disease spread. Here, uncertainty can cost lives, and it is important not just to make a forecast, but to honestly say how reliable it is.
Climate Research
Estimating intervals for temperatures, precipitation, sea levels. Climate models always contain uncertainty, and COP helps quantify it correctly, taking long-term trends into account.
Industry and IoT
Predictive maintenance of equipment: when will the machine break, how much running time is left before critical wear? Narrow intervals mean fewer false alarms and more efficient planning.
The Philosophy of Optimism in Data
COP is more than just another algorithm. It is a different philosophy of working with uncertainty.
The traditional approach says: «The world is unpredictable, so let’s always prepare for the worst». COP answers: «Yes, the world is complex, but it has structure. Let’s use it when we can, but not rely on it blindly.»
It’s like the difference between a paranoid person and a reasonably prudent one. The paranoid person always wears a bulletproof vest – even at home, even at night. The prudent person assesses the situation and takes adequate measures: a vest in a dangerous neighborhood, but not on the couch in front of the TV.
In a sense, COP is the mathematical formalization of common sense. Use knowledge when you have it. But insure yourself when you don’t. And most importantly – never confuse the two.
What Next?
The COP method opens up several interesting directions for future research.
First, can we estimate the error distribution even better? Perhaps use neural networks or other advanced machine learning methods for this? Experiments showed that even simple estimates work well, but the ceiling of possibilities is still far off.
Second, how do we adapt COP for multidimensional predictions? When we need to build not a 1D interval, but a multidimensional region – for example, jointly forecasting temperature, humidity, and pressure.
Third, can COP be combined with other approaches to processing non-stationary data – wavelets, adaptive filtering, change detection methods?
And finally, how do we scale the method to massive data volumes – millions and billions of points? COP’s computational efficiency is encouraging, but requires further study.
Final Thoughts
Data doesn’t lie. But it knows how to whisper in a language that one must learn to hear. And sometimes this language contains hints about the future – not magic predictions, but statistical patterns.
COP has learned to use these whispers without turning them into immutable truths. It is optimistic, but not naive. Confident, but not arrogant. And that is exactly why it works where other methods either panic or become too careless.
In a world where uncertainty is not the exception but the norm, the ability to assess it correctly becomes a critically important skill. Not eliminating uncertainty – that’s impossible. Not hiding from it behind infinitely wide intervals – that’s useless. But learning to work with it, extracting maximum information while remaining honest with oneself and others – that is the real goal.
COP shows that mathematics can be not only rigorous but also wise. And that sometimes, optimism backed by the right guarantees is not a weakness, but a strength.
See you at the crossroads of data and uncertainty.