3 Common Misconceptions about Prediction and Forecasting (and what we can do about them)

1. Prediction is about understanding the future

We humans have a lot of difficulty understanding the subtleties of time, and it is important to remember how little we intuitively understand about its nature when building or interpreting forecasts and predictive models. I have built models that, for example, predict a customer's propensity to churn reasonably well, yet weather forecasts for a given locality can at best see only a few days into the future, and that is the best we can do even with the most powerful predictive models ever built. The difference is that the former "predicts" human behaviour whereas the latter tries to peer into the future of a complex stochastic system. Predictive modelling works best when trying to predict human behaviour because it is a human invention bounded by human experience. Modelling does not predict a priori; I prefer to think of predictive modelling as projected behavioural modelling, because prediction sounds too good to be true.

Traditional forecasting tries to project past trends: it asks, if recent past conditions prevail, what will the future look like? This is a fundamental misunderstanding of the nature of time. We have seen this break down significantly in recent times with energy consumption forecasting. There has been a range of significant disrupters, such as the global financial crisis, new appliance technology, distributed generation and changing housing standards, to name a few. Some of these things have been foreseeable and others have not, but none of them appear in the past record of energy consumption, which is the prerequisite for a traditional forecast model.
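The limits of trend projection can be shown with a toy sketch (all numbers here are invented for illustration, not a real consumption series): a trend fitted to the past record has, by construction, no way of anticipating a disrupter that never appears in that record.

```python
# Toy illustration: a linear trend fitted to past data cannot anticipate
# a structural break (a "disrupter") absent from the record.
# All figures are hypothetical.

def fit_linear_trend(ys):
    """Ordinary least-squares fit of y = a + b*t for t = 0..n-1."""
    n = len(ys)
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    b = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
         / sum((t - t_mean) ** 2 for t in ts))
    a = y_mean - b * t_mean
    return a, b

# Ten years of steadily growing "energy consumption" (hypothetical units).
history = [100 + 5 * t for t in range(10)]
a, b = fit_linear_trend(history)

# The trend model projects continued growth for the next five years...
projection = [a + b * t for t in range(10, 15)]

# ...but a disrupter (say, widespread rooftop solar) flattens actual demand.
actual = [history[-1]] * 5

for t, (p, y) in enumerate(zip(projection, actual), start=10):
    print(f"year {t}: projected {p:.0f}, actual {y}")
```

The model is a perfect fit to its own history and still diverges from reality the moment conditions change, which is the point: the error lives in the assumptions, not the arithmetic.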

2. My forecast is correct therefore my assumptions are correct

Just because a given forecast comes to pass does not mean that the model is without flaw. I am reminded of both Donald Rumsfeld and Edward Lorenz in debunking this. Lorenz discovered patterns that are locally stable and may replicate themselves for a period of time, but are guaranteed not to do so indefinitely. This is at the heart of chaos theory, and every good modeller should understand it. The conditions that cause patterns to break down are sometimes what Rumsfeld referred to as unknown unknowns. There is not much we can do about those except try to imagine them, or else be agile enough to recognise them once they start to unfold. But there are also "known unknowns": those things which we know we don't know.

3. My forecast was correct given the data we had at the time

My golden rule is that all forecasts are wrong – they are just wrong in different ways. Sometimes the biggest problem is when a forecast or prediction comes to pass: if it does, it is not knowable to what extent the success was due to the efficacy of the model. I am reminded of Tarot readings. It is easiest to convince someone of a prediction when it confirms the observer's own bias, and none of us is without bias. And there is always a get-out clause if the model does not continue to predict well. This is relatively harmless if the prediction is about the likelihood of meeting a tall, handsome stranger, but more significant if it is a prediction about network energy consumption. In the latter case it's not good enough to say that was the best we could do at the time.

So what can we do about it?

The reason we build forecasts is to provide an evidential basis for decision-making that minimises risk. It is therefore a crazy idea that major investment decisions can be made on a single forecast – it is like putting your entire superannuation on black at the roulette wheel. The first step in reducing risk in prediction and forecasting is to try to understand (or imagine) the range of unknowns that may occur. For example, we know that financial crises, bushfires and floods all occur, and we have some idea of how extreme they might be. We even have a pretty good idea of their probability of occurrence; we just don't know when they will occur. In terms of energy forecasting, we know certain disrupters are unfolding now, such as weather variability and distributed generation, but we don't quite know how fast or to what extent. Known knowns and known unknowns.
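The "known probability, unknown timing" point can be sketched with a small simulation (the one-in-twenty-year annual probability and ten-year horizon are illustrative assumptions, not real figures): even without knowing which year an extreme event lands in, we can estimate how likely it is that at least one falls inside a planning horizon.

```python
import random

# Sketch: an extreme event (flood, bushfire, financial crisis) with a
# known annual probability but unknown timing. Figures are illustrative.
ANNUAL_PROB = 0.05   # a "one-in-twenty-year" event
HORIZON = 10         # planning horizon in years
TRIALS = 100_000

random.seed(42)
hits = 0
for _ in range(TRIALS):
    # Did the event occur at least once somewhere in the horizon?
    if any(random.random() < ANNUAL_PROB for _ in range(HORIZON)):
        hits += 1

estimated = hits / TRIALS
analytic = 1 - (1 - ANNUAL_PROB) ** HORIZON  # closed-form check
print(f"P(at least one event in {HORIZON} years): "
      f"simulated {estimated:.3f}, analytic {analytic:.3f}")
```

Under these assumed numbers the chance of at least one such event in a decade is roughly 40 per cent: high enough that a plan which ignores it is not conservative, merely incomplete.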

While we don’t know what will happen in the future, we do have a pretty good understanding of how different populations will behave under certain conditions. The solution therefore is to simulate population behaviours under a range of scenarios to get an understanding of what might happen at the extremes of the forecast, rather than relying on the average or “most likely” forecast.

The answer, in my opinion, is multiple simulation. Instead of building one forecast or prediction, we build a range of models with different assumptions, different methodologies or (preferably) both. That way we can build a view of the range of forecasts and their associated risks. What I need to know is not what the weather is going to do, but whether I should plan a camping trip or not. Multiple simulated prediction gives us the tools we need to do what we as humans do best: make decisions based on complex and sometimes ambiguous information.
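A minimal sketch of the multiple-simulation idea (the scenario names, growth rates and volatilities are invented for illustration): run many simulated paths under several sets of assumptions and report the spread of outcomes, which is what a risk-aware decision actually needs, rather than a single point forecast.

```python
import random
import statistics

# Sketch of "multiple simulation": many forecasts under different
# assumptions, looking at the spread rather than one point estimate.
# Scenario parameters are illustrative assumptions only.
SCENARIOS = {
    "business as usual":  (0.02, 0.01),   # (mean annual growth, volatility)
    "rapid solar uptake": (-0.01, 0.02),
    "electrification":    (0.05, 0.03),
}

def simulate(start, years, growth, volatility, rng):
    """One simulated demand path under a given scenario."""
    level = start
    for _ in range(years):
        level *= 1 + rng.gauss(growth, volatility)
    return level

rng = random.Random(1)
outcomes = [
    simulate(100.0, 10, growth, vol, rng)
    for growth, vol in SCENARIOS.values()
    for _ in range(1000)
]

low, high = min(outcomes), max(outcomes)
print(f"median outcome: {statistics.median(outcomes):.1f}")
print(f"range across scenarios: {low:.1f} to {high:.1f}")
```

The decision-relevant output is the last line: the envelope of plausible futures across assumptions and random variation, not any single path within it.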