3 Common Misconceptions about Prediction and Forecasting (and what we can do about them)

1. Prediction is about understanding the future

We humans have a lot of difficulty understanding the subtleties of time. It is important to remember how little we intuitively understand about the nature of time when building or interpreting forecasts and predictive models. Whilst I have built models that, for example, predict a customer’s propensity to churn reasonably well, weather forecasts for a given locality can at best predict only a few days into the future, and this is the best we can do even with the most powerful predictive models ever built. The difference is that the former “predicts” human behaviour whereas the latter tries to peer into the future of a complex stochastic system. Predictive modelling works best when trying to predict human behaviour because it is a human invention bounded by human experience. Modelling does not predict a priori. I prefer to think of predictive modelling as projected behavioural modelling; “prediction” sounds too good to be true.

Traditional forecasting tries to project past trends and asks: if recent past conditions prevail, what will the future look like? This is a fundamental misunderstanding of the nature of time. We have seen this break down significantly in recent times with energy consumption forecasting. There has been a range of significant disrupters, such as the global financial crisis, new appliance technology, distributed generation and changing housing standards, to name a few. Some of these things were foreseeable and others were not, but none of them appear in the past record of energy consumption, which is the prerequisite for a traditional forecast model.

2. My forecast is correct therefore my assumptions are correct

Just because a given forecast comes to pass does not mean that the model is without flaw. I am reminded of both Donald Rumsfeld and Edward Lorenz in debunking this. Lorenz discovered patterns that are locally stable and may replicate themselves for a period of time, but are guaranteed not to do so indefinitely. This is at the heart of chaos theory and every good modeller should understand it. The conditions which cause patterns to break down are sometimes what Rumsfeld refers to as unknown unknowns. There’s not much we can do about those except to try to imagine them or else be agile enough to recognise them once they start to unfold. But there are also “known unknowns”: those things which we know we don’t know.
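To make Lorenz’s point concrete, here is a minimal sketch of two runs of the Lorenz equations whose starting points differ by a tiny nudge. The equations and the parameter values (sigma = 10, rho = 28, beta = 8/3) are the standard textbook ones; the starting point, the step size, the size of the nudge and the crude forward-Euler integration are my assumptions for illustration only.

```python
import math

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz system by one forward-Euler step (crude but simple)."""
    x, y, z = state
    return (
        x + dt * sigma * (y - x),
        y + dt * (x * (rho - z) - y),
        z + dt * (x * y - beta * z),
    )

def divergence(steps=5000, nudge=1e-8):
    """Distance between two runs whose starting points differ by `nudge`."""
    a = (0.0, 1.0, 1.05)               # arbitrary starting point (assumption)
    b = (a[0] + nudge, a[1], a[2])     # the same point, nudged very slightly
    gaps = []
    for _ in range(steps):
        gaps.append(math.dist(a, b))   # record the gap, then step both runs
        a, b = lorenz_step(a), lorenz_step(b)
    return gaps

gaps = divergence()
for step in (0, 500, 1000, 2000, 3000, 4000):
    print(f"t={step * 0.01:5.1f}  gap={gaps[step]:.2e}")
```

The two runs track each other closely at first, which is exactly when a forecast looks “correct”, and then part company entirely. Agreement over a short window says very little about whether the assumptions behind the model were right.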

3. My forecast was correct given the data we had at the time

My golden rule is that all forecasts are wrong – they are just wrong in different ways. Sometimes the biggest problem is when a forecast or prediction comes to pass. If a forecast comes to pass, we cannot know to what extent that success was due to the efficacy of the model. I am reminded of Tarot readings. It is easiest to convince someone of a prediction when it confirms the observer’s own bias, and none of us is without bias. And there is always a get-out clause if the model does not continue to predict well. This is relatively harmless if the prediction is about the likelihood of meeting a tall, handsome stranger, but more significant if it is a prediction about network energy consumption. In the case of the latter it’s not good enough to say that was the best we could do at the time.

So what can we do about it?

The reason we build forecasts is to provide an evidential basis for decision-making that minimises risk. It is therefore a crazy idea that major investment decisions can be made on a single forecast. It is like putting your entire superannuation on black on the roulette wheel. The first step in reducing risk in prediction and forecasting is to try to understand (or imagine) the range of unknowns that may occur. For example, we know that financial crises, bushfires and floods all occur and we have some idea of how extreme they might be. We even have a pretty good idea of their probability of occurrence. We just don’t know when they will occur. In terms of energy forecasting, we know certain disrupters are unfolding now, such as weather variability and distributed generation, but we don’t quite know how fast and to what extent. These are our known knowns and known unknowns.

While we don’t know what will happen in the future, we do have a pretty good understanding of how different populations will behave under certain conditions. The solution therefore is to simulate population behaviours under a range of scenarios to get an understanding of what might happen at the extremes of the forecast, rather than relying on the average or “most likely” forecast.

The answer, in my opinion, is multiple simulation. Instead of building one forecast or prediction, we build a range of models with different assumptions, different methodologies or (preferably) both. That way we can build a view of the range of forecasts and associated risks. What I need to know is not what the weather is going to do, but whether I should plan a camping trip or not. Multiple simulated prediction gives us the tools we need to do what we as humans do best – make decisions based on complex and sometimes ambiguous information.
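As a rough illustration of what multiple simulation might look like, the sketch below runs a toy demand model many times under three sets of assumptions and reports the spread of outcomes rather than a single number. Everything in it is hypothetical: the scenario names, the growth rates, the shock probabilities and sizes, and the model itself (compound growth with occasional one-off shocks) are invented for the example and are not drawn from any real forecast.

```python
import random
import statistics

def simulate_demand(base, growth, shock_prob, shock_size, years, rng):
    """One simulated demand path: steady growth with occasional one-off shocks."""
    level = base
    for _ in range(years):
        level *= 1 + growth
        # A disrupter (say, a financial crisis) arrives at random in any year.
        if rng.random() < shock_prob:
            level *= 1 - shock_size
    return level

# Hypothetical scenarios; each one encodes a different set of assumptions.
SCENARIOS = {
    "business as usual": dict(growth=0.02, shock_prob=0.00, shock_size=0.00),
    "occasional crisis": dict(growth=0.02, shock_prob=0.10, shock_size=0.08),
    "rapid rooftop solar": dict(growth=-0.01, shock_prob=0.05, shock_size=0.05),
}

def run_ensemble(base=100.0, years=10, runs=2000, seed=42):
    """Simulate every scenario many times and summarise the range of outcomes."""
    rng = random.Random(seed)
    summary = {}
    for name, params in SCENARIOS.items():
        outcomes = sorted(
            simulate_demand(base, years=years, rng=rng, **params)
            for _ in range(runs)
        )
        summary[name] = {
            "p05": outcomes[int(0.05 * runs)],   # pessimistic end of the range
            "median": statistics.median(outcomes),
            "p95": outcomes[int(0.95 * runs)],   # optimistic end of the range
        }
    return summary

if __name__ == "__main__":
    for name, stats in run_ensemble().items():
        print(f"{name:20s} p05={stats['p05']:6.1f} "
              f"median={stats['median']:6.1f} p95={stats['p95']:6.1f}")
```

The point of the output is the gap between the 5th and 95th percentiles within a scenario, and the gap between scenarios. An investment decision that only makes sense under one of them is a bet, not a plan.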

Energy consumption, customer value and retail strategy

I am sometimes surprised at the amount of effort that goes into marketing electricity. I can’t help but feel that a lot of customer strategy is over-engineered. So here I present a fairly straightforward approach that acknowledges that energy is a highly commoditised product. This post departs a little from the big themes of this blog but is still relevant because the data available from smart meters makes executing on an energy retail strategy a much more interesting proposition (although still a challenging data problem).

To start with let’s look at the distribution of energy consumers by consumption. This should be a familiar distribution shape to those in the know:

[Figure: Energy Consumption Distribution]

In effect what we have are two distributions overlaid: a normal distribution to the left overlaps with a Pareto distribution to the right. This first observation tells us that we have two discrete populations, each with its own rules governing the distribution of energy consumption. A normal distribution is a signature of human population characteristics and as such identifies what is commonly termed the electricity “mass market”, essentially dominated by domestic households. The Pareto distribution to the right is typical of an interdependent network such as a stock market where a stock’s value, for example, is not independent of the value of other stocks. This is also similar to what we see when we look at the distribution of business sizes.
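For readers who want to see that shape for themselves, here is a small sketch that draws a synthetic customer base from the two overlapping distributions. All of the numbers are assumptions chosen only to produce a plausible-looking picture: the split between households and businesses, the household mean and spread, the Pareto shape parameter and the minimum business consumption.

```python
import random
import statistics

def synthetic_consumption(n_households=9000, n_businesses=1000, seed=1):
    """Toy customer base: normally distributed households, Pareto-tailed businesses."""
    rng = random.Random(seed)
    # Households: roughly symmetric around a typical annual usage
    # (assumed mean 6 MWh, standard deviation 1.5 MWh, floored just above zero).
    households = [max(0.1, rng.gauss(6.0, 1.5)) for _ in range(n_households)]
    # Businesses: a heavy Pareto tail starting at an assumed minimum of 10 MWh.
    businesses = [10.0 * rng.paretovariate(1.2) for _ in range(n_businesses)]
    return households, businesses

households, businesses = synthetic_consumption()
everyone = sorted(households + businesses)
top_count = len(everyone) // 100                 # the largest 1% of customers
top_share = sum(everyone[-top_count:]) / sum(everyone)

print(f"household mean={statistics.mean(households):.1f} MWh, "
      f"median={statistics.median(households):.1f} MWh")
print(f"business mean={statistics.mean(businesses):.1f} MWh, "
      f"median={statistics.median(businesses):.1f} MWh")
print(f"share of consumption held by the top 1% of customers: {top_share:.0%}")
```

For the households the mean and median sit almost on top of each other, whereas for the businesses the mean is dragged well above the median by a handful of very large consumers. That asymmetry is the long tail discussed in the rest of this post.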

A quick look at the distribution of electricity consumption allows us to define two broad groups and, because consumption is effectively a proxy for revenue, gives us a valuable measure for understanding customer value.

In our Pareto distribution we have a long tail of an ever-decreasing number of customers with increasingly large consumption (and therefore contribution to revenue). To the left we have the largest number of customers but relatively low value (although mostly better than the customers at the top end of the normal distribution) and to the right a very few “mega-value” customers. We can therefore roughly define three “super-segments” as follows:

[Figure: Energy Consumption Super Segments]
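In code, the three super-segments amount to nothing more than two cut-offs on annual consumption. The sketch below is illustrative only: the two threshold values and the sample customers are made up, and in practice the boundaries would be read off the observed distribution rather than hard-coded.

```python
from collections import defaultdict

# Hypothetical cut-offs in MWh per year (assumptions, not industry figures).
MASS_MARKET_MAX = 20.0
VLC_MIN = 1000.0

def super_segment(annual_mwh):
    """Assign a customer to one of the three super-segments by consumption."""
    if annual_mwh < MASS_MARKET_MAX:
        return "mass market"
    if annual_mwh < VLC_MIN:
        return "middle"
    return "VLC"

def segment_summary(consumption):
    """Customer count and share of total consumption for each segment."""
    counts = defaultdict(int)
    volume = defaultdict(float)
    for mwh in consumption:
        segment = super_segment(mwh)
        counts[segment] += 1
        volume[segment] += mwh
    total = sum(volume.values())
    return {s: (counts[s], volume[s] / total) for s in counts}

# Made-up annual consumption figures for ten customers.
customers = [4.2, 5.8, 6.1, 7.5, 9.0, 35.0, 120.0, 480.0, 2500.0, 12000.0]
for segment, (count, share) in segment_summary(customers).items():
    print(f"{segment:12s} customers={count:2d}  share of consumption={share:.1%}")
```

Because consumption stands in for revenue, the share column is a first approximation of where the money is. Even in this toy list, half of the customers sit in the mass market while almost all of the consumption sits with the two largest.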

With VLC on the right, revenue is king. Losing just a few of these customers will impact overall revenue, so the strategy here is to retain at all costs. At the extreme right, for example, individual relationship management is a good idea, as is bespoke product design and pricing. Towards the lower end of this segment a better option may be relationship managers with portfolios of customers. But the overriding rule is 1:1 management where possible.

The middle segment is interesting in that both revenue and margin are important. Getting the balance right between these two measures is very important and the strategy depends on whether your organisation is in a growth or retain phase. If I were a new market entrant, this is where I would be investing a lot of my energy. This is the segment of the market where some small wins could build a revenue base with good returns relatively quickly (assuming that the VLC market remains fairly stable) while avoiding the risks inherent in the mass market. On the flip side, if I were a mature player then I would be keeping a careful eye on retention rates and making sure I had the mechanisms to fine-tune the customer value proposition. An example might be offering “value-add” services which become possible with advanced metering infrastructure, such as online tools which allow business owners to track productivity via portal access to real-time energy data, or the ability to upload their own business data to be merged and visualised with energy consumption data.

The mass market is really the focus of most retailers because success metrics often focus too heavily on customer numbers rather than revenue and margin, probably because customer numbers are easier to measure. The trap is that the profitability of these customers is highly variable, as described by the four drivers of customer lifetime value:

[Figure: Customer Lifetime Value Drivers]

Understanding these drivers and developing an understanding of customer lifetime value is critical to developing tailored engagement strategies in this segment. Because these customers are the easiest to acquire, a strategy based around margin means that less profitable customers will be left for competitors to acquire. If those competitors are still focussed on customer counts as their measure of success, then they will happily acquire unprofitable customers, which in time will increase pressure to acquire even more because of falling margins. Thus the virtuous circle above is replaced with a vicious cycle (thanks to David McCloskey for that epithet).

And so there we have the beginnings of a data-driven customer strategy. There is of course much more to segmentation than this, and there are now very advanced methodologies for producing granular segmentation to help execute on customer strategy and provide competitive advantage. I’ll touch on these in future posts. But this is a good start.