Critical Modeling Tradeoffs We Ignore at Our Peril

Having been present at the birth of VisiCalc and dawn of the electronic spreadsheet age, I’ve spent a lot of years working with many different modeling software programs. However, it wasn’t until I started to read research papers published by Dr. Francois Hemez from Los Alamos National Laboratory that I really understood the tradeoffs we face when creating and using quantitative models.

Hemez writes about the challenge of using models to simulate complex physical phenomena, including the effects of nuclear weapons. His critical insight is that there are inescapable tradeoffs between three metrics that are often used to judge the quality of a model.

The first is the extent to which a model can reproduce historical data.

The second metric is the extent to which a model's predictive accuracy is robust to different types of uncertainty. These include uncertainty about the correct structure of the model itself (e.g., the variables to include and the relationships between them); about how best to represent the range of possible values for model variables; and about the potential impact of irreducible sources of randomness.

The third metric is what Hemez calls a model’s “predictability”. This is not the same as accuracy in reproducing historical data. Rather, it is the extent to which predictions are consistent from a group of models that are roughly equal in their ability to accurately reproduce the past and their robustness to uncertainty.
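To make the idea of "predictability" concrete, here is a minimal sketch in Python. It is my own illustration, not Hemez's method: the five data points are invented, and the two model structures (a straight-line trend and a constant growth rate) are hypothetical stand-ins for any group of models that fit the past about equally well.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(actual, fitted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))

# Invented history for illustration only.
periods = [0, 1, 2, 3, 4]
history = [100.0, 104.0, 108.0, 113.0, 117.0]

# Model A (assumption): the series grows by a fixed amount each period.
lin_slope, lin_icpt = fit_line(periods, history)
def linear_model(t):
    return lin_icpt + lin_slope * t

# Model B (assumption): the series grows by a fixed percentage each period,
# fitted as a straight line through the logarithm of the history.
log_slope, log_icpt = fit_line(periods, [math.log(y) for y in history])
def growth_model(t):
    return math.exp(log_icpt + log_slope * t)

rmse_linear = rmse(history, [linear_model(t) for t in periods])
rmse_growth = rmse(history, [growth_model(t) for t in periods])

horizon = 14  # ten periods beyond the last observation
forecast_linear = linear_model(horizon)
forecast_growth = growth_model(horizon)
spread = abs(forecast_growth - forecast_linear)

print(f"fit error, linear model: {rmse_linear:.2f}")
print(f"fit error, growth model: {rmse_growth:.2f}")
print(f"forecasts at period {horizon}: {forecast_linear:.1f} vs {forecast_growth:.1f}")
print(f"disagreement between models: {spread:.1f}")
```

Both toy models reproduce the history almost equally well, yet their forecasts drift apart as the horizon lengthens. The size of that disagreement is one rough way to think about low predictability.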

A model built to excel at reproducing past results is unlikely to be robust to uncertainty, and will likely fail to provide equally accurate predictions in complex socio-technical systems (like markets or economies) that, unlike physical systems, are constantly evolving.

Similarly, increasing robustness to uncertainty comes at the cost of less consistency in predictions about the future. The wider the range of uncertainties you include regarding model structure and the value of model variables, the wider will be the range of forecast outcomes the model produces.
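The link between input uncertainty and forecast range can be seen in a toy example. The sketch below is only an illustration under assumed numbers: a hypothetical five-year revenue forecast driven by a single uncertain growth rate, where `input_sd` is the amount of uncertainty we choose to admit.

```python
import random
import statistics

def forecast_spread(input_sd, runs=5000, seed=1):
    """Width of the 5th-to-95th percentile band of a toy five-year forecast."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(runs):
        growth = rng.gauss(0.03, input_sd)            # uncertain annual growth rate
        outcomes.append(100.0 * (1.0 + growth) ** 5)  # revenue after five years
    cuts = statistics.quantiles(outcomes, n=20)       # cut points at 5%, 10%, ..., 95%
    return cuts[-1] - cuts[0]

narrow = forecast_spread(0.01)  # little uncertainty admitted
wide = forecast_spread(0.05)    # much more uncertainty admitted

print(f"forecast band with sd = 0.01: {narrow:.1f}")
print(f"forecast band with sd = 0.05: {wide:.1f}")
```

Admitting more uncertainty about the growth rate makes the model more honest about what it does not know, and the price is a wider band of forecast outcomes.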

What are the practical implications of Hemez’ work for risk executives and other business leaders?

First, in complex socio-technical systems, estimating a model’s ability to accurately predict the future on the basis of its ability to replicate the past will likely lead to increasing errors as the forecast time horizon lengthens.

A better approach is to ask whether the results observed in the past are within the range of possible outcomes forecast by a model.
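One simple way to run that check is sketched below. It assumes you already have a set of simulated outcomes from your model; the function names, the 90% band, and the simulated numbers are all hypothetical choices made for illustration.

```python
import random
import statistics

def forecast_interval(simulated, coverage=0.90):
    """Central interval that contains `coverage` of the simulated outcomes."""
    cuts = statistics.quantiles(simulated, n=100)  # 99 percentile cut points
    tail = round((1.0 - coverage) / 2.0 * 100)     # e.g. 5 for a 90% band
    return cuts[tail - 1], cuts[99 - tail]

def within_forecast_range(simulated, observed, coverage=0.90):
    """True if the observed result lies inside the model's forecast band."""
    low, high = forecast_interval(simulated, coverage)
    return low <= observed <= high

# Stand-in for real model output: 5,000 simulated outcomes (invented numbers).
rng = random.Random(3)
simulated = [rng.gauss(100.0, 15.0) for _ in range(5000)]

plausible = within_forecast_range(simulated, 110.0)   # a past result near the centre
surprising = within_forecast_range(simulated, 160.0)  # a past result far in the tail

print(f"observed 110 inside forecast band: {plausible}")
print(f"observed 160 inside forecast band: {surprising}")
```

If many past results fall outside the band, that suggests the model is understating uncertainty, however closely its central estimate tracks history.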

Second, using single inputs for model variables is a recipe for trouble when attempting to forecast the future outputs of a complex socio-technical system. Moreover, the traditional approach of using three model runs (representing the best, worst, and most likely cases) is unlikely to significantly improve forecast accuracy because in socio-technical systems variables tend not to all simultaneously take their best, worst, and most likely values.
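A rough sketch shows how rare the "everything goes wrong at once" case can be. It assumes, purely for illustration, three independent variables and defines "worst" as the bottom 10% of each one's range; real variables are often correlated, which would change the number.

```python
import random

def share_of_all_worst_runs(n_variables=3, tail=0.10, runs=200_000, seed=7):
    """Fraction of simulated runs in which every variable lands in its worst tail."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        # Each variable independently falls in its worst tail with probability `tail`.
        if all(rng.random() < tail for _ in range(n_variables)):
            hits += 1
    return hits / runs

share = share_of_all_worst_runs()
print(f"share of runs with all three variables in their worst 10%: {share:.4%}")
```

Under these assumptions the answer is about one run in a thousand (0.1 × 0.1 × 0.1), so a "worst case" built by stacking every bad value describes an outcome far more extreme than the label suggests.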

A better approach is to use Monte Carlo add-ins for a typical spreadsheet model that describe possible values for key variables and their relationships to each other as statistical distributions. Running such models multiple times produces statistical distributions for the forecast outcomes, which enables executives to better understand how the best and worst outcomes could arise. Yet this approach still neglects the evolution over time of key relationships between variables (and sometimes the addition of new variables) within the socio-technical system being modeled. To capture these dynamics, an approach would have to either add multiple model structures or enable model evolution over time. A good example of this is the "ensemble" modeling approach utilized in weather forecasting, in which, for example, the UK Met Office, European, Canadian, and US Weather Service models are all run to produce a prediction.
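For readers who have not seen what a Monte Carlo add-in does under the hood, the sketch below imitates the idea in plain Python. Everything in it is an assumption made for illustration: the profit formula, the distributions, the correlation of -0.6 between price and volume, and the fixed cost figure.

```python
import math
import random
import statistics

def simulate_profit(runs=10_000, rho=-0.6, seed=42):
    """Simulate a toy profit model with correlated price and volume."""
    rng = random.Random(seed)
    profits = []
    for _ in range(runs):
        z_price = rng.gauss(0.0, 1.0)
        # Build a second standard normal draw whose correlation with the first is rho.
        z_volume = rho * z_price + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        price = 20.0 + 2.0 * z_price                  # hypothetical: mean 20, sd 2
        volume = 1000.0 + 150.0 * z_volume            # hypothetical: mean 1000, sd 150
        unit_cost = rng.triangular(10.0, 16.0, 12.0)  # low, high, most likely
        profits.append(volume * (price - unit_cost) - 5000.0)  # 5000 = fixed costs
    return profits

profits = simulate_profit()
cuts = statistics.quantiles(profits, n=20)  # cut points at 5%, 10%, ..., 95%
median = statistics.median(profits)
loss_share = sum(p < 0 for p in profits) / len(profits)

print(f"5th percentile profit:  {cuts[0]:,.0f}")
print(f"median profit:          {median:,.0f}")
print(f"95th percentile profit: {cuts[-1]:,.0f}")
print(f"share of runs with a loss: {loss_share:.1%}")
```

Note that this sketch has a single, fixed model structure. It does not address the point about relationships evolving over time; for that you would need to run several structurally different models and compare them, as the weather forecasters do.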

Finally, confidence in prediction is increased when a number of fundamentally different modeling methodologies are used to forecast outcomes of interest in complex socio-technical systems – for example, Monte Carlo spreadsheet models, system dynamics models (e.g., built with Analytica, STELLA, or VenSim software), and agent-based models (e.g., EAS, NetLogo, or SWARM software).

In sum, there are very practical ways we can apply Francois Hemez' insights about the tradeoffs we face when modeling the behavior of complex socio-technical systems. But we first have to recognize that they exist.