The BIN Model of Forecasting Errors and Its Implications for Improving Predictive Accuracy

At Britten Coyne Partners, Index Investor, and the Strategic Risk Institute, we all share a common mission: To help clients avoid failure by better anticipating, more accurately assessing, and adapting in time to emerging strategic threats.

Improving clients’ (and our own) forecasting process to increase predictive accuracy is critical to our mission and our research in this area is ongoing. In this note, we’ll review a very interesting new paper that has the potential to significantly improve forecast accuracy.

In “Bias, Information, Noise: The BIN Model of Forecasting”, Satopaa, Salikhov, Tetlock, and Mellers introduce a new approach to rigorously assessing the impact of three root causes of forecasting errors. In so doing, they create the opportunity for individuals and organizations to take more carefully targeted actions to improve the predictive accuracy of their forecasts.

In the paper, Satopaa et al decompose forecast errors into three parts, based on the impact of bias, partial information, and noise. They assume that, “forecasters sample and interpret signals with varying skill and thoroughness. They may sample relevant signals (increasing partial information) or irrelevant signals (creating noise). Furthermore, they may center the signals incorrectly (creating bias).”

Let’s start by taking a closer look at each root cause.

WHAT IS BIAS?

A bias is a systematic error that reduces forecast accuracy in a predictable way. Researchers have extensively studied biases and many of them have become well known. Here are some examples:

Over-Optimism: Tali Sharot’s research has shown how humans have a natural bias towards optimism. We are much more prone to updating our beliefs when a new piece of information is positive (i.e., better than expected in light of our goals) rather than negative (“How Unrealistic Optimism is Maintained in the Face of Reality”).

Confirmation: We tend to seek, pay more attention to, and place more weight on information that supports our current beliefs than information that is inconsistent with or contradicts them (this is also known as “my-side” bias).

Social Ranking: Another bias that is deeply rooted in our evolutionary past is the predictable impact of competition for status within a group. Researchers have found that when the result of a decision will be private (not observed by others), we tend to be risk averse. But when the result will be observed, we tend to be risk seeking (e.g., “Interdependent Utilities: How Social Ranking Affects Choice Behavior” by Bault et al).

Social Conformity: Another evolutionary instinct comes into play when uncertainty is high. Under this condition, we are much more likely to rely on social learning and copying the behavior of other group members, and to put less emphasis on any private information we have that is inconsistent with or contradicts the group’s dominant view. The evolutionary basis for this heightened conformity is clear – you don’t want to be cast out of your group when uncertainty is high.

Overconfidence/Uncertainty Neglect: The over-optimism, confirmation, social ranking, and social conformity biases all contribute to forecasters’ systematic neglect of uncertainty when we make and communicate forecasts. We described this bias (and how to overcome it) in much more detail in our February blog post, “How to Effectively Communicate Forecast Probability and Analytic Confidence”.

Surprise Neglect: While less well known than many other biases, this one is, in our experience, one of the most important. Surprise is a critically important feeling that is triggered when our conscious or unconscious attention is attracted to something that violates our expectations of how the world should behave, given our mental model of the phenomenon in question (e.g., stock market valuations increasing when macroeconomic conditions appear to be getting worse). From an evolutionary perspective, surprise helps humans to survive by forcing them to revise their mental models in order to more accurately perceive the world – especially its unexpected dangers and opportunities. Unfortunately, the feeling of surprise is often fleeting.

As Daniel Kahneman noted in his book, “Thinking Fast and Slow”, when confronted with surprise, our automatic, subconscious reasoning system (“System 1”) will quickly attempt to adjust our beliefs to eliminate the feeling. It is only when the adjustment is too big that our conscious reasoning system (“System 2”) is triggered to examine what caused us to feel surprised – but even then, System 1 will still keep trying to make the feeling of surprise disappear. Surprise neglect is one of the most underappreciated reasons that inaccurate mental models tend to persist.

WHAT IS PARTIAL INFORMATION?

In the context of the BIN Model, “information” refers to the extent to which we have complete (and accurate) information about the process generating the future results we seek to forecast.

For example, when a fair coin is flipped four times, we have complete information that enables us to predict the probability of different outcomes with complete confidence. This is the realm of decision in the face of risk.
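To make the coin example concrete, here is a minimal sketch that lists the probability of each possible number of heads in four flips. The function name and its parameters are our own illustrative choices; the only assumption is the standard binomial model for independent flips.

```python
from math import comb


def heads_distribution(n_flips: int, p_heads: float = 0.5) -> dict[int, float]:
    """Return P(k heads) for k = 0..n_flips under a binomial model."""
    return {
        k: comb(n_flips, k) * p_heads**k * (1 - p_heads) ** (n_flips - k)
        for k in range(n_flips + 1)
    }


# Four flips of a fair coin: every outcome probability is known in advance.
dist = heads_distribution(4)
for k, prob in dist.items():
    print(f"{k} heads: {prob:.4f}")
```

Because the process is fully known, nothing in this calculation has to be estimated. That is what distinguishes risk from uncertainty.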

When the complexity of the results generating process increases, we move from the realm of risk into the realm of uncertainty, in which we often do not fully understand the full range of possible outcomes, their probabilities, or their consequences.

Under these circumstances, forecasters have varying degrees of information about the process generating future results, and/or models of varying degrees of accuracy for interpreting the meaning of the information they have. Both contribute to forecast inaccuracy.

WHAT IS NOISE?

“Noise” consists of unsystematic, unpredictable, random errors that contribute to forecast inaccuracy. Kahneman defines it as “the chance variability of judgments.”

Sources of noise that are external to forecasters include randomness in the results generating process itself (and, as in the case of complex adaptive systems, the deliberate actions of the intelligent agents who comprise that process). Internal sources of noise include forecasters’ use of low- or no-value information about a results generating process, and/or their use of models of that process that are inaccurate, irrelevant, or that vary over time (often unconsciously).
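The difference between systematic error (bias) and random error (noise) can be illustrated with the textbook decomposition of mean squared error into squared bias plus variance. This sketch is not the BIN model's estimation procedure, which also accounts for information; the simulated forecaster and all of the numbers are hypothetical.

```python
import random
from statistics import fmean, pvariance


def decompose_errors(errors: list[float]) -> tuple[float, float, float]:
    """Split mean squared error into a systematic and a random part.

    Returns (mse, bias_squared, noise_variance), where
    mse == bias_squared + noise_variance up to floating-point rounding.
    """
    mse = fmean([e * e for e in errors])
    bias_squared = fmean(errors) ** 2
    noise_variance = pvariance(errors)
    return mse, bias_squared, noise_variance


# Hypothetical forecaster: every forecast runs 0.10 too high (bias),
# plus random scatter with a standard deviation of 0.20 (noise).
rng = random.Random(42)
errors = [0.10 + rng.gauss(0.0, 0.20) for _ in range(10_000)]

mse, bias_squared, noise_variance = decompose_errors(errors)
print(f"MSE={mse:.4f}  bias^2={bias_squared:.4f}  noise={noise_variance:.4f}")
```

In this made-up case the random scatter accounts for most of the squared error, even though the systematic part is the more visible of the two.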

After applying their analytical “BIN” framework to the results of the Good Judgment Project (a four-year geopolitical forecasting tournament described in the book “Superforecasting” in which one of us participated), Satopaa and his co-authors conclude that, “forecasters fall short of perfect forecasting [accuracy] due more to noise than bias or lack of information. Eliminating noise would reduce forecast errors … by roughly 50%; eliminating bias would yield a roughly 25% cut; [and] increasing information would account for the remaining 25%. In sum, from a variety of analytical angles, reducing noise is roughly twice as effective as reducing bias or increasing information.”

Moreover, the authors found that a variety of interventions used by the Good Judgment Project that were intended to reduce forecaster bias (such as training and teaming) actually had their biggest impact on reducing noise. They note that, “reducing bias may be harder than reducing noise due to the tenacious nature of certain cognitive biases” (a point also made by Kahneman in his Harvard Business Review article, “Noise: How to Overcome the High, Hidden Cost of Inconsistent Decision Making”).

The BIN model highlights three levers for improving forecast accuracy: Reducing Bias, Improving Information, and Reducing Noise. Let’s look at some effective techniques in each of these areas.

HOW TO REDUCE BIAS

The first point to make about bias reduction is that a considerable body of research has concluded that this is very difficult to do, for the simple reason that deep in our evolutionary past, what we negatively refer to as “biases” served a positive evolutionary purpose (e.g., overconfidence helped to attract mates).

That said, both our experience with Britten Coyne Partners’ clients and academic research have found that two techniques are often effective.

Reference/Base Rates and Shrinkage: Too often we behave as if the only information that matters is what we know about the question or results we are trying to forecast. We fail to take into account how things have turned out in similar cases in the past (this is known as the reference or base rate). So-called “shrinkage” methods start by identifying a relevant base rate for the forecast, then move on to developing a forecast based on the specific situation under consideration. The more similar the specific situation is to the ones used to calculate the base rate, the more the specific probability is “shrunk” towards the base rate probability.
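One simple way to picture shrinkage is as a weighted blend of the case-specific estimate and the base rate. The linear form below, and the idea of expressing similarity as a single number between 0 and 1, are our illustrative assumptions rather than a prescribed method.

```python
def shrink_toward_base_rate(
    case_estimate: float, base_rate: float, similarity: float
) -> float:
    """Blend a case-specific probability with a base rate.

    similarity is a judgment in [0, 1] (hypothetical scale):
    0 keeps the case-specific estimate, 1 adopts the base rate fully.
    """
    for name, value in (
        ("case_estimate", case_estimate),
        ("base_rate", base_rate),
        ("similarity", similarity),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return similarity * base_rate + (1.0 - similarity) * case_estimate


# Inside view says 80%, similar past cases succeeded 30% of the time,
# and we judge the current case to be fairly similar to them (0.6).
shrunk = shrink_toward_base_rate(case_estimate=0.80, base_rate=0.30, similarity=0.6)
print(f"Shrunk forecast: {shrunk:.2f}")
```

The hard part in practice is judging similarity and choosing the reference class. The arithmetic simply forces that judgment to be made explicitly.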

Pre-Mortem Analysis: Popularized by Gary Klein, in a pre-mortem a team is told to assume that it is some point in the future, and a forecast (or plan) has failed. They are told to anonymously write down the causes of the failure, including critical signals that were missed, and what could have been done differently to increase the probability of success. Pre-mortems reduce over-optimism and overconfidence, and produce two critical outputs: improvement in forecasts and plans, and the identification of critical uncertainties about which more information needs to be collected as the future unfolds (which improves signal quality). The power of pre-mortems comes from the fact that humans find it much easier to explain the past (and do so in much more detail) than to forecast the future – hence the importance of situating a group in the future, and asking them to explain a past that has yet to occur.

HOW TO INCREASE RELEVANT INFORMATION (SIGNAL)

Hyperconnectivity has unleashed upon us a daily flood of information with which many people are unable to cope when they have to make critical decisions (and forecasts) in the face of uncertainty. Two approaches can help.

Information Value Analysis: Bayes’ theorem provides a method for separating high value signals from the flood of noise that accompanies them. Let’s say that you have determined that three outcomes (A, B, and C) are possible. The value of a new piece of information (or related pieces of evidence) can be determined based on the likelihood you would observe it if Outcomes A, B, or C happen. If the information is much more likely to be observed in the case of just one outcome, it has high value. If it is equally likely under all three outcomes, it has no value. More difficult, but just as informative, is applying this same logic to the absence of a piece of information. This analysis (and the outcome probability estimates) should be repeated at regular intervals to assess newly arriving evidence.
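The three-outcome example above can be sketched as a direct application of Bayes' theorem. The priors and likelihoods here are hypothetical numbers chosen only to contrast a diagnostic piece of evidence with an uninformative one.

```python
def bayes_update(
    priors: dict[str, float], likelihoods: dict[str, float]
) -> dict[str, float]:
    """Return posterior outcome probabilities after seeing one piece of evidence.

    likelihoods[o] is the probability of observing the evidence if outcome o
    is the one that happens.
    """
    unnormalized = {o: priors[o] * likelihoods[o] for o in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is impossible under every outcome")
    return {o: value / total for o, value in unnormalized.items()}


priors = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}

# High value: much more likely to be observed if Outcome A happens.
diagnostic = bayes_update(priors, {"A": 0.60, "B": 0.10, "C": 0.10})

# No value: equally likely under all three outcomes.
uninformative = bayes_update(priors, {"A": 0.30, "B": 0.30, "C": 0.30})

print("diagnostic:   ", {o: round(p, 3) for o, p in diagnostic.items()})
print("uninformative:", {o: round(p, 3) for o, p in uninformative.items()})
```

The diagnostic evidence moves Outcome A from one third to three quarters, while the uninformative evidence leaves the estimates where they started. The same function can be rerun as each new piece of evidence arrives, using the previous posterior as the new prior.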

Assumptions Analysis: Probabilistic forecasts rest on a combination of (1) facts, (2) assumptions about critical uncertainties, (3) the evidence (of varying reliability and information value) supporting those assumptions, and (4) the logic used to reach the forecaster’s conclusion. In our forecasting work with clients over the years, we have found that discussing the assumptions made about critical uncertainties and, less frequently, the forecast logic itself generates very productive discussions and improves predictive accuracy.

In particular, Marvin Cohen’s approach has proved quite practical. His research found that the greater the number of assumptions about “known unknowns” (i.e., recognized uncertainties) that underlie a forecast, and the weaker the evidence that supports them, the lower confidence one should have in the forecast’s accuracy.

Also, the more assumptions about “known unknowns” that are used in a forecast, the more likely it is that potentially critical “unknown unknowns” remain to be discovered, which again should lower your confidence in the forecast (e.g., see, “Metarecognition in Time-Stressed Decision Making: Recognizing, Critiquing, and Correcting” by Cohen, Freeman, and Wolf).

HOW TO REDUCE NOISE

Combine and Extremize Forecasts: Research has found that three steps can improve forecast accuracy. The first is seeking forecasts based on different forecasting methodologies, or prepared by forecasters with significantly different backgrounds (as a proxy for different mental models and information). The second is combining those forecasts (using a simple average if few are included, or the median if many are). The final step, which significantly improved the performance of the Good Judgment Project team in the IARPA forecasting tournament, is to “extremize” the average (mean) or median forecast by moving it closer to 0% or 100%.

Averaging forecasts assumes that the differences between them are all due to noise. As the number of forecasts being combined increases, however, use of the median produces the greatest increase in accuracy, because it does not “average away” all information differences between forecasters. Even so, combining forecasts still leaves whatever bias is present in the median forecast.

Forecasts for binary events (e.g., the probability an event will or will not happen within a given time frame) are most useful to decision makers when they are closer to 0% or 100% rather than the uninformative “coin toss” estimate of a 50% probability. As described by Baron et al in “Two Reasons to Make Aggregated Probability Forecasts More Extreme”, individual forecasters will often shrink their probability estimates towards 50% to take into account their subjective belief about the extent of potentially useful information that they are missing.

For this reason, forecast accuracy can usually be increased when you employ a structured “extremizing” technique to move the mean or median probability estimate closer to 0% or 100%. Note that the extremizing factor should be lower when average forecaster expertise is higher. This is based on the assumption that a group of expert forecasters will incorporate more of the full amount of potentially useful information than will novice forecasters. (See: “Two Reasons to Make Aggregated Probability Forecasts More Extreme”, by Baron et al, and “Decomposing the Effects of Crowd-Wisdom Aggregators”, by Satopaa et al).
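The combine-then-extremize steps can be sketched as follows. The text above does not specify a formula, so the power transform used here, the exponent values, and the cutoff between “few” and “many” forecasts are all our illustrative assumptions.

```python
from statistics import fmean, median


def combine(probabilities: list[float], many_threshold: int = 10) -> float:
    """Use the simple average for a few forecasts and the median for many.

    many_threshold is a hypothetical cutoff, not a researched value.
    """
    if not probabilities:
        raise ValueError("need at least one forecast")
    if len(probabilities) >= many_threshold:
        return median(probabilities)
    return fmean(probabilities)


def extremize(p: float, a: float) -> float:
    """Push a probability away from 0.5 using p^a / (p^a + (1 - p)^a).

    a > 1 moves p toward 0 or 1; a == 1 leaves it unchanged.
    A lower a would suit a more expert group of forecasters.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if a <= 0:
        raise ValueError("a must be positive")
    numerator = p**a
    return numerator / (numerator + (1.0 - p) ** a)


forecasts = [0.60, 0.70, 0.65, 0.75]
combined = combine(forecasts)
extremized = extremize(combined, a=2.0)
print(f"combined={combined:.3f}  extremized={extremized:.3f}")
```

Note that a forecast of exactly 50% is left unchanged by this transform, and forecasts below 50% are pushed towards 0% rather than 100%.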

Use a Forecasting Algorithm: Use of an algorithm (whose structure can be inferred from top human forecasters’ performance) ensures that a forecast’s information inputs and their weighting are consistent over time. In some cases, this approach can be automated; in others, it involves having a group of forecasters (e.g., interviewers of new hire candidates) ask the same set of questions and use the same rating scale, which facilitates the consistent combination of their inputs. Kahneman has also found that testing these algorithmic conclusions against forecasters’ intuition, and then examining the underlying reasons for any disagreements between the results of the two methods can sometimes improve results.
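As a minimal sketch of the non-automated case, the interview example can be reduced to a fixed scoring rule: every interviewer rates the same criteria on the same scale, and the ratings are always combined with the same weights. The criteria, weights, and 1 to 5 scale below are hypothetical.

```python
from statistics import fmean

# Hypothetical criteria and weights; the point is that they never change
# from one candidate (or one forecast) to the next.
WEIGHTS = {"problem_solving": 0.5, "communication": 0.3, "domain_knowledge": 0.2}


def score_candidate(ratings_by_interviewer: list[dict[str, int]]) -> float:
    """Average each criterion across interviewers, then apply fixed weights."""
    if not ratings_by_interviewer:
        raise ValueError("need at least one interviewer's ratings")
    for ratings in ratings_by_interviewer:
        if set(ratings) != set(WEIGHTS):
            raise ValueError("every interviewer must rate every criterion")
        if any(not 1 <= r <= 5 for r in ratings.values()):
            raise ValueError("ratings must be on the 1 to 5 scale")
    return sum(
        weight * fmean([ratings[criterion] for ratings in ratings_by_interviewer])
        for criterion, weight in WEIGHTS.items()
    )


score = score_candidate(
    [
        {"problem_solving": 4, "communication": 3, "domain_knowledge": 5},
        {"problem_solving": 5, "communication": 3, "domain_knowledge": 4},
    ]
)
print(f"Candidate score: {score:.2f}")
```

The rule reduces noise because the same inputs always produce the same score, whatever the mood or recent experience of the people applying it.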

However, it is also critical to note that this algorithmic approach implicitly assumes that the underlying process generating the results being forecast is stable over time. In the case of complex adaptive systems (which are constantly evolving), this is not true.

Unfortunately, many of the critical forecasting challenges we face involve results produced by complex adaptive systems. For the foreseeable future, expert human forecasters will still be needed to meet them. By decomposing the root causes of forecasting error, the BIN model will help them to do that.

Complexity, Wicked Problems, and AI-Augmented Decision Making

Over the years, some of the most thought-provoking research we have read on the practical implications and applications of complex adaptive systems theory has come from people who have never received the recognition their thinking deserves. One is Dietrich Dorner and his team at Otto-Friedrich University in Bamberg, Germany (see his book, The Logic of Failure). Another is Anne-Marie Grisogono, who worked for years at Defense Science and Technology Australia and has recently left there for academia, at Flinders University in Adelaide, Australia.

Grisogono recently published “How Could Future AI Help Tackle Global Complex Problems?” It is a great synthesis of the challenges for decision makers posed by increasing complexity and how improving artificial intelligence technologies could one day help meet them.

She begins by noting that, “we can define intelligence as the ability to produce effective responses or courses of action that are solutions to complex problems—in other words, problems that are unlikely to be solved by random trial and error, and that therefore require the abilities to make finer and finer distinctions between more and more combinations of relevant factors and to process them so as to generate a good enough solution.”

Grisogono then links this definition of intelligence to the emergence and growth of complexity. “Obviously [finding good enough solutions] becomes more difficult as the number of possible choices increases, and as the number of relevant factors and the consequence pathways multiply. Thus complexity in the ecosystem environment generates selection pressure for effective adaptive responses to the [increasing] complexity.”

“One possible adaptive strategy is to find niches to specialize for, within which the complexity is reduced. The opposite strategy is to improve the ability to cope with the complexity by evolving increased intelligence at an individual level, or collective intelligence through various types of cooperative or mutualistic relationships. Either way, increased intelligence in one species will generally increase the complexity of the problems they pose for both other species in the shared ecosystem environment, and for their own conspecifics, driving yet further rounds of adaptations. Even when cooperative interactions evolve to deal with problems that are more complex than an individual can cope with, the shared benefits come with a further complexity cost”…

That said, “it is evident that human intelligence and ingenuity have led to immense progress in producing solutions for many of the pressing problems of past generations, such as higher living standards, longer life expectancy, better education and working conditions. But it is equally evident that the transformations they have wrought in human society and in the planetary environment include many harmful unintended consequences, and that the benefits themselves are not equitably distributed and have often masked unexpected downsides…

“This ratcheting dynamic of increasing intelligence and increasing complexity continues as long as two conditions are met: further increases in sensing and processing are sufficiently accessible to the evolutionary process, and the selection pressure is sufficient to drive it. Either condition can fail. Thus generally a plateau of dynamic equilibrium is reached. But it is also possible that under the right conditions, which we will return to below, the ratcheting of both complexity and intelligence may continue and accelerate.”

Grisogono then moves on to a fascinating and admirably succinct discussion of “what we have learned about the specific limitations that plague human decision-makers in complex problems. We can break this down into two parts: the aspects of complex problems that we find so difficult, and what it is about our brains that limits our ability to cope with those aspects.”

She begins by noting that, “Interdependence is a defining feature of complexity and has many challenging and interesting consequences. In particular, the network of interdependencies between different elements of the problem means that it cannot be successfully treated by dividing it into sub-problems that can be handled separately. Any attempt to do that creates more problems than it solves because of the interactions between the partial solutions…

“There is no natural boundary that completely isolates a complex problem from the context it is embedded in. There is always some traffic of information, resources, and agents in and out of the situation that can bring about unexpected changes, and therefore the context cannot be excluded from attention…

“Complex problems exist at multiple scales, with different agents, behaviors and properties at each, but with interactions between scales. This includes emergence, the appearance of complex structure and dynamics at larger scales as a result of smaller-scale phenomena, and its converse, top-down causation, whereby events or properties at a larger scale can alter what is happening at the smaller scales. In general, all the scales are important, and there is no single “right” scale at which to act…

“Interdependence implies multiple interacting causal and influence pathways leading to, and fanning out from, any event or property, so simple causality (one cause—one effect), or linear causal chains will not hold in general. Yet much of our cultural conditioning is predicated on a naïve view of linear causal chains, such as finding “the cause” of an effect, or “the person” to be held responsible for something, or “the cure” for a problem. Focusing on singular or primary causes makes it more difficult to intervene effectively in complex systems and produce desired outcomes without attendant undesired ones—so-called “side-effects” or unintended consequences…

“Furthermore, such networks of interactions between contributing factors can produce emergent behaviors which are not readily attributable or intuitively anticipatable or comprehensible, implying unknown risks and unrecognized opportunities” ...

“Many important aspects of complex problems are hidden, so there is inevitable uncertainty as to how the events and properties that are observable, are linked through causal and influence pathways, and therefore many hypotheses about them are possible. These cannot be easily distinguished based on the available evidence…”

As if complexity isn’t enough, “there are generally multiple interdependent goals in a complex problem, both positive and negative, poorly framed, often unrealistic or conflicted, vague or not explicitly stated, and stakeholders will often disagree on the weights to place on the different goals, or change their minds. Achieving sufficient high level goal clarity to develop concrete goals for action is in itself a complex problem…”

Grisogono then summarizes the cognitive abilities that are needed to successfully engage with complex problems.

“One immediate conclusion that can be drawn is that there is a massive requirement for cognitive bandwidth—not only to keep all the relevant aspects at all the relevant scales in mind as one seeks to understand the nature of the problem and what may be possible to do, but even more challenging, to incorporate appropriate non-linear dynamics as trajectories in time are explored…

“But there is a more fundamental problem that needs to be addressed first: how to acquire the necessary relevant information about the composition, structure and dynamics of the complex problem and its context at all the necessary scales, and revise and update it as it evolves. This requires a stance of continuous learning, i.e., simultaneous sensing, testing, learning and updating across all the dimensions and scales of the problem, and the ability to discover and access relevant sources of information. At their best, humans are okay at this, up to a point, but not at the sheer scale and tempo of what is required in real world complex problems which refuse to stand still while we catch up…

“To understand how all these factors interact to limit human competence in managing complex problems, and what opportunities might exist for mitigating them through advanced AI systems, we now review some key findings from relevant research.

“In particular we are interested in learning about the nature of human decision-making in the context of attempting to manage an ongoing situation which is sufficiently protracted and complex to defeat most, but not all, decision-makers.

“Drawing useful conclusions about the detailed decision-making behaviors that tend to either sow the seeds of later catastrophes, or build a basis for sustained success, calls for an extensive body of empirical data from many diverse human subjects making complex decisions in controllable and repeatable complex situations. Clearly this is a tall ask, so not surprisingly, the field is sparse.

"However, one such research program [led by Dietrich Dorner and his team], which has produced important insights about how successful and unsuccessful decision-making behaviors differ, stands out in having also addressed the underlying neurocognitive and affective processes that conspire to make it very difficult for human decision-makers to maintain the more successful behaviors, and to avoid falling into a vicious cycle of less effective behaviors.

“In brief, through years of experimentation with human subjects attempting to achieve complex goals in computer-based micro-worlds with complex underlying dynamics, the specific decision-making behaviors that differentiated a small minority of subjects who achieved acceptable outcomes in the longer term, from the majority who failed to do so, were identified. Results indicated that most subjects could score some quick wins early in the game, but as the unintended consequences of their actions developed and confronted them, and their attempts to deal with them created further problems, the performance of the overwhelming majority (90%) quickly deteriorated, pushing their micro-worlds into catastrophic or chronic failure.

“As would be expected, their detailed behaviors reproduced many well-documented findings about the cognitive traps posed by human heuristics and biases. Low ambiguity tolerance was found to be a significant factor in precipitating the behavior of prematurely jumping to conclusions about the problem and what was to be done about it, when faced with situational uncertainty, ambiguity and pressure to achieve high-level goals. The chosen (usually ineffective) course of action was then defended and persevered with through a combination of confirmation bias, commitment bias, and loss aversion, in spite of available contradictory evidence.

"The unfolding disaster was compounded by a number of other reasoning shortcomings such as difficulties in steering processes with long latencies and in projecting cumulative and non-linear processes. Overall they had poor situation understanding, were likely to focus on symptoms rather than causal factors, were prone to a number of dysfunctional behavior patterns, and attributed their failures to external causes rather than learning from them and taking responsibility for the outcomes they produced.

“By contrast, the remaining ten percent who eventually found ways to stabilize their micro-world, showed systematic differences in their decision-making behaviors and were able to counter the same innate tendencies by taking what amounts to an adaptive approach, developing a conceptual model of the situation, and a stratagem based on causal factors, seeking to learn from unexpected outcomes, and constantly challenging their own thinking and views. Most importantly, they displayed a higher degree of ambiguity tolerance than the unsuccessful majority.

These findings are particularly significant here because most of the individual human decision-making literature has concentrated on how complex decision-making fails, not on how it succeeds. However, insights from research into successful organizational decision-making in complex environments corroborate the importance of taking an adaptive approach.

“In summary, analysis of the effective decision behaviors offers important insights into what is needed, in both human capabilities and AI support, to deal with even higher levels of complexity beyond current human competence. There are two complementary aspects here—put simply: how to avoid pitfalls (what not to do), and how to adopt more successful approaches (what to do instead).

“It is not difficult to understand how the decision making behaviors associated with the majority contributed to their lack of success, nor how those of the rest enabled them to develop sufficient conceptual and practical understanding to manage and guide the situation to an acceptable regime. Indeed if the two lists of behaviors are presented to an audience, everyone can readily identify which list leads to successful outcomes and which leads to failure.

"Yet if those same individuals are placed in the micro-world hot seat, 90% of them will display the very behaviors they just identified as likely to be unsuccessful. This implies that the displayed behaviors are not the result of conscious rational choice, but are driven to some extent by unconscious processes...

“This observation informed development of a theoretical model [by Dorner and his team] incorporating both cognitive and neurophysiological processes to explain the observed data. In brief, the model postulates two basic psychological drives that are particularly relevant to complex decision making, a need for certainty and a need for competence. These are pictured metaphorically as tanks that can be topped up by signals of certainty (one’s expectations being met) and signals of competence (one’s actions producing desired outcomes), and drained by their opposites—surprises and unsuccessful actions.

“The difference between the current level and the set point of a tank creates a powerful unconscious need, stimulating some behavioral tendencies and suppressing others, and impacting on cognitive functions through stimulation of physiological stress. If both levels are sufficient the result is motivation to explore, reflect, seek information and take risky action if necessary—all necessary components of effective decision making behavior.

"But if the levels get too low the individual becomes anxious and is instead driven to flee, look for reassurance from others, seek only information that confirms his existing views so as to top up his dangerously low senses of certainty and competence, and deny or marginalize any tank draining contradictory information…

“The impacts of stress on cognitive functions reinforce these tendencies by reducing abilities to concentrate, sustain a course of action, and recall relevant knowledge. Individuals whose tanks are low therefore find it difficult to sustain the decision-making behaviors associated with success, and are likely to act in ways that generate further draining signals, digging themselves deeper into a vicious cycle of failure.

“We can now understand the 90:10 ratio, as the competing attractors are not symmetric—the vicious cycle of the less effective decision behaviors is self-reinforcing and robust, while the virtuous cycle of success is more fragile because one’s actions are not the sole determinant of outcomes in a complex situation, so even the best decision-makers will sometimes find their tanks getting depleted, and therefore have difficulty sustaining the more effective decision making behaviors.

“Further research has demonstrated that the more effective decision making behaviors are trainable to some extent, but because they entail changing meta-cognitive habits they require considerable practice, reinforcement and ongoing support.

"However, the scope for significant enhancement of unaided human complex decision making competence is limited—not only in the level of competence achievable, but also and more importantly, in the degree of complexity that can be managed. Meanwhile, the requirements for increased competence, and the inexorable rise in degree of complexity to be managed, continue to grow.”

In the remainder of the paper, Grisogono lays out the requirements for an AI system that could substantially improve our ability to make good decisions when confronted with complex, wicked problems. She concludes that current AI technology is far from what we need.

"Despite its successes, the best examples of AI are still very specialized applications that focus on well-defined domains, and that generally require a vast amount of training data to achieve their high performance. Such applications can certainly be components of an AI decision support system for managing very complex problems, but the factors [already] discussed imply that much more is needed: not just depth in narrow aspects, but breadth of scope by connecting the necessary components so as to create a virtual environment which is a sufficiently valid model of the problem and its context, and in which decision-makers can safely explore and test options for robustness and effectiveness, while being supported in maintaining effective decision making behaviors and resisting the less effective ones.”

Until AI-based decision support systems like this are developed, human beings’ batting average in successfully resolving the growing number of wicked problems we face is destined to remain low, and our few successes heavily dependent on a very small set of uniquely talented people who have a superior intuitive grasp of the nature and behavior of complex adaptive systems. In the short-term and medium-term, our critical challenge is how to increase their number.

How to Effectively Communicate Forecast Probability and Analytic Confidence

At Britten Coyne Partners, we have often observed that research on issues related to anticipating, assessing, and adapting in time to emergent strategic threats is poorly shared across the military, intelligence, academic, and practitioner communities. This post is another of our ongoing attempts to share key research findings across these silos.

David Mandel is a senior scientist at Defense Research and Development Canada, specializing in intelligence, influence, and collaboration issues. Based on our review of the research, we regard Mandel as a world leader in the effective communication of forecast probability and uncertainty, which is the subject of this post.

Background

Many analysts agree that, even before the COVID pandemic arrived, the world had entered a period of “unprecedented” or “radical” uncertainty and disruptive change.

In this environment, avoiding strategic failure in part depends on effectively meeting three forecasting challenges:

• Asking the right forecasting questions;
• Accurately estimating the probability of different outcomes; and
• Effectively communicating the degree and nature of the uncertainty associated with your forecast.

As we have noted in past posts on our Strategic Risk Blog, as well as in our Strategic Risk Governance and Management course, techniques to help forecasters ask the right questions have received moderate attention.

That said, some powerful methods have been developed, including scenario analysis (which I first encountered in 1984 when taking a course from Pierre Wack, who popularized it at Shell); prospective hindsight, such as Gary Klein’s pre-mortem method; and Robert Lempert and Steven Bankes’ exploratory ensemble modeling approach.

In contrast to the challenge of asking the right questions, much greater attention has been paid to the development of methods to help analysts accurately forecast the answers to them, particularly in the context of complex adaptive systems (which generate most of the uncertainty we confront today).

In addition to the extensive research on this challenge conducted by the intelligence and military communities, we have also recently seen many excellent academic and commercial works, including best-selling books like “Future Babble” by Dan Gardner, “Superforecasting” by Philip Tetlock and Dan Gardner, and “The Signal and the Noise” by Nate Silver.

Compared to the first two challenges, the critical issue of forecast uncertainty, and in particular how to effectively communicate it, has received far less attention.

Some authors have constructed taxonomies to describe the sources of forecast uncertainty (e.g., “Classifying and Communicating Uncertainties in Model-Based Policy Analysis”, by Kwakkel, Walker, and Marchau).

Other analysts have attempted to estimate the likely extent of forecast uncertainty in complex adaptive systems.

For example, in “The Prevalence Of Chaotic Dynamics In Games With Many Players”, Sanders et al find that in games where players can take many possible actions in every period in pursuit of their long-term goals (which may differ), system behavior quickly becomes chaotic and unpredictable as the number of players increases. The authors conclude that, “complex non-equilibrium behavior, exemplified by chaos, may be the norm for complicated games with many players.”

In “Prediction and Explanation in Social Systems”, Hoffman et al also analyze the limits to predictability in complex adaptive social systems.

They observe, “How predictable is human behavior? There is no single answer to this question because human behavior spans the gamut from highly regular to wildly unpredictable. At one extreme, a study of 50,000 mobile phone users found that in any given hour, users were in their most visited location 70% of the time; thus, on average, one could achieve 70% prediction accuracy with the simple heuristic, ‘Jane will be at her usual spot today’.”

“At the other extreme, so-called ‘black swan’ events are thought to be intrinsically impossible to predict in any meaningful sense. Last, for outcomes of intermediate predictability, such as presidential elections, stock market movements, and feature films’ revenues, the difficulty of prediction can vary tremendously with the details of the task.”

The authors note that, “the more that outcomes are determined by extrinsic random factors, the lower the theoretical best performance that can be attained by any method.”

In “Exploring Limits to Prediction in Complex Social Systems”, Martin et al also address the question, “How predictable is success in complex social systems?” To analyze it, they evaluate the ability of multiple methodologies to predict the size and duration of Twitter cascades.

The authors conclude that, “Despite an unprecedented volume of information about users, content, and past performance, our best performing models can explain less than half of the variance in cascade sizes … This result suggests that even with unlimited data predictive performance would be bounded well below deterministic accuracy.”

“Although higher predictive power [than what we achieved] is possible in theory, such performance requires a homogeneous system and perfect ex-ante knowledge of it: even a small degree of uncertainty … leads to substantially more restrictive bounds on predictability … We conclude that such bounds [on predictability] for other complex social systems for which data are more difficult to obtain are likely even lower.”

In sum, forecasts of future outcomes produced by complex adaptive systems (e.g., the economy, financial markets, product markets, interacting combatants, etc.) are very likely to be accompanied by a substantial amount of uncertainty.

David Mandel’s Insights

A critical question is how to effectively communicate a forecast’s probability and its associated uncertainty to decision makers.

A recent review concluded that, given its importance, this is an issue that surprisingly has not received much attention from researchers (“Communicating Uncertainty About Facts, Numbers And Science”, by van der Bles et al).

That is somewhat strange, because this is not a new problem.

For example, in 1964 the CIA’s Sherman Kent published his confidential memo on “Words of Estimative Probability”, which highlighted the widely varying numerical probabilities that different people attached to verbal expressions such as “possible”, “likely”, “probable”, or “almost certain”. Over the succeeding fifty years, multiple studies have replicated and extended Kent’s conclusions.

Yet in practice, verbal expressions of estimative probability, without accompanying quantitative expressions, are still widely used.

For example, it was only after recommendations from the 9/11 Commission Report, and direction by the Intelligence Reform and Terrorism Prevention Act (IRTPA) of 2004, that on 21 June 2007 the Office of the Director of National Intelligence (DNI) released Intelligence Community (IC) Directive (ICD) 203.

This Directive established intelligence community-wide analytic standards intended to, “meet the highest standards of integrity and rigorous analytic thinking.”

ICD 203 includes the following table for translating “words of estimative probability” into quantitative probability estimates:

ICD 203

In our experience, nobody has written more about these issues than David Mandel.

To be sure, “Assessing Uncertainty in Intelligence” by Friedman and Zeckhauser is an important paper. However, it pales in comparison to the volume and breadth of Mandel’s research, including his contributions to and editorship of NATO’s exhaustive June 2020 report on “Assessment and Communication of Uncertainty in Intelligence to Support Decision-Making”.

In what follows, we’ll review some of Mandel’s key findings, insights, and recommendations in three critical areas: (1) Communicating probability forecasts; (2) Communicating the degree of forecast uncertainty (or “analytic confidence”); and (3) Why organizations have been reluctant to adopt what researchers have found to be the most effective practices in both these areas.

Effectively Communicating Probability Forecasts

“As Sherman Kent aptly noted [in 1964], substantive intelligence is largely human judgment made under conditions of uncertainty. Among the most important assessments are those that not only concern unknowns but also potentially unknowables, such as the partially formed intentions of a leader in an adversarial state.”

“In such cases, the primary task of the analyst is not to state what will happen but to accurately assess the probabilities of alternative possibilities as well as the degree of error in the assessments and to give clear explanations for the basis of such assessments.” (Source: “Intelligence, Science, and the Ignorance Hypothesis”, by David Mandel).

“Most intelligence organizations today use some variant of the Kent-Foster approach. That is, they rely on curated sets of linguistic probability terms presented as ordered scales. Previously, some of these scales did not use numeric probability equivalencies. However, nowadays most standards assign numeric ranges to stipulate the meaning of each linguistic probability term.”

“Efforts to transform the vagueness of natural language into something clearer reflect a noble goal, but the curated-list approach is flawed in practice and in principle. For example, flaws in practice include the fact that each standard uses a common approach, yet each differs sufficiently to undermine interoperability among key collaborative partners; e.g., an even chance issued by NATO could mean unlikely, roughly even chance, or likely in the US system.”

“Current standards also prevent analysts from communicating probabilities less than 1% or greater than 99%. This pre-empts analysts from distinguishing “one in a hundred” from “one in a million.” In the US standard, “one in a hundred” is the smallest communicable probability, while in the NATO and UK standards, “one in a million” would be indistinguishable from “one in ten.” Orders of magnitude should matter to experts because orders of magnitude matter in everyday life. A threat that has a 10% chance of occurring may call for a different response than if it had a one-in-a-million chance of occurring instead.”

“Intelligence organizations have naively assumed that they can quash the unruliness of linguistic probabilities simply by stating their intended meaning. Yet ample research shows that when people have direct access to a translation table, a large proportion still interprets linguistic expressions inconsistently with the prescribed meanings.”

“Noting the abysmal rates of shared understanding when probability lexicons are provided, researchers have recommended that numeric ranges be reported alongside linguistic probabilities in assessments [as in ICD 203]. However, this approach has yielded only modest improvements in shared understanding.”

“Studies show that people generally prefer to communicate probabilistic information linguistically, but that they also prefer to receive it numerically. These preferences are exhibited across a range of expert judgment communities, but are particularly pronounced when judgments are based on unreliable or incomplete information, as is characteristic of intelligence analysis.”

“Decision-makers want useful (i.e., timely, relevant, and accurate) information to support their decisions; they don’t wish to be reminded repeatedly what probability terms should mean to them when consuming intelligence. Any standard that encourages analysts to express anything other than their best probability estimate for the event being judged is suboptimal.”

Mandel also stresses that, “Explanation is [also] vital to intelligence since without it, a decision-maker would not know how the particular assessment was reached. Numeric assessments and clear explanations should work together to yield effective intelligence.”

(Source: “Uncertainty, Intelligence, and National Security Decision Making”, by David Mandel and Daniel Irwin).

Related research has also found that allowing forecasters to use narrower probability ranges than those specified in national guidelines like ICD 203 improves forecast accuracy. (See “The Value of Precision in Probability Assessment: Evidence from a Large-Scale Geopolitical Forecasting Tournament”, by Friedman et al).
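
To make the limits of a curated lexicon concrete, here is a minimal Python sketch of a lookup from a numeric probability to a verbal term. The ranges are our transcription of the ICD 203 table as it is commonly cited, and they should be checked against the directive itself. The names `ICD_203_BINS` and `term_for`, and the choice to assign a shared boundary value to the higher bin, are our own assumptions for illustration.

```python
from typing import Optional

# Assumed ICD 203 ranges as (low, high, term). Verify against the directive
# before relying on them. Shared boundaries go to the higher bin (our choice).
ICD_203_BINS = [
    (0.01, 0.05, "almost no chance"),
    (0.05, 0.20, "very unlikely"),
    (0.20, 0.45, "unlikely"),
    (0.45, 0.55, "roughly even chance"),
    (0.55, 0.80, "likely"),
    (0.80, 0.95, "very likely"),
    (0.95, 0.99, "almost certain"),
]


def term_for(probability: float) -> Optional[str]:
    """Return the lexicon term covering a probability, or None if no term does."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    last = len(ICD_203_BINS) - 1
    for index, (low, high, term) in enumerate(ICD_203_BINS):
        # Bins are half-open, except that the top bin includes its upper bound.
        in_bin = low <= probability < high
        at_top = index == last and probability == high
        if in_bin or at_top:
            return term
    # Below 1% or above 99%: the lexicon has no term for this probability.
    return None
```

Under these assumed ranges, `term_for(0.10)` returns "very unlikely", while `term_for(1e-6)` returns `None`. The lexicon has no term for a one-in-a-million event, and every probability between 5% and 20% collapses into the same phrase.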

Another problem is that, “Linguistic probabilities also convey ‘directionality,’ a linguistic feature related to but distinct from probability.

“Directionality is a characteristic of probabilistic statements that calls attention to the potential occurrence or non-occurrence of an event. For instance, if someone tells you there is some chance they will make it to an event, you will probably be more inclined to expect them to attend than if they had said it was doubtful, even though both terms tend to be understood as conveying low probabilities … These implicit suggestions can influence decision-making outside of the decision-maker’s awareness.” (Source: “Uncertainty, Intelligence, and National Security Decision Making”, by David Mandel and Daniel Irwin).

“Communicating probabilities numerically rather than verbally also benefits forecasters’ credibility. Verbal probabilities convey implicit recommendations more clearly than probability information, whereas numeric probabilities do the opposite. Prescriptively, we propose that experts distinguish forecasts from advice, using numeric probabilities for the former and well-reasoned arguments for the latter.” (Source: “Cultivating Credibility With Probability Words And Numbers”, by Robert Collins and David Mandel).


Effectively Communicating Forecast Confidence (or Uncertainty)

Probabilistic forecasts rest on a combination of (1) facts; (2) assumptions about critical uncertainties; (3) the evidence (of varying reliability and information value) supporting those assumptions; and (4) the logic used to reach the forecaster’s conclusion.

A forecast is typically assessed either directly (by judging the strength of its assumptions and logic), or indirectly, on the basis of the forecaster’s stated confidence in her/his conclusions.

In our forecasting work with clients over the years, we have found that discussing the assumptions made about critical uncertainties and, less frequently, the forecast logic itself generates very productive discussions and improved predictive accuracy.

In particular, we have found Marvin Cohen’s approach quite practical. His research found that the greater the number of assumptions about “known unknowns” [i.e., recognized uncertainties] that underlie a forecast, and the weaker the evidence that supports them, the lower confidence one should have in the forecast’s accuracy.

Cohen also cautions that the more assumptions about “known unknowns” that are used in a forecast logic, the more likely it is that more potentially critical “unknown unknowns” remain to be discovered, which again should lower your confidence in the forecast (e.g., see, “Metarecognition in Time-Stressed Decision Making: Recognizing, Critiquing, and Correcting”, by Cohen, Freeman, and Wolf).

Mandel focuses on expressions of “analytic confidence” in a forecast, which are the established practice in the intelligence world.

In a number of different publications, he highlights many shortcomings in the ways that analytic confidence is currently communicated to users of estimative probability forecasts.

“Given that intelligence is typically derived from incomplete and ambiguous evidence, analysts must accurately assess and communicate their level of uncertainty to consumers. One facet of this perennial challenge is the communication of analytic confidence, or the level of confidence that an analyst has in his or her judgments, including those already qualified by probability terms such as “very unlikely” or “almost certainly”.

“Analytic confidence levels indicate the extent to which “assessments and estimates are supported by information that varies in scope, quality and sourcing.”

“Consumers [i.e., forecast users] are better equipped to make sound decisions when they understand the methodological and evidential strength (or flimsiness) of intelligence assessments. Effective communication of confidence also militates against the pernicious misconception that the Intelligence Community (IC) is omniscient.”

“Most intelligence organizations have adopted standardized lexicons for rating and communicating analytic confidence. These standards provide a range of confidence levels (e.g., high, moderate, low), along with relevant rating criteria…

“There is evidence that expressions of confidence are easily misinterpreted by consumers … There is also evidence that the terms stipulated in confidence standards are misunderstood (or at least misapplied) by intelligence practitioners.”

For example, here is the three-level confidence scale used by the Canadian Forces Intelligence Command (CFINTCOM):

Canada Confidence


In the CFINTCOM framework, “Analytic confidence is based on three main factors:

(1) Evidence: “the strength of the knowledge base, to include the quality of the evidence and our depth of understanding about the issue.”

(2) Assumptions: “the number and importance of assumptions used to fill information gaps.”

(3) Reasoning: “the strength of the logic underpinning the argument, which encompasses the number and strength of analytic inferences as well as the rigour of the analytic methodology applied to the product.”

To show how widely standards for communicating forecast confidence vary, Mandel contrasts those used by intelligence and military organizations with the framework and ratings used by the Intergovernmental Panel on Climate Change (IPCC):

IPCC Confidence


After comparing the approaches used by different NATO members, Mandel finds that, “The analytic confidence standards examined generally incorporate the following determinants:

• Source reliability;
• Information credibility;
• Evidence consistency/convergence;
• Strength of logic/reasoning; and
• Quantity and significance of assumptions and information gaps.”

However, he also notes that, “few [national] standards attempt to operationalize these determinants or outline formal mechanisms for evaluation. Instead, they tend to provide vague, qualitative descriptions for each confidence level, which may lead to inconsistent confidence assessments.”

“Issues may also arise from the emphasis most standards place on evidence convergence as a determinant of analytic confidence … Convergence can help eliminate false assumptions and false/deceptive information, but may not necessarily prevent analysts from deriving high confidence from outdated information. Under current standards, a large body of highly credible and consistent information could contribute to high analytic confidence, despite being out of date. A possible solution would be to incorporate a measure of information recency”.

“The emphasis on convergence may also lead analysts to inflate their confidence by accumulating seemingly useful but redundant information” (e.g., multiple reports based on same underlying data).

“In evaluating information convergence, confidence standards also fail to weigh the reliability of confirming sources against disconfirming sources, or how relationships between sources may unduly influence their likelihood of convergence. Focusing heavily on convergence can also introduce order effects, whereby information received earlier faces fewer hurdles to being judged credible.”

Mandel concludes, “It is unlikely that current analytic confidence standards incorporate all relevant determinants. For instance, confidence levels, as traditionally expressed, fail to consider how much estimates might shift with additional information, which is often a key consideration for consumers deciding how to act on an estimate.

“Under certain circumstances, the information content of an assessment may be less relevant to decision makers than how much that information (and the resultant forecast estimate) may change in the future. Analytic confidence scales could incorporate a measure of “responsiveness,” expressed as the probability that an estimate will change due to additional collection and analysis over a given time period (e.g., there is a 70% chance of x, but by the end of the month, there is a 50% chance that additional intelligence will increase the estimated likelihood of x to 90%).”

“In addition to responsiveness and evidence characteristics, current conceptions of analytic confidence fail to convey the level of consensus or range of reasonable opinion about a given estimate. Analysts can arguably assess uncertainty more effectively when the range of plausible viewpoints is narrower, and evidence characteristics and the range of reasonable opinion vary independently.”

For example, “In climate science, different assumptions between scientific models can lead researchers to predict significantly different outcomes using the same data. For this reason, current climate science standards incorporate model agreement/consensus as a determinant of analytic confidence.”

(Source: “How Intelligence Organizations Communicate Confidence (Unclearly)”, by Daniel Irwin and David Mandel).
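
The “responsiveness” example quoted above (a 70% estimate today, with a 50% chance of rising to 90% by month end) can be checked with a little arithmetic. The sketch below rests on an assumption that is ours, not Irwin and Mandel’s: that a coherent analyst’s current estimate should equal the probability-weighted average of the estimates she expects to hold later. The function name `implied_alternative` and its parameters are hypothetical.

```python
def implied_alternative(current: float, p_shift: float, shifted: float) -> float:
    """Average estimate implied for the case where the shift does NOT occur.

    Assumes (our assumption) that the current estimate is the
    probability-weighted average of the possible future estimates:
        current = p_shift * shifted + (1 - p_shift) * alternative
    """
    if not 0.0 < p_shift < 1.0:
        raise ValueError("p_shift must be strictly between 0 and 1")
    alternative = (current - p_shift * shifted) / (1.0 - p_shift)
    if not 0.0 <= alternative <= 1.0:
        # No valid probability can balance the stated numbers.
        raise ValueError("stated estimates are not mutually consistent")
    return alternative


# The quoted example: 70% now, 50% chance of moving to 90%.
print(round(implied_alternative(0.70, 0.50, 0.90), 2))  # prints 0.5
```

Under that assumption, the quoted statement also implies that, if the new intelligence does not push the estimate up, the estimate will on average fall to about 50%. A decision maker told only “70%, moderate confidence” would not see that the estimate is expected to move a long way in one direction or the other.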

Mandel also observes that, “analysts are usually instructed to assess probability and confidence as if they were independent constructs. This fails to explain that confidence is a second-order judgment of uncertainty capturing one’s subjective margin of error in a probabilistic estimate. That is, the less confident analysts are in their estimates, the wider their credible probability intervals should be.

“An analyst who believes the probability of an event lies between 50% and 90% (i.e., 70% plus or minus 20%) is less confident than an analyst who believes that the probability lies between 65% and 75% (i.e., 70% plus or minus 5%). The analyst providing the wider margin of error plays it safer than the analyst providing the narrower interval, presumably because the former is less confident than the latter.”

(Source: “Uncertainty, Intelligence, and National Security Decision Making”, by David Mandel and Daniel Irwin).
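
The two analysts in the passage above can be represented as a pair of credible intervals, with the margin of error serving as a rough inverse measure of confidence. This is a minimal sketch; the class `ProbabilityEstimate` and the helper `more_confident` are hypothetical names of our own, not part of any standard cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbabilityEstimate:
    """A probability estimate stated as a credible interval (hypothetical helper)."""

    low: float
    high: float

    def __post_init__(self) -> None:
        if not 0.0 <= self.low <= self.high <= 1.0:
            raise ValueError("need 0 <= low <= high <= 1")

    @property
    def midpoint(self) -> float:
        """Central estimate, e.g. 0.70 for the interval 0.50 to 0.90."""
        return (self.low + self.high) / 2

    @property
    def margin(self) -> float:
        """Half-width of the interval: the analyst's margin of error."""
        return (self.high - self.low) / 2


def more_confident(a: ProbabilityEstimate, b: ProbabilityEstimate) -> ProbabilityEstimate:
    """Return the estimate with the narrower margin of error (a on ties)."""
    return a if a.margin <= b.margin else b


cautious = ProbabilityEstimate(0.50, 0.90)  # 70% plus or minus 20%
precise = ProbabilityEstimate(0.65, 0.75)   # 70% plus or minus 5%
print(more_confident(cautious, precise) is precise)  # prints True
```

Both analysts share the same central estimate of 70%, so a verbal term or a single number would make their assessments look identical. Only the interval width reveals the difference in confidence.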


Organizational Obstacles to Adopting More Effective Methods

In words that are equally applicable to private sector forecasts, Mandel notes that, “intelligence analysis and national security decision-making are pervaded by uncertainty. The centrality of uncertainty to decision-making at the highest policy levels underscores the primacy of accurately assessing and clearly communicating uncertainties to decision-makers. This is a central analytic function of intelligence.”

“Most substantive intelligence is not fact but expert judgment made under uncertainty. Not only does the analyst have to reason through uncertainties to arrive at sound and hopefully accurate judgments, but the uncertainties must also be clearly communicated to policymakers who must decide how to act upon the intelligence.”

“Thomas Fingar, former US Deputy Director of National Intelligence, described the role of intelligence as centrally focusing on reducing uncertainty for the decision-maker. While analysts cannot always reduce uncertainty, they should be able to accurately estimate and clearly communicate key uncertainties for decision-makers.”

“Given the importance of uncertainty in intelligence, one might expect the intelligence community to draw upon relevant science aimed at effectively handling uncertainty, much as it has done to fuel its vast collections capabilities. Yet remarkably, methods for uncertainty communication are far from having been optimized, even though the problem of uncertainty communication has resurfaced in connection with significant intelligence failures.”

We could make the same argument about the importance of accurately assessing uncertainty and emerging strategic threats in the private sector, and its association with many corporate failures. As directors, executives, and consultants, we have frequently observed the absence of best practices for communicating forecast uncertainty in private sector organizations around the world.

Mandel goes on, “Given the shortcomings of the current approach to uncertainty communication and the clear benefits of using numeric probabilities, why hasn’t effective reform happened?”

“In part, organizational inertia reflects the fact that most intelligence consumers have limited time in office, finite political capital, and crowded agendas. Efforts to tackle intelligence-community esoterica deplete resources and promise little in the way of electoral payoff. High turnover of elected officials also ensures short collective memory; practitioners can count on mistakes being forgotten without having to modify their tradecraft [i.e., analytical practices]. Even when commissions are expressly tasked with intelligence reform, they often lack the requisite knowledge base, resulting in superficial solutions.”

“Beyond these institutional barriers, intelligence producers and consumers alike may view it in their best interests to sacrifice epistemic quality in intelligence to better serve other pragmatic goals.”

“For forecast consumers, linguistic probabilities provide wiggle room to interpret intelligence estimates in ways that align with their policy preconceptions and preferences—and if things go wrong, they have the intelligence community to blame for its lack of clarity. Historically, intelligence consumers have exploited imprecision to justify decisions and deflect blame when they produced negative outcomes.”

Unfortunately, that’s equally true in the private sector.

(Source: “Uncertainty, Intelligence, and National Security Decision Making”, by David Mandel and Daniel Irwin).

However, it’s not just forecast consumers who are to blame for the current state of affairs. As Mandel notes, “Given that there is far more to lose by overconfidently asserting claims that prove to be false than by underconfidently making claims that prove to be true, intelligence organizations are likely motivated to make timid forecasts that water down information value to decision-makers—a play-it-safe strategy that anticipates unwelcome entry into the political blame games that punctuate history.”

(Source: “Intelligence, Science and the Ignorance Hypothesis”, by David Mandel)

Conclusion

As we noted at the outset, even before the COVID pandemic arrived the world had entered a period of unprecedented or radical uncertainty and disruptive change.

In this environment, avoiding failure in part depends on effectively meeting three forecasting challenges:

• Asking the right forecasting questions;
• Accurately estimating the probability of different outcomes; and
• Effectively communicating the degree and nature of the uncertainty associated with your forecast.

Meeting these challenges has proven to be difficult in the world of professional intelligence analysis; this is even more so the case in the private sector, as the history of corporate failure painfully shows.

Of these three challenges, effectively communicating the degree and nature of the uncertainty associated with forecasts has received the least attention.

Fortunately, David Mandel has made it his focus. His research is too little known and appreciated outside the intelligence community (and even within it, unfortunately).

By briefly summarizing his research here, we hope Mandel’s work can help far more organizations to improve their forecasting practices and substantially improve their chances of avoiding failure and achieving their goals.


Britten Coyne Partners advises clients on how to establish methods, processes, structures, and systems that enable them to better anticipate, accurately assess, and adapt in time to emerging threats and avoid strategic failures. Through our affiliate, The Strategic Risk Institute, we also provide online and in-person courses leading to a Certificate in Strategic Risk Governance and Management.

Why Did So Many Investors Ignore Warnings Before the Crash of 2021?

This Pre-Mortem was recently published by our affiliate, The Index Investor, which provides global macro research and asset allocation insights.

Tesla. Bitcoin. GameStop. The equity market as a whole. Even the bond markets. The list goes on. Why did so many investors ignore so many warning signs that were flashing red before the Crash of 2021? And more importantly, what was different about investors who did not ignore those warnings?

As always, there were many root causes that amplified each other’s effects.

Let’s start at the individual level.

Tali Sharot’s research has shown how humans have a natural bias towards optimism. We are much more prone to updating our beliefs when a new piece of information is positive (i.e., better than expected in light of our goals) rather than negative (“How Unrealistic Optimism is Maintained in the Face of Reality”).

Individuals seek more information about possible future gains than about possible losses (e.g., “Valuation Of Knowledge And Ignorance In Mesolimbic Reward Circuitry”, by Charpentier et al).

We tend to seek, pay more attention to, and place more weight on information that supports our current beliefs than information that is inconsistent with or contradicts them (known as the confirmation or my-side bias). Moreover, as Daniel Kahneman showed in his book, “Thinking, Fast and Slow”, this process often happens automatically (“System 1”). When we notice information that is not consistent with our mental model/current set of beliefs about the world, our subconscious first tries to adjust those beliefs to accommodate the new information.

Only when the required adjustment is above a certain threshold is the feeling of surprise triggered, calling on us to consciously reason about its meaning using “System 2”.

Yet even then, this reasoning is often overpowered by group-level factors.

Having spent so much of our evolutionary existence in a world without writing or math, humans naturally create and share stories rather than formal models to make sense of our uncertain world. Stories are powerful because they have both rational and emotional content; while that makes them easy to remember, it also makes them very resistant to change.

Another group level phenomenon that is deeply rooted in our evolutionary past is competition for status within our group. Researchers have found that when the result of a decision will be private (not observed by others), we tend to be risk averse. But when the result will be observed, we tend to be risk seeking (e.g., “Interdependent Utilities: How Social Ranking Affects Choice Behavior”, by Bault et al).

Other research has found that when we are engaged in social status competition, we actually have less working memory available for reasoning about the task at hand (e.g., “Increases in Brain Activity During Social Competition Predict Decreases in Working Memory Performance and Later Recall” by DiMenichi and Tricomi).

Another evolutionary instinct comes into play when uncertainty is high. Under these conditions, we are much more likely to rely on social learning and copying the behavior of other group members, and to put less emphasis on private information that is inconsistent with or contradicts the group’s dominant view. The evolutionary basis for this heightened conformity is clear – you don’t want to be cast out of your group when uncertainty is high.

It is also the case that groups will often share more than one story or belief at the same time. Research has found that, “as a result of interdependent diffusion, worldviews will emerge that are unconstrained by external truth, and polarization will develop in homogenous populations” (e.g., “Interdependent Diffusion: The Social Contagion Of Interacting Beliefs” by James P. Houghton).

All of these group causes have been supercharged in our age of hyperconnectivity, dense social networks, and multiple media platforms constantly delivering a flood of information.

Finally, individual and group causes are often reinforced by organizational level phenomena.

As successful organizations grow larger, there is a tendency to recruit and promote people who have similar views. Growth also tends to increase the emphasis an organization places on predictable results, which causes it to penalize errors of commission (e.g., false alarms) more heavily than errors of omission (e.g., missed alarms).

Thus employees in larger organizations are very likely to wait longer and require stronger evidence before speaking up to warn that danger lies ahead.
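
One way to see why asymmetric penalties delay warnings is a simple expected-penalty calculation. The sketch below is our own illustration, not a model from the research cited above. It assumes a hypothetical employee who speaks up only when the expected penalty for staying silent exceeds the expected penalty for raising a false alarm; the function name and the penalty weights are invented for the example.

```python
def warning_threshold(cost_false_alarm, cost_missed_alarm):
    """Probability of danger above which warning has the lower expected penalty.

    Warn when p * cost_missed_alarm > (1 - p) * cost_false_alarm,
    which rearranges to p > cost_false_alarm / (cost_false_alarm + cost_missed_alarm).
    """
    return cost_false_alarm / (cost_false_alarm + cost_missed_alarm)


# Hypothetical penalty weights, chosen only for illustration.
balanced = warning_threshold(cost_false_alarm=1, cost_missed_alarm=1)
cautious_org = warning_threshold(cost_false_alarm=4, cost_missed_alarm=1)

print(f"Equal penalties: warn when p > {balanced:.2f}")
print(f"False alarms penalized 4x as heavily: warn when p > {cautious_org:.2f}")
```

Under these assumed weights, the employee in the second organization stays silent until the perceived probability of danger exceeds 80%, rather than 50%.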

In his January 2021 letter to investors (“Waiting for the Last Dance”), GMO’s Jeremy Grantham explained why larger organizations are less likely to warn clients when markets are severely overvalued:

“The combination of timing uncertainty and rapidly accelerating regret on the part of clients [for missing out on gains as the bubble inflates] means that the career and business risk of fighting bubbles is too great for large commercial enterprises…

“Their best policy is clear and simple: always be extremely bullish. It is good for business and intellectually undemanding. It is appealing to most investors who much prefer optimism to realistic appraisal, as witnessed so vividly with COVID. And when it all ends, you will as a persistent bull have overwhelming company. This is why you have always had bullish advice in bubbles and always will.”

In sum, that so many suffered large losses when the post-COVID bubble burst should come as no surprise. It was merely the latest version of a plot line that has been repeated for centuries in speculative markets.

The real lessons to be learned come from those investors who reduced their exposure and changed their asset allocations before markets suddenly and violently reversed (analysts are still searching, likely in vain, for the cause of the market crash. Such is the nature of complex adaptive systems).

What did these investors do differently?

We know one thing they didn’t do – believe that they could personally overcome the very human and deeply rooted evolutionary biases noted above. Research says that the odds against success in that endeavor are long indeed.

Rather than trying to conquer their personal biases, these investors established – and followed – investment processes that were designed to offset those biases and their emotionally charged effects.

For example, they didn’t fall prey to the “this time is different” myth, and used traditional valuation metrics to inform their asset allocation decisions. Their default conclusion was that the valuation metrics were right, and they demanded very solid, logical, and evidence-based arguments before rejecting the signals those metrics sent.

In their own forecasting, they followed best practices. They spent a lot of time making sure they were asking the right questions; they paid attention to base rates; they were disciplined about seeking out high value information to update their views; and they were always alert to surprises that warned their beliefs and models were incomplete.

They also sought out forecasts from a wide range of other sources that were based on different information and/or methodologies, and then combined them to increase their predictive accuracy.
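
To make the benefit of combining forecasts concrete, here is a minimal sketch that uses a simple unweighted average, which is only one of many possible combination methods. The three sources, their probabilities, and the outcomes are all hypothetical. Accuracy is measured with the Brier score (the mean squared error of probability forecasts, where lower is better).

```python
from statistics import mean


def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return mean((f - o) ** 2 for f, o in zip(forecasts, outcomes))


def combine(*sources):
    """Unweighted average of the sources' probabilities, question by question."""
    return [mean(probabilities) for probabilities in zip(*sources)]


# Hypothetical outcomes (1 = event happened) and forecasts for four questions.
outcomes = [1, 0, 1, 0]
source_a = [0.9, 0.4, 0.5, 0.3]
source_b = [0.6, 0.1, 0.8, 0.5]
source_c = [0.7, 0.3, 0.6, 0.1]

combined = combine(source_a, source_b, source_c)
individual_scores = [brier_score(s, outcomes) for s in (source_a, source_b, source_c)]
average_individual_score = mean(individual_scores)
combined_score = brier_score(combined, outcomes)

print(f"Average individual Brier score: {average_individual_score:.4f}")
print(f"Combined forecast Brier score:  {combined_score:.4f}")
```

Because squared error is convex, the averaged forecast can never score worse than the average of the individual scores. It will not always beat the single best source, but an investor rarely knows in advance which source that will be.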

And they focused their forecasting efforts on time horizons beyond the range of the algorithms, where human effort can still produce a profitable edge.

Most important, perhaps, is this timeless truth: The investors who avoided the Crash of 2021 weren’t any smarter than those who were wiped out. They were just more conscious of their own weaknesses, and as a result their investment processes took a more disciplined approach.

The Fall of the US Capitol: All Three Causes of Strategic Risk Failure Were at Work

The fall of the US Capitol building to a mob of rioters on January 6, 2021 left millions stunned and horrified, and all asking the same question: "How Could This Happen?" Unfortunately, the answer is painfully familiar to those of us who study strategic failure and spend our days helping organizations to avoid it. The initial evidence suggests that all three of the most common root causes of such failures were present in this case too.

The first root cause is failure to anticipate potential future strategic risks, and to recognize them as they begin to emerge.

It is critical to distinguish between three levels of anticipation and warning.

At the strategic level, analysts focus on the "what" and "why" of potential threats. At the operational level, they focus on "how" different threats might materialize. At the tactical level, the focus is on the "who, what, and when" that drive effective response.

The fundamental challenge of threat anticipation and warning is that the possibilities increase exponentially as you move from the strategic to the operational and then to the tactical level.

During April and May 2020, protests against continuing COVID lockdowns erupted across the United States. Many of these were instigated and/or backed by groups on the populist right.

Following the death of George Floyd on May 25, 2020 in the course of his arrest, protest demonstrations were held across the country, with many instigated and/or backed by groups from the populist left. A significant number of these degenerated into rioting.

In September, a draft Department of Homeland Security assessment appeared in the press, which named white supremacist groups as the most dangerous terror threat facing the United States.

On October 8, 2020, the FBI announced the arrest of members of a right wing group who were conspiring to kidnap Michigan Governor Gretchen Whitmer due to anger over lockdowns and other alleged abuses of state government power.

In sum, the growing strategic threat of violent protests by both left and right wing groups was clear.

Operationally, it is certain that the Capitol Police and other intelligence and law enforcement agencies had anticipated the threat posed by a terrorist attack that sought to gain control of the building and harm legislators and staff within it. It is equally certain that they had in place and had frequently rehearsed plans to violently repel a violent attack.

It is also certain that plans were in place to respond to demonstrations with a high assessed potential to become violent. However, it is not clear that there were operational plans to respond to a demonstration that evolved into an attempt to take over the Capitol without the weapons and violence that would characterize a terrorist attack.

There is accumulating evidence that the Capitol Police had tactical (albeit noisy) warning that the demonstrations planned for January 6th had a significant potential to become violent.

For example, ProPublica reported that, "Capitol Rioters Planned for Weeks in Plain Sight. The Police Weren’t Ready. Insurrectionists made no effort to hide their intentions." Like a growing number of similar reports, this one highlights the amount of detailed information that was available online about plans by some groups to attempt to disrupt the certification of the Electoral College results on January 6th. As ProPublica notes, "The warnings of Wednesday’s assault on the Capitol were everywhere — perhaps not entirely specific about the planned time and exact location of an assault on the Capitol, but enough to clue in law enforcement about the potential for civil unrest."

Similarly, the BBC reported that, "In the days (and indeed weeks and months) before the attack, people monitoring online platforms used by extreme pro-Trump supporters and far-right groups had warned of rhetoric encouraging violence at the Capitol, including toward lawmakers, over the election result. Some were even pictured wearing clothing that said "MAGA: CIVIL WAR" printed alongside the 6 January 2021 date."

ABC News provided more specific information, reporting that, "Three days before supporters of President Donald Trump rioted at the Capitol, the Pentagon asked the US Capitol Police if it needed National Guard manpower."

The second root cause is failure to appropriately and accurately assess the nature, timing, and danger posed by an identified threat.

One clear assessment failure was the apparently very low probability given to a scenario in which people protesting the Electoral College result would attempt to take over the Capitol building and harm the Vice President and legislators meeting there on January 6th. As the ABC report noted, Capitol Police were only preparing for a "free speech demonstration" of which there are many at the Capitol during the course of any year.

Considering that the last time the security of the Capitol was violently breached was in 1954 (when Puerto Rican terrorists shot down from the visitors gallery on members of the House of Representatives), the Capitol Police's threat assessment failure is depressingly common.

As Thomas Schelling famously noted, “There is a tendency in our planning to confuse the unfamiliar with the improbable. The contingency we have not considered seriously looks strange; what looks strange is thought improbable; what is improbable need not be considered seriously.”

Similarly, a 1983 CIA study found that, "In the [intelligence] estimates that failed, there were a number of recurrent common factors which, in retrospect, seem critical to the quality of the analysis. The most distinguishing characteristics of the failed estimates...was that each involved historical discontinuity and, in the early stages, apparently unlikely outcomes. The basic problem in each estimate was to recognize qualitative change and to deal with situations in which trend continuity and precedent were of marginal, if not counterproductive, value.”

The third root cause is a failure to adapt in time to a developing threat.

Another classic cause of strategic failure is a poor grasp of the interacting and usually non-linear time dynamics that are at work. Specifically, organizations typically overestimate the time that remains before a developing threat passes a critical threshold, while also overestimating how quickly they can develop and implement an effective response to it. In short, their remaining "safety margin" is often smaller and shrinking more quickly than they realize.
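
The arithmetic behind a shrinking safety margin can be sketched as follows. This is a deliberately simple linear illustration under our own assumptions: the function name and the day counts are hypothetical, and real threats usually evolve non-linearly.

```python
def safety_margin(time_to_threshold, time_to_respond):
    """Time left over after a response is in place; negative means too late."""
    return time_to_threshold - time_to_respond


# Hypothetical figures, in days, chosen only for illustration.
believed_margin = safety_margin(time_to_threshold=10, time_to_respond=4)
actual_margin = safety_margin(time_to_threshold=7, time_to_respond=6)

print(f"Believed safety margin: {believed_margin} days")
print(f"Actual safety margin:   {actual_margin} days")
```

In this example the organization overestimates the time remaining by three days and underestimates its own response time by two, so a margin it believes to be six days is in fact only one.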

The available evidence suggests this source of failure almost certainly contributed to the fall of the Capitol on January 6th. For example, the AP reported that, "As the mob descended on the building Wednesday, Justice Department leaders reached out to offer up FBI agents. The [Capitol] police turned them down." While some of this may have been due to inter-organizational rivalry and a desire on the part of the Capitol Police to avoid embarrassment, it is almost certain that if Capitol Police commanders had had an accurate awareness of the situation's time dynamics, they would have accepted this offer of aid.

Nor were preparations in place to coordinate and control rapid adaptation if the demonstrations at the Capitol degenerated into rioting, as they did.

For example, Marketwatch reported that, "Army Secretary Ryan McCarthy said that as the rioting was underway, it became clear that the Capitol Police were overrun. But he said there was no contingency planning done in advance for what forces could do in case of a problem at the Capitol because Defense Department help was turned down."

No interagency command structure was established to coordinate tactical intelligence collection and fusion, and direct different agencies' response to the rapidly deteriorating situation at the Capitol.

In sum, while the actual events at the US Capitol on January 6th were unique, the underlying root causes of the strategic failure they represent were depressingly familiar.


Britten Coyne Partners advises clients how to establish methods, processes, structures, and systems that enable them to avoid strategic failures. Through our affiliate, The Strategic Risk Institute, we also provide online and in-person courses leading to a Certificate in Strategic Risk Governance and Management.






