Interpreting uncertainty through data visualizations

In Brief
- Data-visualization techniques can clarify the uncertainty in information or make it more confusing if not implemented well.
- For example, even though error bars seem exact, people often misunderstand them.
- Quantile dot plots can be effective because they present uncertainty as discrete probabilities that people can readily see.
- Effective visualizations of uncertainty can help us make judgments, analytically and emotionally, about the likelihood of future events.

Uncertainty pervades the data that scientists and all kinds of organizations use to inform decisions. Visual depictions of information can help clarify the uncertainty—or compound confusion. Ideally, visualizations help us make judgments, analytically and emotionally, about the probability of different outcomes. Abundant evidence on human reasoning suggests, however, that when people are asked to make judgments involving probability, they often discount uncertainty. As society increasingly relies on data, graphics designers are grappling with how best to show uncertainty clearly.
What follows is a gallery of visualization techniques for displaying uncertainty, organized roughly from less effective to more effective. Seeing how different approaches are chosen and implemented can help us become more savvy consumers of data and the uncertainty involved.
NO QUANTIFICATION
The least effective way to present uncertainty is to not show it at all. Sometimes data designers try to compensate for a lack of specified uncertainty by choosing a technique that implies a level of imprecision but does not quantify it. For example, a designer might map data to a visual variable that is hard for people to define, such as a circle floating in space rather than a dot on a graph that has x and y axes. This approach makes the reader’s interpretation more error-prone. Alternatively, a designer might use a program that creates a hand-drawn or “sketchy” feel. Both approaches are risky.

INTERVALS
Intervals may be the most common representations of quantified uncertainty. Error bars and confidence envelopes are widely recognized, but even though they seem exact and straightforward, they are notoriously hard to interpret properly. Research shows they are often misunderstood, even by scientists.
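To make concrete what an error bar usually encodes, here is a minimal sketch that computes a sample mean and an approximate 95 percent confidence interval. It assumes a normal approximation, and the function name `mean_with_interval` and the measurements are invented for illustration.

```python
import math
import statistics

def mean_with_interval(values, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% confidence interval.

    Assumes a normal approximation; z=1.96 is the usual 95% multiplier.
    """
    mean = statistics.mean(values)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * sem, mean + z * sem

# Hypothetical measurements, made up for this example.
measurements = [4.8, 5.1, 5.4, 4.9, 5.3, 5.0, 5.2, 4.7]
mean, lower, upper = mean_with_interval(measurements)
print(f"mean = {mean:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

A bar drawn from the same data could instead show a standard deviation or a standard error, and the bar alone does not say which, so a chart should label the quantity it plots.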

PROBABILITY DENSITY MAPS
Designers can map uncertainty directly to a visual property of the visualization. For example, a gradient plot can shift from dark color (high probability) at the center to lighter color (low probability) at the edges. In a violin plot, wider points mean greater probability. Mapping probability density to a visual variable displays uncertainty in greater detail than interval methods (error bars and confidence envelopes), but its effectiveness depends on how well readers can perceive differences in shading, height or other visual properties.
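The mapping behind a gradient plot can be sketched in a few lines. The example below assumes a normal distribution and converts probability density into an opacity between 0 and 1, then into a strip of text characters standing in for shading. The function name `density_to_opacity`, the estimate and the character ramp are all hypothetical choices.

```python
from statistics import NormalDist

def density_to_opacity(xs, mu, sigma):
    """Map each x to an opacity in [0, 1] proportional to its normal density."""
    dist = NormalDist(mu, sigma)
    peak = dist.pdf(mu)  # the density is highest at the mean
    return [dist.pdf(x) / peak for x in xs]

# Hypothetical estimate of 50 with a standard deviation of 10.
xs = [20, 30, 40, 50, 60, 70, 80]
opacities = density_to_opacity(xs, mu=50, sigma=10)

# Stand-in for shading: characters ordered from light to dark.
shades = " .:-=+*#%@"
strip = "".join(shades[round(a * (len(shades) - 1))] for a in opacities)
print(strip)
```

In a real chart the opacity values would drive color or lightness rather than characters, and how readable the result is depends on the perceptual limits described above.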

ARRAYS OF ICONS
Reframing a probability such as 30 percent as a frequency—three out of 10—can make it easier for people to understand uncertainty and consequently use such information appropriately. People may better understand discrete probabilities because they run into them in everyday experiences.
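The reframing itself is simple arithmetic, as this small sketch suggests. The function `icon_array` and its symbols are assumptions made for illustration, not part of any charting library.

```python
def icon_array(probability, total=10, filled="●", empty="○"):
    """Render a probability as `total` icons, with the matching share filled."""
    count = round(probability * total)  # e.g. 0.30 of 10 icons -> 3 filled
    return filled * count + empty * (total - count)

# 30 percent shown as three out of 10.
print(icon_array(0.30))
```

The choice of `total` sets the resolution: 10 icons can show only multiples of 10 percent, whereas 100 icons can show single percentage points.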

MULTIPLE SAMPLES IN SPACE
Plotting multiple samples in space can show probability in a discrete format for one or more variable quantities. One example of this approach is a quantile dot plot. It shows a number of distinct cases from the quantiles of the data distribution, so that the number of dots in a column (such as two dots high or five dots high) conveys probability. When there is uncertainty about the parameter values from which estimates are drawn, such as initial conditions, samples can be generated that vary these parameters and shown together in a single visualization.
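A rough sketch of how the dots in a quantile dot plot might be computed follows. It assumes a normal distribution, takes evenly spaced quantiles so that each dot stands for an equal share of probability, and stacks the dots into bins. The function `quantile_dots`, the bin width and the bus-arrival scenario are hypothetical.

```python
from collections import Counter
from statistics import NormalDist

def quantile_dots(dist, n_dots=20, bin_width=2.0):
    """Return {bin_center: dot_count} for n_dots evenly spaced quantiles."""
    # Midpoint quantiles (i + 0.5) / n, so each dot carries 1/n of the probability.
    quantiles = [dist.inv_cdf((i + 0.5) / n_dots) for i in range(n_dots)]
    # Snap each quantile to the nearest bin center and count dots per bin.
    bins = Counter(round(q / bin_width) * bin_width for q in quantiles)
    return dict(sorted(bins.items()))

# Hypothetical forecast: a bus arrives in about 12 minutes, give or take 3.
arrival = NormalDist(mu=12, sigma=3)
columns = quantile_dots(arrival)
for center, count in columns.items():
    print(f"{center:5.1f} | " + "●" * count)
```

With 20 dots, each one represents a 5 percent chance, so a reader can estimate the probability of a range by counting the dots that fall inside it.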
MULTIPLE SAMPLES IN TIME
Plotting multiple possible outcomes as frames in an animation makes uncertainty visceral and much harder to ignore. This technique, known as a hypothetical outcome plot, can be used for simple and complex visualizations. Perceptual studies indicate that people are surprisingly adept at inferring the distribution of data from the frequency of occurrences: we do not necessarily need to count the number of times an event occurs to estimate its probability. One important factor is the speed of events, which must be fast enough for people to see a sufficient number of samples yet slow enough for them to consciously register what they saw.
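The frames of such an animation are just random draws, one per frame. The sketch below assumes two normally distributed quantities with invented means and spread, and the generator `hypothetical_outcomes` is a made-up name. It produces the draws and reports how often one option comes out ahead, which is the impression a viewer would form from watching.

```python
import random

def hypothetical_outcomes(mean_a, mean_b, sd, n_frames, seed=0):
    """Yield one (a, b) draw per animation frame."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    for _ in range(n_frames):
        yield rng.gauss(mean_a, sd), rng.gauss(mean_b, sd)

# Hypothetical comparison of two options with overlapping uncertainty.
frames = list(hypothetical_outcomes(mean_a=50, mean_b=55, sd=10, n_frames=200))
share_b_higher = sum(b > a for a, b in frames) / len(frames)
print(f"B beats A in {share_b_higher:.0%} of frames")
```

In an actual animation each pair would be drawn on screen for a fraction of a second; the frame duration is a design choice that this sketch leaves out.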

HYBRID APPROACHES
Designers can create effective uncertainty visualizations by combining different techniques rather than choosing a standard chart “type.” One example is a fan chart, made famous by the Bank of England. It depicts data up to the present, then projections into the future; uncertainty about the past is an important component in assessing uncertainty about the future. The fan chart presents probability from higher chance to lesser chance in multiple bands that represent different levels of confidence, which the reader can choose from. Readers can perceive the information through the position of the edges of the bands, as well as lightness versus darkness. Some modern software packages for statistical graphics and modeling make it easy to combine uncertainty visualization approaches.
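The nested bands of a fan chart can be approximated as central intervals at several confidence levels. The sketch below is an illustration under simple assumptions, not the Bank of England's method: it uses a normal distribution, invented forecast values and a spread that is assumed to grow with the square root of the horizon. The function name `fan_bands` is hypothetical.

```python
from statistics import NormalDist

def fan_bands(center, sd, levels=(0.30, 0.60, 0.90)):
    """Return {level: (lower, upper)} central intervals, narrowest first."""
    dist = NormalDist(center, sd)
    bands = {}
    for level in levels:
        tail = (1 - level) / 2  # probability left outside on each side
        bands[level] = (dist.inv_cdf(tail), dist.inv_cdf(1 - tail))
    return bands

# Hypothetical projection: central forecast of 2.0, with uncertainty
# assumed to widen as the horizon grows.
for horizon in (1, 4, 8):
    bands = fan_bands(center=2.0, sd=0.3 * horizon ** 0.5)
    row = ", ".join(
        f"{level:.0%}: [{lower:.2f}, {upper:.2f}]"
        for level, (lower, upper) in bands.items()
    )
    print(f"t+{horizon}: {row}")
```

Drawing the narrowest band darkest and the widest band lightest gives the reader both cues mentioned above: edge position and lightness.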
This article was originally published with the title “Confronting Unknowns” in Scientific American 321, 3, 80-83 (September 2019)
doi:10.1038/scientificamerican0919-80
ABOUT THE AUTHOR(S)
Jessica Hullman
Jessica Hullman is a professor of computer science and journalism at Northwestern University. She and her research group develop and evaluate data-visualization and data-interaction techniques to enhance reasoning about uncertainty.
Credit: Nick Higgins