Monday, February 6, 2017

Misinterpreting statistical averages — we all do it

Looking at a mass of data is often confusing at best and daunting at worst. So, our natural tendency is to summarize the data in some way, thus reducing the mass down to something more manageable. This is usually considered to be A Good Thing, but it does run the risk of being misleading.

A summary must lose information, by definition (that's what "reduction" means). What if the lost information causes us to misinterpret the summary? A summary cannot be perfect, and so our interpretation cannot be perfect, either. We need to summarize the important part of the data, and not all summaries are created equal.

Perhaps the biggest potential problem occurs when we calculate an average as our data summary. People then tend to focus on the average itself, and not on the variation of the original data around that average. This can lead to misinterpretations of the original data.

Here, I will use a few examples from the world of wine data to illustrate this point. In fact, I will show that apparent patterns in data can arise from changes in either the average or the variation, or both. We need to be aware of this in practice.

Introduction

Consider this first graph, which shows the differences in the average quality ratings of the wines from the different types of Bordeaux chateaux (first to fifth growths). We immediately see a pattern of decreasing average score as we proceed from the first to the third growths, which then do not differ much from the fourth and fifth growths.

Wine quality of Bordeaux chateaux

Each point in the graph represents an average of the quality scores from a number of chateaux, but we cannot see the variation of these scores (we have only the averages). So, is the apparent decreasing pattern among the points caused by differences in average alone or differences in variation?

There are three ways that averages can show a decreasing pattern:
  • all of the data points decrease
  • the larger values become fewer
  • the data become less variable
Obviously, the inverse of each of these must be true for increasing patterns. In the next three sections I will show an example of each of the three possibilities, in each case as an increasing pattern.
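The three possibilities listed above can be sketched with a few made-up numbers. This is only an illustration: the values are hypothetical (they do not come from any of the graphs), and the third case assumes that "less variable" means the upper values are pulled down while the lowest values stay put.

```python
from statistics import mean

# Hypothetical scores, invented purely for illustration.
baseline = [10, 12, 14, 16, 18]

# 1. All of the data points decrease: shift every value down by 2.
all_lower = [x - 2 for x in baseline]

# 2. The larger values become fewer: the two highest values no longer occur.
fewer_high = [x for x in baseline if x <= 14]

# 3. The data become less variable: the lowest value stays the same,
#    but the upper values are pulled down toward it.
less_variable = [10, 11, 12, 13, 14]

for label, data in [
    ("baseline", baseline),
    ("all lower", all_lower),
    ("fewer high", fewer_high),
    ("less variable", less_variable),
]:
    print(f"{label:<14} mean={mean(data):.1f}  min={min(data)}  max={max(data)}")
```

All three altered data sets have the same lower average than the baseline, so the average on its own cannot tell us which of the three changes occurred. The minimum and maximum values, printed alongside, are what distinguish them.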

Increase in average because all of the data points increase

This next graph shows the pattern through time of the monetary worth of alcohol exports from Australia, from 1988 to 2017. The original data are the small black points, connected by a black line.

Australian alcohol exports through time

It is clear that there is variation within and between years, and indeed this variation increases through time. It is common to summarize this type of time pattern with a running average, as shown by the thick red line — and this helps "smooth out" the pattern by averaging adjacent groups of data points. This summary is simple to interpret in this case, because all of the data points follow roughly the same pattern — they increase through time.

In this case, the summary is not misleading. I could delete the black points and line, and the red line would still be a good representation of the original data. So, presenting only the data summary would be a good way to simplify the data pattern, in this case.
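For readers who have not met a running average before, here is a minimal sketch of one common version. The function name and the window size are my own assumptions; the window used for the red line in the graph is not stated, and other variants (weighted or differently aligned windows) also exist.

```python
def running_average(values, window=3):
    """Average each consecutive group of `window` adjacent values.

    Illustrative helper only; the window length is an assumption.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical values that rise through time with some jitter.
exports = [1, 3, 2, 4, 6, 5, 7]
print(running_average(exports, window=3))
```

Each output value replaces a group of adjacent points by their mean, which is why the resulting line is smoother than the original data.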

Increase in average because the smaller values become fewer

Now let's look at a potentially misleading case.

This next graph shows the pattern through time of the vintage quality scores of Bordeaux red wine, from 1934 to 2010. Quality is measured on a 20-point scale. The original data are the blue squares, while the red line is a running average.

Quality of Bordeaux vintages through time

The data summary (running average) shows a general upward trend, which we might interpret as a general increase through time of the quality of Bordeaux red wines, particularly since the early 1970s.

But we would be wrong — that's not what the data show. The pattern in the data is that the lowest data points "disappear" as we move from left to right across the graph. This is emphasized by the added box in this next version of the graph — there are no data points within this box, but data points do occur immediately to the left of the box.

Quality of Bordeaux vintages through time

So, the original data show that there were no vintages with a score of less than 10 out of 20 from the 1970s onward, but there were quite a few such vintages before that. The higher average wine quality thus arises because the poor vintages no longer occur, not because the quality of the other vintages increases. Top-quality vintages occurred before and after 1970, but poor-quality vintages died out.
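One simple way to check for this kind of pattern is to report the minimum and maximum alongside the average for each period. The sketch below uses hypothetical (year, score) pairs that I have invented to mimic the pattern; they are not the actual Bordeaux vintage scores.

```python
from statistics import mean

# Hypothetical (year, score) pairs on a 20-point scale, for illustration only.
vintages = [
    (1950, 6), (1955, 18), (1960, 8), (1961, 19), (1965, 5),
    (1975, 14), (1982, 19), (1990, 18), (2000, 15), (2005, 13),
]

def summarize(scores):
    """Return the lowest, average, and highest score."""
    return {"min": min(scores), "mean": mean(scores), "max": max(scores)}

before = summarize([score for year, score in vintages if year < 1970])
after = summarize([score for year, score in vintages if year >= 1970])

print("before 1970:", before)
print("1970 onward:", after)
```

In these made-up numbers the maximum is the same in both periods, while the minimum and the mean both rise. That combination is the signature of poor values disappearing, rather than of every value improving.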

In this case, the summary is misleading. That is, the red line on its own would not be a good summary of the data. I cannot usefully delete the blue points and black line.

Increase in average because the data become more variable

Averages can mislead in another way, as well.

The next graph also concerns Bordeaux red wines. Each point represents a vintage (from 1940 to 1995), with the vintage quality score shown horizontally (scale 1-7) and the vintage volume (in hectoliters) shown vertically.

Quality and quantity of Bordeaux vintages

A data summary would show a general upward trend across the graph from left to right, which we might interpret as a general increase in production as wine quality increases. This particular interpretation has certainly appeared in the literature on wine production.

But we would be wrong — that's not what the data show. The pattern in the data is that for vintages with low quality there is little wine production, but for high-quality vintages the production volume can vary dramatically. This is emphasized by the added line in this next version of the graph — there are few data points above the line, which would represent poor-quality vintages with a big production.

Quality and quantity of Bordeaux vintages

So, we rarely get big production from poor-quality vintages. This means that the apparent pattern (that there is higher average wine production as vintage quality increases) occurs because high-quality vintages can be associated with big production but poor-quality vintages cannot. Vintage production does not increase with vintage quality. Instead, variation in production increases with vintage quality — production may be big or small when the quality is high, but it is usually small when quality is low.
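The same check works here: compare the spread of production within each quality group, not only the group averages. The volumes below are hypothetical, in arbitrary units, and the split into two quality groups is my own simplification of the 1-7 scale.

```python
from statistics import mean, pstdev

# Hypothetical production volumes (arbitrary units), invented for illustration.
low_quality = [2.0, 2.5, 3.0, 2.5]
high_quality = [2.0, 3.0, 6.0, 9.0, 2.5, 7.5]

def describe(volumes):
    """Return the smallest, average, and largest volume, plus the spread."""
    return {
        "min": min(volumes),
        "mean": mean(volumes),
        "max": max(volumes),
        "spread": pstdev(volumes),  # population standard deviation
    }

low = describe(low_quality)
high = describe(high_quality)

print("low quality: ", low)
print("high quality:", high)
```

In this sketch the smallest production is the same in both groups. The higher average for the high-quality group comes entirely from the wider spread, that is, from the large volumes that occur only when quality is high.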

In this case, the summary would be misleading. It is thus a good thing that no summary line is shown on the graph.

Conclusion

Any time we are looking at a data summary, we need to bear in mind that apparent patterns in that summary can be caused by any one of three underlying patterns in the data, as illustrated above. These different causes lead to different interpretations of the summary. Be wary of data summaries when you see them, unless you can also see the original data.

Data sources

First graph:
Gary M. Thompson, Stephen A. Mutkoski, Youngran Bae, Liliana Ielacqua, Se B. Oh (2008) An analysis of Bordeaux wine ratings, 1970-2005: implications for the existing classification of the Médoc and Graves. Cornell Hospitality Report 8(11): 6-17.

Second graph:
Wikimedia Commons

Third pair of graphs is modified from:
Pablo Almaraz (2015) Bordeaux wine quality and climate fluctuations during the last century: changing temperatures and changing industry. Climate Research 64: 187-199.

Fourth pair of graphs is modified from:
Gregory V. Jones, Robert E. Davis (2000) Climate influences on grapevine phenology, grape composition, and wine production and quality for Bordeaux, France. American Journal of Enology and Viticulture 51: 249-261.