Monday, January 14, 2019

The fundamental problem with wine scores

I have written several blog posts about wine-quality scores, pointing out that even though they are expressed as numbers they do not have many useful mathematical properties; and, to me, using a score with no mathematical meaning is like trying to construct a Swedish sentence by knowing the words but not the grammar. However, what I have not done, until now, is point out the fundamental issue that leads to this situation in the first place. That is, I have previously pointed out effects, but not causes.

Before proceeding to discuss the cause, however, I will point out that many wine commentators seem to treat wine scores as nothing more than a convenient way to express their own personal preferences (i.e. increasing score indicates increasing preference). Under these circumstances the scores have nothing to do with mathematics at all. Preferences could just as easily be expressed with words; and in this case they probably should be. They certainly used to be, before the 1990s, and for some commentators they still are.


The basic issue

Put formally, wine scores represent multidimensional properties that have been summarized as a single point in one dimension.

Sounds good, doesn't it? Let's put it another way: the single wine-quality number is trying to do too many things all at once.

Whenever a critic tells us how they construct their scoring scheme, they usually list a series of characteristics of wines that purportedly contribute to quality (mainly based on color, aroma, palate and body). Formally, each of these characteristics is a "dimension" of any given wine's quality.

Here is an example, taken from Steve Charters and Simone Pettigrew (2007. The dimensions of wine quality. Food Quality and Preference 18: 997-1007).

The dimensions of wine quality

In terms of quality, most commentators are interested solely in the intrinsic dimensions. However, in order to describe a wine mathematically, we would need a number for each of these intrinsic dimensions. Given this collection of numbers, we would then have a complete description of any given wine's quality.

The situation

As a prime example, take the original UCDavis wine scoring system, which covers the score range 0-20.** The characteristics of quality and their associated numbers are:
Dimension          Score
Appearance         2
Color              2
Aroma & bouquet    4
Volatile acidity   2
Total acidity      2
Sweetness          1
Body               1
Flavor             2
Bitterness         2
General quality    2

There are 10 dimensions here, and we need all 10 numbers to completely describe any given wine's quality. That is, wine quality is multi-dimensional, and we need to "see" all of those dimensions in order to evaluate the wine.

However, rather than doing this, the UCDavis system summarizes the wine down to a single number — in this case, we add the numbers for each dimension, to get a score out of 20. That is, we reduce the multi-dimensional idea of quality down to a single point in one dimension — that dimension simply goes from 0 to 20, and the point on that dimension is the quality score.
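
A minimal sketch of that summing step, in Python. The maximum points are the ones listed for the UCDavis system above; the names UCDAVIS_MAX and total_score are illustrative choices, not part of any official scorecard.

```python
# Maximum points per dimension, as listed for the UCDavis system above.
UCDAVIS_MAX = {
    "Appearance": 2,
    "Color": 2,
    "Aroma & bouquet": 4,
    "Volatile acidity": 2,
    "Total acidity": 2,
    "Sweetness": 1,
    "Body": 1,
    "Flavor": 2,
    "Bitterness": 2,
    "General quality": 2,
}


def total_score(profile):
    """Collapse a per-dimension profile into the single summary score."""
    for name, points in profile.items():
        if not 0 <= points <= UCDAVIS_MAX[name]:
            raise ValueError(f"{name}: {points} is outside 0-{UCDAVIS_MAX[name]}")
    return sum(profile.values())


# A wine with full marks on every dimension gets the maximum score.
print(total_score(UCDAVIS_MAX))  # 20
```

Note that the function takes in a whole profile and hands back one number; nothing in the return value records which dimensions contributed the points.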

The ensuing problem

The problem that arises from this situation actually applies any time we reduce a multi-dimensional concept down to a single dimension. I encountered this issue many times in my professional life as an environmental and evolutionary biologist,* so there is nothing unique about the situation as it arises in wine commentary.

The problem is this: many quite-different wines could end up with the same final score. Summarizing a set of numbers down to a single number must, by definition, lose most of the numerical information (the multiple dimensions become one dimension only). If a wine gets a score of 0, then we know the score for each dimension (it must be 0 in each case), and we have lost no information. The same applies for a wine that scores 20, as this must mean that the wine got the maximum score for each dimension. But for all other scores the situation is ambiguous.
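
To get a feel for how ambiguous the in-between scores are, here is a rough sketch that counts how many different per-dimension profiles add up to each total. It assumes, purely for illustration, that every dimension is scored in whole points from 0 up to its maximum; the function name profiles_per_total is hypothetical.

```python
# Maximum points per dimension, in the order listed for the UCDavis system above.
MAX_POINTS = [2, 2, 4, 2, 2, 1, 1, 2, 2, 2]


def profiles_per_total(max_points):
    """Count the distinct per-dimension profiles that add up to each total.

    Assumption (illustration only): each dimension takes whole-point values
    from 0 to its maximum.
    """
    counts = [1]  # with no dimensions yet, one (empty) profile has total 0
    for m in max_points:
        new_counts = [0] * (len(counts) + m)
        for total, ways in enumerate(counts):
            for points in range(m + 1):
                new_counts[total + points] += ways
        counts = new_counts
    return counts


counts = profiles_per_total(MAX_POINTS)
for total, ways in enumerate(counts):
    print(f"score {total:2d}: {ways} possible profiles")
```

Under that assumption, only the totals 0 and 20 correspond to exactly one profile; every other total can be reached in more than one way.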

Consider these two wines, which I have described using the 10 UCDavis dimensions listed above:
2 + 2 + 2 + 2 + 2 + 0 + 1 + 1 + 2 + 1 = 15
2 + 2 + 4 + 1 + 1 + 1 + 1 + 2 + 0 + 1 = 15
These would be two very different wines; but I would never know it from the final quality score.
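
The same comparison can be sketched in Python. This assumes that the numbers in the two sums above are given in the same order as the dimensions in the UCDavis list; the variable names are illustrative.

```python
DIMENSIONS = [
    "Appearance", "Color", "Aroma & bouquet", "Volatile acidity",
    "Total acidity", "Sweetness", "Body", "Flavor", "Bitterness",
    "General quality",
]

# Assumption: scores are listed in the same order as DIMENSIONS.
wine_1 = [2, 2, 2, 2, 2, 0, 1, 1, 2, 1]
wine_2 = [2, 2, 4, 1, 1, 1, 1, 2, 0, 1]

# The summary scores are identical ...
print(sum(wine_1), sum(wine_2))  # 15 15

# ... but the underlying profiles are not.
differences = [
    (name, a, b)
    for name, a, b in zip(DIMENSIONS, wine_1, wine_2)
    if a != b
]
for name, a, b in differences:
    print(f"{name}: wine 1 = {a}, wine 2 = {b}")
```

The list of differences is exactly the information that disappears once only the total is reported.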

So, you should now see why wine quality scores have a fundamental problem, if we try to treat them as mathematical concepts: how do we interpret the quality score? We have no way of knowing what the score represents in terms of the multi-dimensional concept of wine quality. Two identical scores could easily represent two very different wines.


A problem for all ratings systems

The problem discussed here is general. All ratings systems are one-dimensional, while the data on which they are based are multi-dimensional. A linear rating system makes no sense when you are combining different characteristics — we cannot combine multiple features into a single number in any way that makes much sense. That is, when we look at the final rating score we cannot tell which characteristics were important in producing it.

Take this simple situation, where value for money has two dimensions, quality and price:
A (high quality) a (expensive)
A (high quality) b (inexpensive)
B (low quality)  a (expensive)
B (low quality)  b (inexpensive)
How could I sensibly put these four groups in a single order based on value for money? We know which group is likely to be the best value for money (Ab), and we might put this at the top; and we know which is the worst value for money (Ba), and we might put this at the bottom; but what do we do with Aa and Bb in terms of value for money? If we did put them in some order, we would be doing so solely for the sake of doing so, not because it would be informative.

We have two totally different criteria, and combining them vitiates any attempt at a single order. The only system that would make sense would be multi-dimensional. That is, we should keep the ratings as Aa, Ab, Ba and Bb; the categories would thus have meaning even though their order does not.
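
One way to make this concrete is a sketch in which one group is allowed to "beat" another only when it is at least as good on both criteria and strictly better on one of them (a dominance comparison, used here as an illustration rather than as anything the wine trade actually does). The 0/1 coding of quality and cheapness, and the function names, are illustrative assumptions.

```python
from itertools import combinations

# Each group is coded as (quality, cheapness); higher is better on both axes.
# The 0/1 coding is purely illustrative.
GROUPS = {
    "Aa": (1, 0),  # high quality, expensive
    "Ab": (1, 1),  # high quality, inexpensive
    "Ba": (0, 0),  # low quality, expensive
    "Bb": (0, 1),  # low quality, inexpensive
}


def dominates(x, y):
    """True if x is at least as good as y on every axis and better on one."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


def compare(g1, g2):
    """Describe how two groups compare on value for money."""
    if dominates(GROUPS[g1], GROUPS[g2]):
        return f"{g1} beats {g2}"
    if dominates(GROUPS[g2], GROUPS[g1]):
        return f"{g2} beats {g1}"
    return f"{g1} and {g2} cannot be ordered"


for g1, g2 in combinations(GROUPS, 2):
    print(compare(g1, g2))
```

Every pair can be ordered except Aa and Bb, each of which wins on one criterion and loses on the other. Any single ranking has to break that tie by choosing how much quality is worth relative to price, and that choice is not in the data.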

This is very similar to America's Got Talent, where the judges are trying to compare a magician with a pole dancer, and deciding which is "better". Better at what? Both of them are very good with their hands, but in very different ways! No wonder most of these shows worldwide end up being won by singers.

Wine shows

So, the issue for wine-quality ratings should now be clear. The ratings are based on trying to combine a series of different characteristics, some of which are very different from each other.

This explains why a wine can win a gold medal at one show and nothing at all at the next. The judges were combining the different quality dimensions in different ways, and thereby deciding which is best — that is all that the wine shows tell us.
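
As a rough illustration of how this can happen, the sketch below scores the same two wines with two different sets of weights. Everything in it is hypothetical: the wines, the three dimensions, the 0-10 scores and the weights are all made up for the example, and real judging panels do not necessarily use explicit weights at all.

```python
# Two hypothetical wines, scored 0-10 on three hypothetical dimensions.
WINES = {
    "Wine X": {"aroma": 9, "palate": 6, "body": 5},
    "Wine Y": {"aroma": 6, "palate": 8, "body": 7},
}

# Two hypothetical shows whose judges weight the dimensions differently.
PANELS = {
    "Show 1": {"aroma": 0.6, "palate": 0.2, "body": 0.2},
    "Show 2": {"aroma": 0.2, "palate": 0.4, "body": 0.4},
}


def weighted_score(wine, weights):
    """Combine the dimension scores into one number using the given weights."""
    return sum(weights[dim] * wine[dim] for dim in weights)


def winner(weights):
    """Return the name of the wine with the highest weighted score."""
    return max(WINES, key=lambda name: weighted_score(WINES[name], weights))


for show, weights in PANELS.items():
    print(show, "->", winner(weights))
```

With these made-up numbers the first show prefers Wine X and the second prefers Wine Y, even though neither wine has changed.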

The wine shows try to alleviate the problem a bit, by having a lot of different categories, based on all sorts of features (grape variety, wine style, vintage age, etc). This certainly helps, but it brings us back to the same problem of comparing two bottles of wine based on a series of vinous characteristics that are very hard to combine into a single number. And this approach certainly does not help at all with "best wine in show" awards.

A solution?

I have discussed multi-dimensional data previously in this blog. I pointed out at the time that, if we are going to take the numbers seriously, then we actually need to draw graphs of them, not reduce them to a single number:
Summarizing multi-dimensional wine data as graphs, Part 1: ordinations
Summarizing multi-dimensional wine data as graphs, Part 2: networks
It is difficult to see the wine-buying public going for this solution, but I might discuss it in a future post.

An alternative solution?

It has sometimes been claimed that a wine score is not a number, but is more like an adjective. Well, it sure looks like a number to me, so this simply exacerbates the problem. If it is an adjective then it should be a word, not a number. I will discuss this in my next post, but as a preview: it still takes multiple words to describe all aspects of a wine's quality, and summarizing this in a word or two does not change anything — we are still summarizing multiple dimensions (expressed as words, this time) into one dimension (a small set of words).



* For example, in ecology Species Diversity is measured as a combination of two dimensions: (1) a count of the number of species, and (2) the abundance of each species. These two concepts are combined into a single number.

** Here is a more detailed overview of the UCDavis scoring scheme, taken from George Vierra (A better wine scorecard?).


Monday, January 7, 2019

The rise of the USA as the world's biggest wine consumer

Following last week's cautionary post (Is there truth in wine numbers?), we can now contemplate a few wine numbers about the USA and its position in the wine world.

Any given country's wine consumption is a product of the amount of wine each (adult) person consumes per year and the number of people in that country. To be No. 1, a country can either have a lot of people or they can each consume a lot of wine, or both. The USA is the third most populous country on the planet, after China and India, neither of which consumes a lot of wine per person (yet).
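
The arithmetic is simple enough to sketch in a few lines. The two countries and their figures below are entirely made up, and are there only to show how a large population with modest drinkers can out-consume a small population of keen ones.

```python
def total_consumption(litres_per_person, population):
    """Total annual consumption as per-person consumption times population."""
    return litres_per_person * population


# Hypothetical figures, for illustration only.
big_moderate = total_consumption(10.0, 300_000_000)  # large, modest drinkers
small_keen = total_consumption(40.0, 50_000_000)     # small, keen drinkers

print(f"{big_moderate / 1e9:.1f} billion litres")  # 3.0 billion litres
print(f"{small_keen / 1e9:.1f} billion litres")    # 2.0 billion litres
```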

So, the idea that the USA is the No. 1 wine consumer is not unexpected. However, the question is when did it become No. 1? As the first graph shows, this event did not occur until very recently.

Top wine-consuming countries 1865-2014

The data are taken from Global Wine Markets, 1860 to 2016: a Statistical Compendium, compiled by Kym Anderson, Signe Nelgen and Vicente Pinilla. The graph covers the years 1865-2014 (horizontally), showing the estimated percentage of global wine consumption (vertically) for those countries that either currently account for >5% of the consumption or have accounted for >10% at some time in the past.

The graph makes it very clear that the patterns of wine consumption have changed dramatically over the past 150 years for many countries. This is not a result of changing population sizes (which have all grown), but instead reflects changes in per person (per capita) wine consumption. These changes are shown in the second graph.

Top wine-consuming countries per capita 1865-2014

Both France and Spain have shown a slow decline in consumption per person since the 1950s, although France also had particularly notable dips during both WWI and WWII. Italy's decline per person actually dates from the 1960s, although there were erratic changes in consumption during WWI. By contrast, Germany's rise in consumption dates from the end of WWII. The rise in per person consumption in the USA dates from the end of Prohibition, not unexpectedly. The only major wine-consuming country missing from the second graph is Portugal, which actually showed an increase in per person consumption until the end of WWII, followed by a slow decline starting in the mid-1970s.

So, the declining consumption in France and Italy combined with the rising consumption in the USA finally resulted in the US taking the global lead in total consumption from the year 2010 onwards.

If you want to see some forecasts for the possible future of US wine consumption, they are discussed in 2019 U.S. alcohol consumption to increase while population growth stagnates.

Also shown on the second graph above (as the dashed line) is the consumption for Croatia, which had the globally highest per capita rate in 2014. Indeed, the per person consumption in Croatia has remained steady since the 1980s, unlike the western European countries discussed so far. Spanish per person consumption dropped below the current (Croatian) maximum way back in 1984, followed by the Italians in 2005, and the French and Portuguese in 2013. The other two countries that currently match the Croatians, Portuguese, French and Italians in their love of wine are the Moldovans and the Swiss.

Finally, it is worth illustrating just how far out in front the top three countries are, in terms of per person consumption. The final graph shows the per capita consumption (vertically) for the top 37 countries (ranked horizontally).


As you can see, all of the countries fit a nice simple (linear) mathematical model except the top three, where the people are still consuming far more wine per person than anywhere else.