Monday, May 29, 2017

How many wine-quality scales are there?

There are a number of ratings systems for describing wine quality, which use 100 points, 20 points, 5 stars, 3 glasses, etc. Unfortunately, there is usually no "gold standard" for these systems, and so no two wine commentators use these systems in quite the same way.

That is, when critics differ in their wine scores for a particular wine, it can be for one of two reasons: (i) their opinions on the wine's quality differ, or (ii) they are expressing their opinion using different numbers. So, when the critics produce the same score, they may or may not be assessing the wine as having the same quality, and similarly when they produce different scores. Each critic has their own personal version of the "100-point scale" or the "20-point scale".


This situation is similar to people speaking different languages. Simply looking at a word does not necessarily tell you what language is being used, because the same combination of letters can occur in different languages, with or without the same meaning. For example, the word "December" appears in both Swedish and English, and in this case it has the same meaning in both languages. However, the word "sex" also appears in both languages, but in Swedish it usually refers to the number 6, which is not necessarily related to any of the word's possible meanings in English.

So, if the Wine Spectator gives a wine 90 points, does that mean the same thing as when the Wine Advocate gives that same wine 90 points? Probably not. Just for variety, instead of using the 100-point scale to illustrate this topic, I will use the 20-point scale for wine quality — this emphasizes the need to translate the ratings systems to a common one.

20-point ratings systems

Many American wine drinkers are familiar with the 20-point scale developed in the 1950s by Maynard Amerine and his colleagues at the University of California, Davis, intended as a teaching tool for identifying faulty wines. This was, indeed, an attempt to produce a "gold standard" wine rating system. Each organoleptic characteristic of the wine is assigned a number of points based on its perceived quality, and these points are summed to produce the final score. In both theory and practice, everyone who uses the UCDavis scale should be "speaking the same language"; and therefore any differences in wine scores should represent differences in wine quality, not differences in language.

Sadly, not everyone has agreed with or used the UCDavis scale, especially as a general tool for wine tastings; this topic is discussed in detail in recommended books such as those by Clive S. Michelsen (Tasting and Grading Wine. 2005) and Andrew Sharp (Winetaster's Secrets. 2005). So, there are innumerable 20-point scales in use around the world, and they all seem to represent different languages. To illustrate the range of scales in use, we can compare the scores given to the same wines by different critics.

In order to standardize the scales for direct comparison, we need to translate the different languages into a common language. Jean-Marie Cardebat and Emmanuel Paroissien (American Association of Wine Economists Working Paper No. 180. 2015) have suggested doing this by converting the different scales to a single 100-point scale. The one they chose was the scale used by the Wine Advocate (which is not necessarily the same as that used by the Wine Spectator, or the Wine Enthusiast, etc.), and I will do the same here. Furthermore, I will compare the quality scales based on their scores for the five First Growth red wines of the Left Bank of Bordeaux (as described in the post How large is between-critic variation in quality scores?).

The scales for five different commentators are shown in the first graph. The original scores are shown on the horizontal axis, while the standardized score is shown vertically. The vertical axis represents the score that the Wine Advocate would give a wine of the same quality. If the critics were all speaking the same language to express their opinions about wine quality, then the lines would be sitting on top of each other; and the further apart they are, the more different are the languages.

Five different 20-point wine-quality ratings systems

Also shown is the difference in meaning for a wine that gets a score of 18 from each of the critics. If we see a wine score of 18, then La Revue du Vin de France, Jean-Marc Quarin and Bettane et Desseauve mean a somewhat better wine than does Jancis Robinson. On the other hand, Vinum Weinmagazin is indicating a somewhat worse wine. They are, indeed, all speaking different languages; and we readers need to translate between these languages in order to get their meaning.

As another example, at the end of June 2012 Decanter magazine changed from using a 20-point ratings scale to a 100-point scale (see New Decanter panel tasting system). In order to do this, they had to convert their old scores to the new scores. They used a conversion that is precisely halfway between the scoring systems of Jancis Robinson and Bettane et Desseauve, as shown in the next graph (see How to convert Decanter wine scores and ratings to and from the 100 point scale). So, this is yet another different 20-point language.

Seven different 20-point wine-quality ratings systems
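
A conversion that sits halfway between two straight-line scales can be sketched in a few lines of Python. The slopes and intercepts below are hypothetical placeholders, not the values behind the graph, and the names LinearScale and halfway are my own; only the averaging step is the point.

```python
from typing import NamedTuple


class LinearScale(NamedTuple):
    """Straight-line translation from a 20-point score to a 100-point score."""
    slope: float
    intercept: float

    def to_100(self, score_20: float) -> float:
        return self.slope * score_20 + self.intercept


def halfway(a: LinearScale, b: LinearScale) -> LinearScale:
    """Return the line lying midway between two linear scales at every score."""
    return LinearScale((a.slope + b.slope) / 2, (a.intercept + b.intercept) / 2)


# Hypothetical coefficients, for illustration only.
critic_a = LinearScale(slope=4.0, intercept=22.0)
critic_b = LinearScale(slope=3.0, intercept=42.0)
midway = halfway(critic_a, critic_b)

# The midway conversion of any score equals the mean of the two critics' conversions.
print(critic_a.to_100(18.0), critic_b.to_100(18.0), midway.to_100(18.0))
```

Averaging the slopes and the intercepts is the same as averaging the two translated scores at every point, which is what "halfway between" means for straight lines.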

So far, I have assumed that there is a linear relationship between the scores from the different critics (i.e. the graph lines are straight). However, in an earlier post (Two centuries of Bordeaux vintages) I suggested that the relationship between the Bordeaux scores from Tastet & Lawton and from Jeff Leve (the Wine Cellar Insider) is curved instead. Indeed, The World of Fine Wine magazine explicitly indicates that their 20-point scoring system is non-linear, as shown in the second graph above. This makes for a very complex language translation, indeed.
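
Under the straight-line assumption, one way to estimate a translation is an ordinary least-squares fit to the scores that two critics gave the same wines. The following is a minimal sketch of that idea, not necessarily the procedure used for the graphs above. The paired scores are invented for illustration, and fit_line and translate are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def translate(score_20, slope, intercept):
    """Translate a 20-point score to the common 100-point scale."""
    return slope * score_20 + intercept


# Invented paired scores for the same five wines (not real critic data).
critic_20 = [15.0, 16.0, 17.0, 18.0, 19.0]
common_100 = [86.0, 89.0, 92.0, 95.0, 98.0]

slope, intercept = fit_line(critic_20, common_100)
print(slope, intercept, translate(18.0, slope, intercept))
```

A curved relationship would not be captured by this fit; it would need a non-linear model or a lookup table instead.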

As we shall see in the next post (How many 100-point wine-quality scales are there?), translating between 20- and 100-point scales is not straightforward, either.

Conclusions

The short answer to the question posed in the title is: pretty much one for each commentator. Fortunately, there are not quite as many wine-quality rating systems as there are languages. Nevertheless, translating among them is just as necessary in both cases, if we are to get any meaning.

Does all of this matter in practice? Quite definitely. Indeed, every time a wine retailer plies us with a combination of critics' scores, we have to translate those scores into a common language, in order to work out whether the critics are agreeing with each other or not. Since most of us are not doing this, we may well be fooling ourselves into seeing a false sense of agreement among those critics. The world of fine wine is more complex than most people realize, or would like.

Furthermore, this issue is at the heart of the objections that mathematicians have to simply averaging wine scores across different critics. If the critics are all using different ratings scales, then the average score has no mathematical meaning. That is, if the critics are speaking different languages, then what would the "average" of those languages mean? It would be gibberish, unintelligible to anyone, even if the combination of letters looks like it might be a real word. A classic example of this is the Judgment of Paris, from 1976, in which the "official" summed scores are meaningless, because the tasters were all using different versions of the 20-point scale (see A Mathematical Analysis of The Judgment of Paris). Note also that the scores using the UCDavis scale are much higher than are the scores for the Judgment (see Was the Judgment of Paris repeatable?).
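
The problem with averaging raw scores can be seen in a small sketch. Two hypothetical critics score two hypothetical wines; the straight-line translations to a common 100-point scale are invented for illustration and do not describe any real critic.

```python
from statistics import mean

# Hypothetical (slope, intercept) translations to a common 100-point scale.
SCALES = {"critic_a": (4.0, 22.0), "critic_b": (3.0, 42.0)}


def to_common(critic, score_20):
    """Translate one critic's 20-point score to the common scale."""
    slope, intercept = SCALES[critic]
    return slope * score_20 + intercept


def raw_average(scores):
    """Average the 20-point scores as printed, ignoring who gave them."""
    return mean(list(scores.values()))


def translated_average(scores):
    """Average the scores after translating each to the common scale."""
    return mean([to_common(critic, s) for critic, s in scores.items()])


# Invented scores: the two critics swap their 17 and 18 between the wines.
wine_x = {"critic_a": 17.0, "critic_b": 18.0}
wine_y = {"critic_a": 18.0, "critic_b": 17.0}

print(raw_average(wine_x), raw_average(wine_y))
print(translated_average(wine_x), translated_average(wine_y))
```

The raw averages tie, while the translated averages do not, because an extra point from one critic is not worth the same as an extra point from the other.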

8 comments:

  1. Nice to see that you have discovered Andrew Sharp's book.

    "An extremely well written book with the most informative and perceptive chapters on wine tasting I have read.
    This is the finest book for both beginners and serious wine collectors about the actual tasting process -- lively, definitive and candid."

    ~~ Robert Parker book review

  2. From Jancis Robinson, MW Website
    (circa 2002):

    “How to Score Wine”

    Link: http://www.jancisrobinson.com/articles/how-to-score-wine?layout=pdf

    I would be much happier in my professional life if I were never required to assign a score to a wine.

    . . .

    Even I have to admit, however, that scores have their uses. . . . however much we professionals may feel our beloved liquid is too subtle to be reduced to a single number.

    I find myself using all sorts of different scoring systems depending on the circumstances. . . .

    In most of my tasting and writing I don't really need scores. . . .

    I like the five-star system used by Michael Broadbent and Decanter magazine. Wines that taste wonderful now get five stars. Those that will be great may be given three stars with two in brackets for their potential. . . .

    I know that Americans are used to points out of 100 from their school system so that now they, and an increasing number of wine drinkers around the world, use points out of 100 to assess wines. Like many Brits, I find this system difficult to cope with, having no cultural reference for it.

    So, I limp along with points and half-points out of 20, which means that the great majority of wines (though by no means all) are scored somewhere between 15 and 18.5, which admittedly gives me only eight possible scores for non-exceptional wines -- an improvement on the five star system but not much of one. (I try when tasting young wines to give a likely period when the wine will be drinking best, so I do cover the aspect of its potential for development.)

  3. From Wine Spectator "Letters" Section
    (March 15, 1994, Page 90):

    Grading Procedure

    In Wine Spectator, wines are always rated on a scale of 100. I assume you assign values to certain properties of the wines (aftertaste, tannins for reds, acidity for whites, etc), and combined they form a total score of 100. An article in Wine Spectator describing your tasting and scoring procedure would be helpful to all of us.

    (Signed)

    Thierry Marc Carriou
    Morgantown, N.Y.


    Editor’s [reply] note: In brief, our editors do not assign specific values to certain properties of a wine when we score it. We grade it for overall quality as a professor grades an essay test. We look, smell and taste for many different attributes and flaws, then we assign a score based on how much we like the wine overall.

    Replies
    1. Addendum.

      Reiterating . . . Wine Spectator has no 100 point wine scale based on "components":

      "there really is no [100 point wine] score sheet. ...

      Gloria Maroti Frazee
      director of education -- and video
      Wine Spectator

      (posted May 24, 2006)"

      Source: http://forums.winespectator.com/eve/forums/a/tpc/f/456102303/m/901103173

  4. This comment has been removed by a blog administrator.

  5. Part One of Two:

    Excerpts from Wine Times [later renamed Wine Enthusiast] (September/October 1989) interview
    with Robert Parker, publisher of The Wine Advocate

    WINE TIMES: How is your scoring system different from The Wine Spectator's?

    PARKER: Theirs is really a different animal than mine, though if someone just looks at both of them, they are, quote, two 100-point systems. Theirs, in fact, is advertised as a 100-point system; mine from the very beginning is a 50-point system. If you start at 50 and go to 100, it is clear it's a 50-point system, and it has always been clear. Mine is basically two 20-point systems with a 10-point cushion on top for wines that have the ability to age. . . .

    . . . The newsletter was always meant to be a guide, one person's opinion. The scoring system was always meant to be an accessory to the written reviews, tasting notes. That's why I use sentences and try and make it interesting. Reading is a lost skill in America. There's a certain segment of my readers who only look at numbers, but I think it is a much smaller segment than most wine writers would like to believe. The tasting notes are one thing, but in order to communicate effectively and quickly where a wine placed vis-à-vis its peer group, a numerical scale was necessary. If I didn't do that, it would have been a sort of cop-out.

    I thought one of the jokes of the 20-point systems is that everyone uses half points, so it's really a 40-point system -- which no one will acknowledge -- and mine is a 50-point system, and in most cases a 40-point system.

    WINE TIMES: But how do you split the hairs between an 81 and an 83?

    PARKER: It's a fairly methodical system. The wine gets up to 5 points on color, up to 15 on bouquet and aroma, and up to 20 points on flavor, harmony and length. And that gets you 40 points right there. And then the [ balance of ] 10 points are . . . simply awarded to wines that have the ability to improve in the bottle. This is sort of arbitrary and gets me into trouble.

    WINE TIMES: You mean when you are in the cellars of Burgundy, you look at a wine and say this is a 4 for color, a 14 for bouquet, and so on [ ? ]

    PARKER: Yes, most of the times. What happens is that I've done so many wines by now that I know virtually right away that it's, say, upper 80s, and you sort of start working backwards. And color now is sort of an academic issue. The technology of color is refined and most color is fine. My system applies best to young wines because older wines, once they've passed their prime, end up getting lower scores.

    . . .

    Replies
    1. Part Two of Two:

      Excerpts from Wine Times [later renamed Wine Enthusiast] (September/October 1989) interview
      with Robert Parker, publisher of The Wine Advocate

      WINE TIMES: Do you have a bias toward red wines? Why aren't white wines getting as many scores in the upper 90s? Is it you or is it the wine?

      PARKER: Because of that 10-point cushion. Points are assigned to the overall quality but also to the potential period of time that wine can provide pleasure. And white Burgundies today have a lifespan of, at most, a decade with rare exceptions. Most top red wines can last 15 years and most top Bordeaux can last 20, 25 years. It's a sign of the system that a great 1985 Morgon [ cru Beaujolais ] is not going to get 100 points because it's not fair to the reader to equate a Beaujolais with a 1982 Mouton-Rothschild. You only have three or four years to drink the Beaujolais.

      WINE TIMES: In your system, what would be the highest rated Beaujolais?

      PARKER: 90. That would be a perfect Beaujolais, and I've never given one. I have given a lot of 87s and 88s.

      [Bob Henry's comment: In 1990, Parker awarded a score of 92 points to the 1989 vintage Georges Duboeuf "Jean Descombes" Morgon Beaujolais, contradicting his then year-old statement above.]

      WINE TIMES: So it's the aging potential that is the key factor that gets a wine into the 90s.

      PARKER: Yes. And it goes back to how I evaluate vintages in general. To me the greatness of a vintage is assessed two ways: 1) its ability to provide great pleasure -- wine provides, above all, pleasure; 2) the time period over which it can provide that pleasure.

      . . .

  6. Further complicating David's conversion of one scoring scale to another is the internal inconsistency of Robert Parker's scale because . . . he has TWO scales.

    One for wines that don't improve with bottle age (maximum score of 90 points).

    One for wines capable of improving with bottle age (maximum score 100 points . . . the extra 10 points assigned to longevity).

    Citing his own example, a cru Beaujolais that scores 90 points (circa 1989 interview) would be considered "perfect."

    But "perfect" is not 100 points.

    (As history reveals, when the 2009 vintage cru Beaujolais were reviewed, Wine Advocate's highest score was 94 points.)

    Begging the question: "What's MORE perfect than 'perfect'?"

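
The arithmetic that Parker describes in the 1989 interview quoted in the comments (a base of 50, up to 5 points for color, 15 for bouquet and aroma, 20 for flavor, harmony and length, and 10 for the ability to improve in bottle) can be written out as a short sketch. The function name, argument names and range checks are my own assumptions, not anything published by the Wine Advocate.

```python
BASE = 50  # the scale starts at 50, per the interview
MAX_POINTS = {"color": 5, "aroma": 15, "flavor": 20, "aging": 10}


def parker_style_score(color, aroma, flavor, aging=0):
    """Sum the component points on top of the 50-point base."""
    parts = {"color": color, "aroma": aroma, "flavor": flavor, "aging": aging}
    for name, value in parts.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    return BASE + sum(parts.values())


# A wine with no aging points cannot exceed 90; full aging points allow 100.
print(parker_style_score(5, 15, 20))
print(parker_style_score(5, 15, 20, aging=10))
```

This makes the two ceilings discussed in the comments explicit: 90 when the aging component is zero, and 100 when it is full.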