Monday, June 26, 2017

What happened to Decanter when it changed its points scoring scheme

In a previous post (How many wine-quality scales are there?), I noted that at the end of June 2012 Decanter magazine changed from using a 20-point ratings scale to a 100-point scale for its wine reviews (see New Decanter panel tasting system). In order to do this, they had to convert their old scores to the new scores (see How to convert Decanter wine scores and ratings to and from the 100 point scale).

It turns out that this change had some unexpected consequences, which means that it was not as simple as it might seem. I think that this issue has not been appreciated by the wine public, and probably not even by the people at Decanter, so I will point out some of the consequences here.


We do expect that a 20-point scale and a 100-point scale should be inter-changeable in some simple way, when assessing wine quality. However, there is actually no intrinsic reason why this should be so. Indeed, Wendy Parr, James Green and Geoffrey White (Revue Européenne de Psychologie Appliquée 56:231-238, 2006) tested this idea by asking wine assessors to use both a 20-point scale and a 100-point scale to evaluate the same set of wines. Fortunately, they found no large differences between the use of the two schemes, for the wines they tested.

This makes it quite interesting that when Decanter swapped between its two scoring systems it did seem to change the way it evaluated wines. This fact was discovered by Jean-Marie Cardebat and Emmanuel Paroissien (American Association of Wine Economists Working Paper No. 180), in 2015, when they looked at the scores for the red wines of Bordeaux.

Cardebat & Paroissien assessed how similar the quality scores were for a wide range of critics, by comparing the critics pairwise using correlation analysis. If the scores from a given pair of critics were perfectly related then their correlation value would be 1, and if they were completely unrelated then the value would be 0; otherwise, the values lie somewhere in between these two extremes. Cardebat & Paroissien provide their results in Table 3 of their publication.
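For readers who want to see what such a pairwise comparison involves, here is a minimal sketch of the usual Pearson correlation between two critics who scored the same wines. The scores below are invented for illustration and are not taken from Cardebat & Paroissien, and I am assuming (not quoting) that their analysis used this standard coefficient.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one critic gives identical scores")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for the same five wines (illustration only)
critic_a = [16.5, 17.0, 18.0, 15.5, 19.0]   # a critic using a 20-point scale
critic_b = [90, 91, 94, 88, 96]             # a critic using a 100-point scale
critic_c = [92, 88, 90, 95, 89]             # another 100-point critic

print("A vs B:", round(pearson(critic_a, critic_b), 2))
print("A vs C:", round(pearson(critic_a, critic_c), 2))
```

Note that the coefficient is unaffected by a linear rescaling of either set of scores, which is why scores on a 20-point scale can be compared directly with scores on a 100-point scale in this kind of analysis.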

Of interest to us here, Cardebat & Paroissien treated the Decanter scores in two groups, one for the scores before June 2012, which used the old 20-point system, and one for the scores after that date, which used the new 100-point system. We can thus directly compare the Decanter scores to those of the other critics both before and after the change.

I have plotted the correlation values in the graph below. Each point represents the correlation between Decanter and a particular critic (four of the critics have their points labeled in the graph). The correlation before June 2012 is plotted horizontally, and the correlation after June 2012 is plotted vertically. If there were no change in the correlations at that date, then the points would all lie along the pink line.

Change in relationship to other critics when the scoring system was revised

For two of the critics (Jeff Leve and Jean-Marc Quarin), there was indeed no change at all, exactly as we would expect if the 20-point system and 100-point system are directly inter-changeable. For seven other critics (Tim Atkin, Bettane & Desseauve, Jacques Dupont, René Gabriel, Neal Martin, La Revue du Vin de France, Wine Spectator), the points are near the line rather than on it, and we might expect a small difference like this by random chance (depending, for example, on which wines were included in the dataset).

For the next two critics (Robert Parker, James Suckling), the points seem to be getting a bit too far from the line. At this juncture, it is interesting to note that the majority of the points lie to the right of the line. This indicates that the correlations between Decanter and the other critics were greater before June 2012 than afterwards. That is, Decanter started disagreeing with the other critics to a greater extent after they adopted 100 points than before; and they started disagreeing with Parker and Suckling even more than the others.

However, what happens with the remaining two critics is quite unbelievable. In the case of Jancis Robinson, before June 2012 Decanter agreed quite well with her wine-quality evaluations (correlation = 0.63), although slightly less than for the other critics (range 0.63-0.75). But afterwards, the agreement between Robinson and Decanter plummeted (correlation = 0.36). The situation for Antonio Galloni is the reverse of this: the correlation value went up instead (from 0.32 to 0.56). In the latter case, this may be an artifact of the data, because only 13 of Galloni's wine evaluations before June 2012 could be compared to those of Decanter (and so the estimate of 0.32 may be subject to great sampling variation).
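To give a rough feel for how uncertain a correlation based on only 13 wines can be, here is a minimal sketch of an approximate 95% confidence interval using the Fisher z-transformation. This is an illustration only, not a calculation from Cardebat & Paroissien; it assumes a Pearson correlation and roughly bivariate-normal scores, neither of which I can confirm for their data.

```python
import math

def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation r from n paired scores.

    Assumes a Pearson correlation and roughly bivariate-normal data;
    z_crit=1.96 corresponds to an approximate 95% interval.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("need more than 3 paired scores")
    z = math.atanh(r)                # Fisher transformation of r
    se = 1.0 / math.sqrt(n - 3)      # standard error on the transformed scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Galloni versus Decanter before June 2012: r = 0.32 from 13 wines
low, high = fisher_interval(0.32, 13)
print(f"approximate 95% interval: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval runs from roughly -0.28 to 0.74, which comfortably includes the later value of 0.56. So, for Galloni at least, the apparent change is well within what sampling variation alone could produce.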

What has happened here? Barring errors in the data or analyses provided by Cardebat & Paroissien, this seems quite difficult to explain. Mind you, I have shown repeatedly that the wine-quality scores provided by Jancis Robinson are usually at variance with those of most other critics (see Poor correlation among critics' quality scores; and How large is between-critic variation in quality scores?), but this particular example does seem to be extreme.

For the Cardebat & Paroissien analyses, both Jancis Robinson and Antonio Galloni have the lowest average correlations with all of the other critics, with 0.46 and 0.45, respectively, compared to a range of 0.58-0.68 for the others. So, in this dataset there is a general disagreement between these two people and the other critics, and also a strong disagreement with each other (correlation = 0.17). It is thus not something that is unique to Decanter, but it is interesting that the situation changed so dramatically when Decanter swapped scoring schemes.

References

Jean-Marie Cardebat, Emmanuel Paroissien (2015) Reducing quality uncertainty for Bordeaux en primeur wines: a uniform wine score. American Association of Wine Economists Working Paper No. 180.

Wendy V. Parr, James A. Green, K. Geoffrey White (2006) Wine judging, context and New Zealand sauvignon blanc. Revue Européenne de Psychologie Appliquée 56:231-238.

7 comments:

  1. "We do expect that a 20-point scale and a 100-point scale should be inter-changeable in some simple way, when assessing wine quality. However, there is actually no intrinsic reason why this should be so. ..."

    Those who use the UC Davis 20-point scale are obliged to address how a wine scores against a checklist of "components."

    "The Davis system is quite straightforward. It assigns a certain number of points to each of ten categories [components] which are then totaled to obtain the overall rating score for a given wine:

    17 - 20 Wines of outstanding characteristics having no defects
    13 - 16 Standard wines with neither outstanding character or defect
    9 - 12 Wines of commercial acceptability with noticeable defects
    5 - 8 Wines below commercial acceptability
    1 - 5 Completely spoiled wines

    •Appearance (2 points)
    •Color (2 points)
    •Aroma and Bouquet (4 points)
    •Volatile Acidity (2 points)
    •Total Acidity (2 points)
    •Sweetness/sugar (1 point)
    •Body (1 point)
    •Flavor (1 point)
    •Astringency (2 points)
    •General Quality (2 points)"

    Source: http://finias.com/wine/ucd_scoring.htm

    By contrast, Wine Spectator has no such 100 point wine scale based on "components":

    "there really is no [100 point wine] score sheet. ...

    Gloria Maroti Frazee
    director of education -- and video
    Wine Spectator

    (posted May 24, 2006)"

    Source: http://forums.winespectator.com/eve/forums/a/tpc/f/456102303/m/901103173

    Elaborating . . .

    "Grading Procedure

    In Wine Spectator, wines are always rated on a scale of 100. I assume you assign values to certain properties [components] of the wines (aftertaste, tannins for reds, acidity for whites, etc), and combined they form a total score of 100. An article in Wine Spectator describing your tasting and scoring procedure would be helpful to all of us.

    (Signed)

    Thierry Marc Carriou
    Morgantown, N.Y.

    Editor’s note: In brief, our editors do not assign specific values to certain properties [components] of a wine when we score it. We grade it for overall quality as a professor grades an essay test. We look, smell and taste for many different attributes and flaws, then we assign a score based on how much we like the wine overall."

    Source: Wine Spectator "Letters" section, March 15, 1994, Page 90

    Note well: you cannot simply multiply a UC Davis 20-point scale score by five to arrive at a comparable Wine Spectator 100-point scale score.

    Replies
    1. Okay, so that no one accuses me of being innumerate.

      Cognitive dissonance set in when I did the mental math re-reading this post titled "Davis Scoring System":

      http://finias.com/wine/ucd_scoring.htm

      Add up the values:

      •Appearance (2 points)
      •Color (2 points)
      •Aroma and Bouquet (4 points)
      •Volatile Acidity (2 points)
      •Total Acidity (2 points)
      •Sweetness/sugar (1 point)
      •Body (1 point)
      •Flavor (1 point)
      •Astringency (2 points)
      •General Quality (2 points)

      They DON'T total 20 points.

      Nineteen is the sum.

      Hmmm . . . let me troll the Web for a second, better source.

  2. An explanation by Jancis Robinson predating her conversion to the 100-point scale . . .

    From Jancis Robinson, MW Website
    (circa 2002):

    “How to Score Wine”

    Source: http://www.jancisrobinson.com/articles/how-to-score-wine?layout=pdf

    I would be much happier in my professional life if I were never required to assign a score to a wine.

    . . .

    Even I have to admit, however, that scores have their uses. . . . however much we professionals may feel our beloved liquid is too subtle to be reduced to a single number.

    I find myself using all sorts of different scoring systems depending on the circumstances. . . .

    In most of my tasting and writing I don't really need scores. . . .

    I like the five-star system used by Michael Broadbent and Decanter magazine. Wines that taste wonderful now get five stars. Those that will be great may be given three stars with two in brackets for their potential. . . .

    I know that Americans are used to points out of 100 from their school system so that now they, and an increasing number of wine drinkers around the world, use points out of 100 to assess wines. Like many Brits, I find this system difficult to cope with, having no cultural reference for it.

    [Note her discomfort with the 100-point scale. ~~ Bob]

    So, I limp along with points and half-points out of 20, which means that the great majority of wines (though by no means all) are scored somewhere between 15 and 18.5, which admittedly gives me only eight possible scores for non-exceptional wines -- an improvement on the five star system but not much of one. (I try when tasting young wines to give a likely period when the wine will be drinking best, so I do cover the aspect of its potential for development.)

  3. One would "think" that finding a reproducible version of the UC Davis 20-point scoring scale would be easy on the Web.

    Not so . . .

    Found here at Wines.com
    (February 18, 2011):

    “UC Davis scoring system”

    Link: http://www.wines.com/wiki/uc-davis-scoring-system/

    "The Davis system was developed by Dr. Maynard A. Amerine, Professor of Enology at the University of California at Davis, and his staff in 1959 as a method of rating the large number of experimental wines that were being produced at the university.

    The Davis system is quite straightforward. It assigns a certain number of points to each of ten categories which are then totaled to obtain the overall rating score for a given wine.

    "17 – 20 Wines of outstanding characteristics having no defects
    13 – 16 Standard wines with neither outstanding character or defect
    9 – 12 Wines of commercial acceptability with noticeable defects
    5 – 8 Wines below commercial acceptability
    1 – 5 Completely spoiled wines

    Appearance (2 points)
    Color (2 points)
    Aroma and Bouquet (4 points)
    Volatile Acidity (2 points)
    Total Acidity (2 points)
    Sweetness/sugar (1 point)
    Body (1 point)
    Flavor (1 point)
    Astringency (2 points)
    General Quality (2 points)"

    (Subtext: “[General quality is] The only category for SUBJECTIVE appraisal, adjusting the score on the basis of the wine’s total performance.”) [CAPITALIZATION used for emphasis. ~~ Bob]

    Guess what? The numbers don't add up to 20. Rather, nineteen. (Again.)

    Still searching . . .

  4. In a different section of Wines.com we find this text likewise dated February 18, 2011.

    "taste scoring systems"

    Link: http://www.wines.com/wiki/taste-scoring-systems/

    "In the UC Davis 20 point system, points are given for the following categories: Appearance (2), Color (2), Aroma & Bouquet (4), Volatile Acidity (2), Total Acidity (2), Sugar (1), Body (1), Flavor (1), Astringency (1), and General Quality (2)."

    Add up the numbers. They total 18. Not twenty.

    Still searching . . .

  5. From Wine Business Monthly:
    (November 2005 Issue):

    “A Better Wine Scorecard?;
    Napa Valley College's new wine scoring system objectively analyzes wine while also allowing for relevant notes on wine style, character, aging, cost and where the wine can be purchased.”

    Link: https://www.winebusiness.com/wbm/index.cfm?go=getArticle&dataId=41491

    By George Vierra

    UC Davis 20-Point Scorecard

    APPEARANCE (2 points)

    COLOR (2 points)

    AROMA and BOUQUET (4 points)

    VOLATILE ACIDITY (2 points)

    TOTAL ACIDITY (2 points)

    SWEETNESS (1 point)

    BODY (1 point)

    FLAVOR (2 points)

    BITTER/ASTRINGENT (2 points)

    GENERAL QUALITY (2 points)

    TOTAL RANKING (20 points)

  6. Update:

    David has apprised me via e-mail that the 1976 version of the UC Davis scale revising the original 1959 version drops Volatile Acidity and adds 2 points to Aroma & Bouquet, and splits the 2 points for Bitterness into 1 point each for Bitterness and for Astringency.

    So reworking the numbers to arrive at the "modified" UC Davis scale:

    APPEARANCE (2 points)

    COLOR (2 points)

    AROMA and BOUQUET (6 points)

    VOLATILE ACIDITY (0 points -- deleted)

    TOTAL ACIDITY (2 points)

    SWEETNESS (1 point)

    BODY (1 point)

    FLAVOR (2 points)

    BITTERNESS (1 point)

    ASTRINGENCY (1 point)

    GENERAL QUALITY (2 points)

    TOTAL RANKING (20 points)

    ("Opine on wine": it shouldn't be this difficult to find the current UC Davis scale -- somewhere -- at the university's department of viticulture and enology program website, or on the Web.)
