Monday, August 6, 2018

Why not expand the 100-point scale?

Value judgments are usually presented on some sort of quantitative scale, with an upper limit of maybe 5 stars or 10 points, or even 100 points. In most cases, the maximum value represents the best quality that the evaluators expect to see.* This leads to a potential problem when someone or something achieves that quality. What happens next, now that we know the maximum can be achieved? What do we do when someone does even better?


For example, at the 1984 Winter Olympics the ice-dancing pair of Jayne Torvill and Christopher Dean received the maximum artistic-impression score of 6.0 from every judge, which had never happened before (for a single performance). Does this mean that no one can ever do better? Not unexpectedly, the International Skating Union's International Judging System eventually replaced the previous 6.0 system (in 2004), so that scores no longer get near the maximum possible.

In a similar vein, it has been pointed out innumerable times that the top end of the 100-point wine-quality scale has become unnaturally crowded. This graph of the frequency distribution of some of Robert Parker's wine scores illustrates the issue (taken from my post Biases in wine quality scores). Here, the height of each vertical bar in the graph represents the proportion of wines receiving each score, as shown horizontally.


There is a distinct bump in the graph at a score of 100, indicating that more wines are being awarded this score than would be expected. This is precisely what happens when we reach the ceiling of any quality scale — there are lots of very good wines, and we cannot distinguish among them because we have to give them all the same score: 100.
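
The ceiling effect can be illustrated with a small sketch. The latent quality values below are invented purely for illustration (they are not Parker's data), and the function name is hypothetical. The point is only that capping everything at 100 piles the best wines onto the same top score.

```python
from collections import Counter


def cap_scores(latent_scores, ceiling=100):
    """Report each latent quality value as a score, capped at the ceiling."""
    return [min(score, ceiling) for score in latent_scores]


# Hypothetical latent qualities, invented for illustration only.
latent = [96, 97, 98, 99, 100, 101, 102, 103]

reported = cap_scores(latent)

# Count how many wines end up on each reported score.
counts = Counter(reported)
```

In this toy example, every wine with a latent quality of 100 or more is reported as exactly 100, so that score collects more wines than any of its neighbours.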

We probably need to address this issue. Given the large subjective component in such ratings, there are only two general ways to go about this. We either:
  1. re-scale the 100-point scale, thus reducing the quality implication of the scores, so that "100-point wines" no longer get 100 points but instead get a wider range of lower points; or 
  2. go past the 100 limit, and start doling out scores that exceed 100 points.
This raises the question of whether the latter option has ever been chosen. Indeed, it has happened at least once that I know of (and there may be others).

In September 1998, Jancis Robinson posted on her web site a set of quality scores from a vertical tasting of the wines of Château d'Yquem (Notes from attending an Yquem vertical tasting).** The data are shown in the next graph, with the quality scores vertically and the wine vintages horizontally. The first two vintages were the "Thomas Jefferson wines" supplied by Hardy Rodenstock, and so their provenance is considered doubtful.

Jancis Robinson's wine-quality scores for Château d'Yquem

The quality of the remaining wines is nominally scored on Robinson's usual 20-point scale. Note that three of the wines received a score of 20, while four of them were awarded scores that notably exceed 20 points (marked by the red line). Robinson made no comment about her unexpected scores, but she did use a series of superlatives in her tasting notes, the like of which we do not usually see from her pen (eg. "absolutely extraordinary").
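
These counts can be checked against the scores transcribed in the comments below (no numerical score appears there for the two Jefferson bottles). This is only a minimal sketch: the dictionary re-keys those transcribed scores by vintage, and the variable names are arbitrary.

```python
# Robinson's scores as transcribed in the comments to this post
# (vintage -> points on her nominal 20-point scale).
robinson = {
    1811: 25, 1828: 18, 1831: 24, 1847: 26, 1861: 23, 1893: 19,
    1899: 19, 1900: 16, 1911: 19, 1931: 17, 1945: 20, 1947: 18,
    1949: 19, 1950: 17, 1958: 18, 1960: 17, 1968: 16, 1969: 13,
    1971: 16, 1973: 14, 1983: 20, 1988: 18.5, 1990: 20, 1991: 17,
}

SCALE_MAX = 20  # nominal top of the scale

# Vintages scored exactly at the nominal maximum.
at_max = sorted(v for v, s in robinson.items() if s == SCALE_MAX)

# Vintages scored beyond the nominal maximum.
above_max = sorted(v for v, s in robinson.items() if s > SCALE_MAX)
```

Here `at_max` holds the three vintages scored at 20 and `above_max` the four that went past it.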

Obviously, Robinson has her own personal quality scale, and what we are presumably being told here is that these wines exceed her usual expectations for a "20-point wine". It therefore seems to me that this is a prime example of option (2) presented above.

As such, the question now arises as to whether this approach was actually necessary in this particular case. We might find a possible answer by looking at what other people have done when confronted with these same wines.

As one example, Per-Henrik Mansson published a set of quality scores for many of the same wines in the May 1999 issue of the Wine Spectator magazine (Three centuries of Château d'Yquem). He used a 100-point scale for his scores, so I have converted them to a 20-point scale for the comparison shown in the next graph (Mansson's relevant scores are in maroon).
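
The text above does not spell out how the conversion was done. One simple possibility, assumed here purely for illustration, is proportional rescaling, so that 100 points maps to 20 points. The function name is hypothetical.

```python
def to_20_point(score_100):
    """Proportionally rescale a 100-point score to a 20-point scale.

    This is an assumed conversion, not necessarily the one used for the graph.
    """
    return score_100 * 20 / 100


# Examples: the top of the scale, and Mansson's lowest reported score.
top = to_20_point(100)
lowest = to_20_point(55)
```

Other conversions are possible (for example, treating 50 as the effective zero of the 100-point scale), and they would shift the comparison accordingly.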

Comparison of scores from Jancis Robinson and Per-Henrik Mansson

The correlation between the two sets of scores is 48%, which is slightly higher than we have come to expect from wine professionals (10-40%). However, Mansson never exceeded the nominal limit of his scale — of the 121 scores in his article, there are four 100-point scores, but none scored higher. Indeed, a comparison of the scores on the 20-point scale shows that Robinson's scores are generally 25% higher than Mansson's, across the board.
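
For readers who want to repeat this kind of comparison, here is a minimal sketch. It assumes Pearson's correlation coefficient, which is a common choice but is not confirmed above as the statistic used. The two short lists are placeholder values, not the actual scores.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either sequence is constant.
    return cov / sqrt(var_x * var_y)


# Placeholder scores for two critics, invented for illustration only.
critic_a = [16, 17, 18, 19, 20]
critic_b = [15, 17, 16, 19, 18]

r = pearson(critic_a, critic_b)
```

Only vintages scored by both critics should go into such a calculation, paired in the same order.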

I think that we might therefore argue that Mansson has provided an example of option (1) presented above (ie. re-structuring the scale so that we don't bump our head against the score ceiling). Actually, Mansson provided nine scores that are <70 and 30 scores that are <80, so that he used a large part of the score range from 50-100 points (his lowest score is 55). This wide range of scores would be considered very unusual during the 20 years since he published his scores!

As a final note, there are only two vintages for which Robinson and Mansson strongly disagree — Robinson scored the 1931 vintage much higher than did Mansson, and he returned the favor with the 1971 vintage.



* This was not actually true at the undergraduate university I attended. The final (research) year of my science degree was assessed on a scale of 1-20. In this case, 20 points represented perfection, which could not be obtained in practice by anyone, let alone a student. Nor could a student get 18 or 19 points, although these might be obtained by a professional scientist. The best that might be expected for a student was 16 points, in which case the student was awarded the University Medal, which happened only occasionally. The top mark that might regularly be expected (ie. every year) was 14 points. At the other end, 0 points was a fail at the Honours year, which meant that the student would get a Pass award, instead.

** Thanks to Bob Henry for providing a copy of the blog post.

13 comments:

  1. With David’s indulgence, let me quote in multiple sequential comments Jancis Robinson MW, who discarded her own 20-point scale and REINVENTED in the moment an unprecedented 26 point scale when savoring ethereal older bottles of Yquem.



    PART ONE:

    From Jancis Robinson, Master of Wine Web Site
    (Posted September 1998):

    “Notes From Attending an Yquem Vertical Tasting”

    Ms. Robinson’s prefacing remarks: “All in bottles with original cork unless stated otherwise.”

    1784
    (President Thomas Jefferson’s collection bottle, 1 tasted by Michael Broadbent, H R [host Hardy Rodenstock] and German cronies at Wiesbaden in 1985 soon after H R’s acquisition of this Thomas Jefferson collection from a mysterious “bricked up cellar in Paris” before another was auctioned at Christie’s in 1986. The bottle tasted in 1998 was much darker than that described in 1985.)

    Very dark brown syrup with copper coloured rim. Bottle stink immediately after pouring. After 5-10 minutes a very beguiling bouquet of dried roses emerged and the wine was lively, aromatic, fragrant for a good 40 to 50 minutes. On the palate the wine was very gentle, very delicate, very feminine to the 1787’s more aggressive appeal, and the sweet fruit was lovely and very, very long before fading (earlier than the 1787). A marvel of a relic rather than unmitigated pleasure.

    1787
    (another dark Thomas Jefferson bottle, engraved not labelled, with a deep punt).
    Deep, deep brown with a greenish rim and, like the 1784, smelt slightly mouldy at first. There was definite life here, however, in a wine that was slightly treacly, extremely lively with marked but not unpleasant acidity. On the palate a burnt sugar start, dry finish, no great persistence. After 40 minutes there was an intense nose of chestnuts, autumnal and briary. More robust and concentrated but less charming than the 1784. Powerful, chunky.

    1811
    (the year of the comet)
    A quite amazing wine, served blind with 1831, 1911 and 1931 it was the most intense, yet least evolved of the lot.

    Deep amber with green gold rim. So vibrant and multilayered on the nose, it smelt as though it was just starting to unfold, yet was utterly convincing about the treasures it had yet to give up. Spicy and rich and so, so piercingly clean. Racy, long piercing essence of cream and spice. Very, very powerful, long and complete. After 40 minutes in the glass it took on a hint of rum toffees which is not a flavour I happen to like (c.f. the greater delicacy of the 1847) but that is the only criticism I could possibly muster. This is presumably a one-off and probably deserves an even higher ranking than the 1847. 25 points [Bob Henry comment: her 25 point score exceeds her own 20 point scale.] and still a great deal to give. I hope very much to have a chance to taste it again before I die.

  2. PART TWO:

    1828
    Deep amber.
    Nose not quite knit, slightly volatile. Dry finish. Less intense than the other wine vaguely in this style, the 1899 (as well it might be). 18 points and going downhill slowly.

    1831
    Wide, pale rim with a heart of deep amber. Very very intense yet subtle nose with notes of nuts and cream. A superb wine with layers and layers of flavour and richness. Angelo Gaja suggested baby powder and roasted hazelnuts. Wonderfully smooth texture. Its effect on this jaded palate was medicinal in the best possible way: a quite delicious pick-me-up. So long, yet delicate. A great, great wine that happened to be served with one or two even greater ones. 24 points [sic] and probably at its peak.

    1847
    The big issue of the day was whether this or the 1811 was ‘better’. (Both were absolutely extraordinary. The 1847 gave me more pure tasting pleasure, but apparently this wonderfully pure scent of raspberries and vanilla cream had been apparent on the 1858 and the 1869 tasted previously, whereas there is nothing quite like the 1811 for intensity and youthfulness.) Relatively light tawny-amber. Extraordinary nose, at first perfectly ripe, warm raspberries and then heady vanilla cream. Beautifully balanced. Gentle. Delicate. Perfect texture. Nothing could be finer. 26 points [sic] and probably still climbing, although the 1811 will outlast it.

    1861
    Extraordinary in every way. Looked almost like black syrup, a PX, with gamboge rim. Smelt of treacle toffee and tea and moved like a thick treacle too. Very very sweet and concentrated. Certainly not fine but, amazingly, well balanced. A one-off. 23 points [sic] and nearly at its peak.

    1893 (recorked 1996)
    Very very deep brownish mahogany; looks thick and treacly. Correct nose of sturdy deep richness. Intense flavour of a much more conventionally massive build than the 1899. Lots of ripeness and length and potential. 19 points and still a long way to go.

    1899 (recorked 1994)
    Layered mahogany. No nose to begin with but delicate and somehow convincing. Lovely dancing delicate texture on the palate. Great sweetness counterbalanced by acidity. Not one of the pinnacles of this tasting but a gorgeous and extremely satisfying wine. 19 points and still improving.

    1900 (recorked 1900)
    Fox red of only medium intensity and a yellow-green rim. Sweet and heady with a slight hint of estufa on the nose. Light in weight and sweetness with a slightly dry end. 16 points and fading.

    1911 (recorked 1996)
    Hint of dark brown (as opposed to rich mahogany) in slightly lacklustre hue. Initially slightly mouldy but underneath a gorgeous bouquet of steeped raisins. Very, very sweet at first with notable acid at the end of the palate. Slightly spindly c.f. the 1811 and 1831 it was served with. NB recorking. 19 points and ready.

  3. PART THREE:

    1931 (recorked 1997)
    Very clear, pale amber. Pure, clean, sharp but not especially intense nose. Quite lean and light with almost madeira-like acidity. With its less-than-usual charge of sweetness and exceptionally palate-rinsing-like crispness, this was the only wine that might have been difficult to recognise immediately as great Sauternes. It could almost have been a very old, light fortified wine. 17 points and on the way down.

    1945
    Very very deep mahogany, extremely viscous. Yellow/green rim. Essence of rose petals on nose with something almost suggestive of oak. Rich, complex nose. Very intense flavour, extremely sweet – fuller and rounder than 1947 or 1949. Perfect texture, balancing acidity, and so much more than just sweet. 20 points and ready.

    1947
    As deep a mahogany as 1945 with similar development at rim. Smells creamy with hint of something vegetal and a floral topnote. Not as overwhelmingly sweet as either 1945 or 1949 but extremely youthful, lively and crisp. Could be great with nuts; less so with anything very sweet. Those who know the wine better than me were slightly disappointed by this bottle. 18 points and still considerable evolution to come.

    1949
    Deep tawny/amber with pale yellow rim. Scent of raisins, not as subtle a nose as the 1945 or 1947 with only medium intensity but.. on the palate a great thwack of purest raisin cream with great length of flavour. 19 points and not yet at peak.

    1950
    Looks much less viscous and much paler than the three vintages above; deep gold with some amber highlights. Relatively lightweight on the nose but definite creme brulee. A hint of something not 100 per cent clean about this bottle. Creamy and sweet on the palate, very refreshing, long, could give enormous pleasure served in isolation; next to the heavyweights of the 1940s it looked very slightly lean. 17 points and still evolving.

    1958
    Lively, deep orange and tawny. Both grass and sweetness on the very intense nose. Very complete palate. Extremely long and complex with many reverberations. 18 points and still climbing.

  4. PART FOUR:

    1960
    Deep tawny with brown notes. Intense nose with strong floral notes on Christmas pudding flavours. Full, round, rich, long but slightly brawny and drying out at the end. Not fine, rather aggressive and old, hint of maderisation. 17 points and going downhill. [Bob Henry’s comment: 17 points is a rather high score for a “Not fine . . . maderi[zed]” wine.]

    1968
    Deep tawny marmalade colour. Very slightly mousey to begin with on the nose. Palate very rich and extremely long, but a bit of dryness at the end. Not complete; a bit jagged. 16 points and probably near its peak.

    1969
    Lively colour of a ginger cat. Looks more like an Australian stickie than an Yquem. Smelt of ginger Edinburgh rock. Very unsubtle. 13 points; can’t imagine evolution.

    1971
    Deep butterscotch colour. Rich creme brulee scent. Very very full flavoured, quite brutal impact on the palate. This wine could become something splendid but for the moment is about 16 points.

    1973
    Pale tawny. Relatively simple, sugary nose. Lots of unresolved acidity on the palate. The wine may well improve in bottle but is an awkward. 14 points at the moment.

    1983 (imperiale)
    Deep apricot colour. Exotic nose of dried tropical fruit – mango? Gorgeous, full bodied, delightfully middle aged, between youthful and embryonic and fully blown. Long and powerful though a drying hint of dried apple peel on the finish. 20 points and still a long way to go.

    1988 (double magnum)
    Very pale straw. Lovely pure botrytis notes. Youthful reminder of quite what a transformation bottle age is. Still relatively simple but very pleasurable. Sweet, uncomplicated, beguiling. 18.5 points with decades to go.

    1990 (magnum)
    Light youthful gold. Peachy smell redolent of botrytis. Also a hint of fine polished wood on the nose. Very long, firm, sleek, confident. Big and rich. 20 points and a long long way to go.

    1991
    Paler than 1990. Pale gold. Smells of very ripe pears. Relatively simple and unevolved. Clean quite tart palate. Noticeably lighter bodied than 1990 — not as pure and tingling either. 17 points and developing but not a great Yquem.

  5. It is my understanding that Jancis Robinson MW has since revised her above-20-point Yquem scores downward to a maximum of 20 points.

    I leave to others to locate that revised article and its URL.

  6. Let’s jump into our H.G. Wells time travel machine back to 1989.

    Robert Parker is being interviewed by Wine Times magazine.

    [Sorry, no link. Wine Times never made it into the digital age. And apparently no one “archived” its contents for posterity. One more example that debunks the belief that “everything is on the Internet.”]

    WINE TIMES: How is your scoring system different from The Wine Spectator’s?

    PARKER: Theirs is really a different animal than mine, though if someone just looks at both of them, they are, quote, two 100-point systems. Theirs, in fact, is advertised as a 100-point system; mine from the very beginning is a 50-point system. If you start at 50 and go to 100, it is clear it’s a 50-point system, and it has always been clear. Mine is basically two 20-point systems with a 10-point cushion on top for wines that have the ability to age. . . .

    I thought one of the jokes of the 20-point systems is that everyone uses half points, so it’s really a 40-point system — which no one will acknowledge — and mine is a 50-point system, and in most cases a 40-point system.

    WINE TIMES: But how do you split the hairs between an 81 and an 83?

    PARKER: It’s a fairly methodical system. The wine gets up to 5 points on color, up to 15 on bouquet and aroma, and up to 20 points on flavor, harmony and length. And that gets you 40 points right there. And then the [balance of] 10 points are . . . simply awarded to wines that have the ability to improve in the bottle. This is sort of arbitrary and gets me into trouble.

    . . .

    WINE TIMES: Do you have a bias toward red wines? Why aren’t white wines getting as many scores in the upper 90s? Is it you or is it the wine?

    PARKER: Because of that 10-point cushion. Points are assigned to the overall quality but also to the potential period of time that wine can provide pleasure. And white Burgundies today have a lifespan of, at most, a decade with rare exceptions. Most top red wines can last 15 years and most top Bordeaux can last 20, 25 years. It’s a sign of the system that a great 1985 Morgon [BEAUJOLAIS] is not going to get 100 points because it’s not fair to the reader to equate a BEAUJOLAIS with a 1982 Mouton-Rothschild. You only have three or four years to drink the BEAUJOLAIS.

    WINE TIMES: In your system, what would be the highest rated BEAUJOLAIS?

    PARKER: 90. That would be a perfect BEAUJOLAIS, and I’ve never given one. I have given a lot of 87s and 88s.

    [Bob Henry’s comment: In 1990, Parker awarded a score of 92 points to the 1989 vintage Georges Duboeuf “Jean Descombes” Morgon BEAUJOLAIS, contradicting his then year-old statement above.

    Fast forward to 2011: the stellar 2009 vintage cru BEAUJOLAIS garnered scores in the 91 to 94 point range from Wine Advocate.

    So what's more perfect than "perfect"?]

    WINE TIMES: So it’s the aging potential that is the key factor that gets a wine into the 90s.

    PARKER: Yes. And it goes back to how I evaluate vintages in general. To me the greatness of a vintage is assessed two ways: 1) its ability to provide great pleasure — wine provides, above all, pleasure; 2) the time period over which it can provide that pleasure. . . .

    [CAPITALIZATION added for emphasis. ~~ Bob]
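
The arithmetic Parker describes in this interview can be laid out as a short sketch. The function and parameter names are hypothetical; only the component caps (5, 15, 20 and 10 points, on top of a starting point of 50) come from his answers above.

```python
BASE = 50  # Parker: "If you start at 50 and go to 100"

# Maximum points per component, as described in the interview.
CAPS = {"color": 5, "bouquet": 15, "flavor": 20, "aging": 10}


def parker_style_score(color, bouquet, flavor, aging):
    """Add the component points to the base of 50 (illustrative names only)."""
    parts = {"color": color, "bouquet": bouquet, "flavor": flavor, "aging": aging}
    for name, value in parts.items():
        if not 0 <= value <= CAPS[name]:
            raise ValueError(f"{name} must be between 0 and {CAPS[name]}")
    return BASE + sum(parts.values())


# Full marks everywhere, including the 10-point ageing cushion.
full_marks = parker_style_score(5, 15, 20, 10)

# Full marks on everything except ageing potential.
no_aging = parker_style_score(5, 15, 20, 0)
```

With no points for ageing potential, the total stops at 90, which matches the ceiling Parker quotes for a "perfect" Beaujolais.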

  7. Quoting Robert Parker from a 2002 issue of The Wine Advocate:

    “. . . Readers often wonder what a 100-point score means, and the best answer is that it is pure emotion that makes me give a wine 100 instead of 96, 97, 98 or 99.”

    That is pretty capricious.

  8. The challenge for every wine critic is assessing "How high is high?"

    If s/he has never tasted a truly transcendent, life altering wine from a specific grape variety . . . then what is "perfection" (if perfection is noted by a 20 point or 100 point score)?

    I have written in a comment before on David's blog that many highly experienced wine tasters consider the 1947 Cheval Blanc to be the best red wine ever made from Bordeaux.

    Perhaps the best red wine ever made. Possibly even the best wine red or white or sparkling made -- period.

    (See Mike Steinberger's February 13, 2008 posted Slate column titled "The Greatest Wine on the Planet: How the 1947 Cheval Blanc, a defective wine from an aberrant year, got so good."

    URL: http://www.slate.com/articles/life/drink/2008/02/the_greatest_wine_on_the_planet.single.html)

    If that assessment arrived at by consensus is correct, then all other wines and their scores need to be calibrated against that reference standard.

    And if correct it begs the question: if you as a wine critic have never tasted the 1947 Cheval Blanc, then you cannot assign a wine a 100 point score . . . because your proverbial measuring yard stick comes up less than 36 inches. [*]

    [*Apologies to non-Americans who embrace the metric system.]

  9. A postscript.

    Let me quote Steinberger from his book titled "The Wine Savant: A Guide to the New Wine Culture" (W. W. Norton & Company © 2013).

    Chapter Heading: “Bucket List Wines”

    "The 1947 Cheval [Blanc] is probably the most celebrated wine of the twentieth century. It is the wine every grape nut wants to experience, a wine that even the most jaded aficionados will travel thousands of miles to taste. A few years ago I wrote an article for Slate about the ’47 Cheval, a piece that culminated with my one and only taste of this fabled Bordeaux. I went to Geneva, Switzerland, to try the Immortal One, and it was well worth the journey. The wine was simply amazing. The moment I lifted the glass to my nose and took in that sweet, spicy, arresting perfume, my notion of excellence in wine and my understanding of what wine was capable of were instantly transformed – I COULD ALMOST HEAR THE [SCORING] SCALES RECALIBRATING IN MY HEAD. The ’47 was the warmest, richest, most decadent wine that I’d ever encountered. Even more striking than its opulence was its freshness. The flavors were redolent of stewed fruits and dead flowers, yet the wine tasted alive; it bristled with energy and purpose. It was a sensational experience . . ."

    [CAPITALIZATION used for emphasis. ~~ Bob]

  10. David writes:

    "There is a distinct bump in the graph at a score of 100, indicating that more wines are being awarded this score than would be expected. This is precisely what happens when we reach the ceiling of any quality scale — there are lots of very good wines, and we cannot distinguish among them because we have to give them all the same score: 100."

    Courtesy of Wine Searcher: a list of wines awarded 100-point scores by Robert Parker.

    https://www.wine-searcher.com/robertparker.lml

  11. "Perfection isn't perfect: Parker says only 50% of his 100-point scores are repeatable"
    The Gray Report (May 13, 2015)

    URL: https://blog.wblakegray.com/2015/05/perfection-isnt-perfect-parker-says.html

  12. As The Pythons say: "And now for something completely different."

    Tim Hanni is one of the first two resident Americans to successfully complete the examination and earn the title Master of Wine.

    Moments ago I serendipitously came across his puckish post on the one hundred-point scale (OHPS).

    "The Guild of Wine Depreciators (GOWD):
    A humorous look at the only TRUE 100-point system!"

    URL: http://timhanni.homestead.com/Guild_of_Wine_Depreciators.pdf

    (Aside: We have a name for a wine that scores 50 points or less: "water.")

  13. For those who suffer from Fear of Missing Out, let the chase after Parker or Wine Spectator or Vinous "points" begin.

    Time to lace up your track shoes:

    https://i.pinimg.com/474x/3c/e4/16/3ce4160a93d3e1a5b0efdeb9696a9c9a--gary-larson-short-stories.jpg
