Monday, December 4, 2017

California cabernets do not get the same quality scores at different tastings

We are commonly told that most wines are drunk within a very few days of purchase. On the other hand, it is a commonly held belief among connoisseurs that many wines are likely to improve with a bit of bottle age, especially red wines. Counter-balancing the latter idea, it is easy to demonstrate that the perception of wine quality depends on the people concerned and the circumstances under which the wines are tasted.

Therefore, I thought that it might be interesting to look at some repeated tastings of the same wines under circumstances where they are formally evaluated under roughly the same conditions. Do these wines get the same quality scores at the different tastings?

To look at this, I will use some of the data from the tastings of the Vintners Club, based in San Francisco. The results of the early club tastings are reported in the book Vintners Club: Fourteen Years of Wine Tastings 1973-1987 (edited by Mary-Ellen McNeil-Draper, 1988); and I have used this valuable resource in several previous blog posts (eg. Should we assess wines by quality points or rank order of preference?).

For each wine tasted, the book provides the average of the UCDavis points (out of 20) assigned by the group of tasters present at that meeting. The Vintners Club has "always kept to the Davis point system" for its tastings and, therefore, averaging these scores is mathematically valid, as is comparing them across tastings. Many of the wines were tasted at more than one meeting, although each meeting had its own theme. For our purposes here, the largest dataset is provided by the tastings involving California cabernet wines, which I will therefore use for my exploration.

In the book, there were 170 different California cabernets that the Club tasted more than once, sometimes up to five years apart. Most of these were tasted only twice, but some were tasted up to four times over four years. Of these 170 wines, only eight wines produced the same average score on their first two tasting occasions. Of the rest, 63 wines (37%) produced a lower score on the second occasion, and 99 (58%) produced a higher average score. Perhaps we might tentatively conclude that California cabernet wines do tend to increase in quality with a bit of time in bottle?
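
As a rough check on the arithmetic, and on how tentative that conclusion should be, here is a minimal Python sketch. The counts are the ones quoted above. The sign test is my own illustrative addition, not part of the original analysis, and it assumes that the wines are independent of each other, which may not hold when several wines share a tasting panel.

```python
from math import comb

# Counts quoted in the text for the 170 wines tasted at least twice.
same, lower, higher = 8, 63, 99
total = same + lower + higher

for label, count in [("same", same), ("lower", lower), ("higher", higher)]:
    print(f"{label:>6}: {count:3d} wines ({count / total:.0%})")


def sign_test_p(k: int, n: int) -> float:
    """Two-sided exact sign test: the probability, under a fair coin,
    of a split at least as lopsided as k out of n."""
    k = max(k, n - k)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Ties (identical scores) are dropped, as is usual for a sign test.
# ASSUMPTION: each wine counts as an independent trial.
p_value = sign_test_p(higher, lower + higher)
print(f"sign test p-value: {p_value:.4f}")
```

A small p-value here would only say that increases outnumber decreases by more than chance alone would suggest. It says nothing about whether the cause is the wine or the tasters.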

However, it is instructive to look at those 137 wines that were re-tasted within one year of their first tasting. This will give us some idea of the repeatability of wine quality scores, as we should not really be expecting California cabernet wines to change too much within their first year in the bottle. Any differences in scores are therefore likely to reflect the tasting group rather than the wine itself (unless there is much bottle variation, or the wines mature very rapidly).

The data are shown in the first graph, with the difference between the two tasting dates shown horizontally, and the difference between the two average UCDavis scores shown vertically (ie. second tasting score minus first tasting score). Each point represents one wine, tasted twice within a year.

Difference in quality scores for wines re-tasted within 1 year
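
For readers who want to build the same kind of graph from their own tasting notes, the two plotted quantities can be sketched as follows. The records below are invented purely for illustration (they are not taken from the Vintners Club book), and the function name `retest_points` is hypothetical.

```python
from datetime import date

# HYPOTHETICAL records, invented for illustration only:
# (wine, first tasting date, first mean score, second tasting date, second mean score)
tastings = [
    ("Wine A", date(1975, 3, 6), 15.2, date(1975, 3, 20), 14.1),
    ("Wine B", date(1976, 1, 15), 14.8, date(1976, 11, 4), 14.8),
    ("Wine C", date(1977, 5, 12), 13.9, date(1979, 6, 7), 15.0),
]


def retest_points(records, max_days=365):
    """Return (wine, days between tastings, second score minus first score)
    for every wine re-tasted within max_days."""
    points = []
    for wine, d1, s1, d2, s2 in records:
        gap = (d2 - d1).days
        if 0 <= gap <= max_days:
            points.append((wine, gap, round(s2 - s1, 2)))
    return points


points = retest_points(tastings)
for wine, gap, diff in points:
    print(f"{wine}: {gap} days apart, score change {diff:+.1f}")
```

The third wine is dropped because its two tastings are more than a year apart; the gap in days goes on the horizontal axis and the score change on the vertical axis.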

Clearly, there is considerable variability in the quality scores between tastings (the scores are spread out vertically over several quality points). Moreover, there is not much pattern to this variability — even after only a few days the scores can differ by more than 1 point; and even after a year they can still be identical. Most of the wines (70%) produced scores within +/– 1 point at the two tastings.

Notably, however, there were more decreases in score than there were increases between the two tastings. Only eight wines produced an increase in score of more than 1 point, while 33 wines (24%) produced a decrease in score of more than 1 point, and four of these actually decreased by more than 2 points. (NB: some of the points on the graph sit on top of each other.) I was not expecting to see such a strong pattern.
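
The counts quoted for the 137 wines can be cross-checked with a few lines of Python. This is only a sketch: the first part uses the numbers given in the text, while the `tally` helper and its example differences are hypothetical, included to show how such a tally could be produced from raw score changes.

```python
# Counts quoted in the text for the 137 wines re-tasted within one year.
total = 137
up_more_than_1 = 8
down_more_than_1 = 33

within_1 = total - up_more_than_1 - down_more_than_1
print(f"within +/-1 point : {within_1} wines ({within_1 / total:.0%})")
print(f"down by > 1 point : {down_more_than_1} wines ({down_more_than_1 / total:.0%})")
print(f"up by > 1 point   : {up_more_than_1} wines ({up_more_than_1 / total:.0%})")


def tally(diffs, threshold=1.0):
    """Count score changes above +threshold, below -threshold, and in between."""
    up = sum(d > threshold for d in diffs)
    down = sum(d < -threshold for d in diffs)
    return {"up": up, "down": down, "within": len(diffs) - up - down}


# HYPOTHETICAL score differences (second minus first), for illustration only.
example = tally([-2.3, -1.4, -0.6, 0.0, 0.4, 1.2])
print(example)
```

The two groups outside the 1-point band account for 41 wines, which leaves 96 of the 137 within it.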

A so-called Difference/Average plot can sometimes be informative, and this is shown in the next graph. This shows the same data as above, but this time the horizontal axis represents the average of the two quality scores for each wine (rather than representing time).

Quality scores for wines re-tasted within 1 year
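
The coordinates for a Difference/Average plot are simple to compute, as the following sketch shows. The score pairs are invented for illustration, not taken from the book. The "limits of agreement" (the mean difference plus or minus 1.96 standard deviations) are a common convention for this type of plot, and are my addition rather than something calculated for the graph above.

```python
from statistics import mean, stdev

# HYPOTHETICAL (first score, second score) pairs, invented for illustration.
pairs = [(15.2, 14.1), (14.8, 14.8), (13.6, 12.9), (16.4, 16.6), (15.0, 15.7)]


def difference_average(pairs):
    """Return (average of the two scores, second minus first) for each wine."""
    return [((a + b) / 2, b - a) for a, b in pairs]


points = difference_average(pairs)
diffs = [d for _, d in points]

bias = mean(diffs)      # mean shift between the two tastings
spread = stdev(diffs)   # sample standard deviation of the differences
# Conventional limits of agreement: bias +/- 1.96 standard deviations.
lower, upper = bias - 1.96 * spread, bias + 1.96 * spread

for avg, diff in points:
    print(f"average {avg:5.2f}  difference {diff:+.2f}")
print(f"bias {bias:+.2f}, limits of agreement {lower:+.2f} to {upper:+.2f}")
```

Each (average, difference) pair is one point on the graph; a bias well away from zero would indicate a systematic shift between the first and second tastings.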

This graph does not reveal much in the way of tell-tale patterns, which is unusual. However, we might anticipate that high-scoring wines will get more consistent quality scores, and this appears to be so for those few wines scoring >16 points. Furthermore, wines scoring <14 points do not do well at their second tasting.

Finally, we can look at those six California cabernets that were each tasted on four separate occasions. The final graph shows the time-course (horizontally) of their score (vertically), with each wine represented by a single line (as labeled).

Quality scores of wines tasted four times

Note that only one wine (from Stag's Leap) consistently increased in assessed quality over the years, while two other wines (from Beaulieu, and Robert Mondavi) consistently decreased. The remaining three wines had more erratic patterns. These differences may simply reflect random variation, and so we shouldn't read too much into this small sample size. Nevertheless, we do not see the hoped-for general increase in assessed quality of California cabernets over their first few years in bottle.
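
To make "consistently increased" and "consistently decreased" concrete, here is a small sketch that classifies a time-ordered series of scores. The function `trend` and the example scores are hypothetical, not the Club's data. The last few lines count how many of the possible orderings of four distinct scores are monotonic, which gives a feel for how often such a pattern could arise by chance alone.

```python
from itertools import permutations


def trend(scores):
    """Classify a time-ordered list of scores as 'increasing',
    'decreasing', 'constant' or 'erratic'."""
    steps = [b - a for a, b in zip(scores, scores[1:])]
    if all(s == 0 for s in steps):
        return "constant"
    if all(s >= 0 for s in steps):
        return "increasing"
    if all(s <= 0 for s in steps):
        return "decreasing"
    return "erratic"


# HYPOTHETICAL scores, invented for illustration only.
print(trend([14.2, 14.9, 15.3, 16.0]))  # increasing
print(trend([16.1, 15.4, 15.0, 14.2]))  # decreasing
print(trend([15.0, 16.2, 14.8, 15.5]))  # erratic

# Of the 24 possible orderings of four distinct scores, how many are monotonic?
orders = list(permutations([1, 2, 3, 4]))
monotonic = sum(trend(list(p)) != "erratic" for p in orders)
print(f"{monotonic} of {len(orders)} orderings are monotonic")  # 2 of 24
```

Note that this treats every ordering as equally likely, which is a simplifying assumption; real scores from adjacent tastings may well be correlated.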

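So, do California cabernets get the same quality scores at different tastings? In general, yes, at least to within a point: 70% of the wines re-tasted within a year scored within +/– 1 point of their first result. However, a large number of them (24%) end up with notably lower scores if there is a second tasting within a year.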


  1. Dear Mr. Morrison,

    With all due respect, you should know that the following statement from your wonderful paper is not entirely correct, “as we should not really be expecting California cabernet wines to change too much within their first year in the bottle.”

    We were new to winemaking and our first vintage of Napa Beckstoffer George 3rd Cabernet Sauvignon (2009) was “huge” in the barrel. We were excited to taste it from the bottle after 18 months in 100% new French oak “medium plus” charred barrels and then 1 of 4 barrels racked into another new barrel. About a week or so after bottling we opened our first bottle, prepared to make a great toast.
    We were devastated. My wife and I looked at each other with wide eyes. Our baby was now lifeless and had lost its huge mid-palate “fullness” and richness. The layers of flavors were also gone.

    I called our custom crush plant’s general manager and unhappily asked what had happened in the bottling process to destroy our wine. He laughed at me and said, “Big California Cabs and Pinots get bottle shock when they go through the filtering and fining process immediately prior to bottling. The bigger they are the more they are impacted. Over time the flavors you tasted prior to bottling will come back.”

    We tasted a bottle each month afterwards for the next year or so. We were amazed how much it changed from month to month. Each month the wine was “bigger”. After over 18 months in the bottle we submitted our 2010 Beckstoffer to Wine Enthusiast, from which the wine received a 95-rating.

    We have since discovered that our Chardonnays do not experience as much bottle shock and recover much more quickly. In our experience Pinot Noir wines are not as impacted, although more so than Chardonnay.

    So, depending upon the style of the Cabernet Sauvignon wines in the study, it’s likely that they actually did change significantly over the course of their first year in the bottle.

    I’m sure that there are wine scientists who can talk about the esters being muted somehow. Here’s a quote from Wikipedia: “Bottle-shock or Bottle-sickness is a temporary condition of wine characterized by muted or disjointed fruit flavors. It often occurs immediately after bottling or when wines (usually fragile wines) are given an additional dose of sulfur (in the form of sulfur dioxide or sulfite solution). After a few weeks, the condition usually disappears.”

    We look back on that naive temporarily-devastating moment and have a good laugh at ourselves. We do not release our Beckstoffer wines until they have been in the bottle for 18 months. Our Sta. Rita Hills Pinot Noir is released after 12 months. For what it is worth, we have learned that a year in the barrel is sufficient for both the Cabs and Pinots, but you need that respective time in the bottle to get the wine close to its potential.

    Skip Coo[****]

    ***This comment was meant to be educational to the author and his readers, not as a publicity piece for our wine, of which we produce negligible amounts.

    1. Dear Skip

      Thanks for your detailed comments, which I certainly take in the spirit in which they were given.

      Tasting wines through time is fun as well as educational. It is a pity that so few wine lovers get to regularly try wines during the early part of their development.

      Both the data presented here, and that in Cellar Tracker, seem to indicate either large bottle variation or relatively rapid changes through time. You are highlighting the latter. Sadly, this seems to make professional scores, which are usually based on wines tasted once only, shortly after release, rather uninformative.



  2. Excellent article. The Vintners Club is the best tasting group I know of. I wish they had included the Ridge Monte Bello in the chart because it is the longest-lived American wine I know (see Jancis Robinson's Vintage Timecharts which you can buy CHEAP on Amazon).

    I just published my latest blog post on Monday, on The Judgment of Paris Revisited. It is interesting that the Vintners Club did the first tasting afterwards, and subsequent tastings all showed declines in both Bordeaux and Burgundies.
    I made up tables showing everything you need to know that is available and you may be surprised.

    1. I have already published a blog post on the data from the Judgment of Paris:
      I have also discussed why the data for the white wines are missing:
      I have one more post planned where I will show exactly why, in this case, it is mathematical nonsense to pool the scores from the individual tasters, and try to create a rank ordering.