Monday, 24 July 2017

Wine tastings: the winning wine is often the least-worst wine

At organized wine tastings, the participants often finish by putting the wines in some sort of consensus quality order, from the wine most-preferred by the tasting group to the least-preferred. This is especially true of wine competitions, of course, but trade and home tastings are often organized this way, as well.

The question is: how do we go about deciding upon a winning wine? Perhaps the simplest way is for each person to rank the wines, and then to find a consensus ranking for the group. This is not necessarily as straightforward as it might seem.


To illustrate this idea, I will look at some data involving two separate blind tastings, in late 1995, of California cabernets (including blends) from the 1992 vintage. The first tasting had 18 wines and 17 tasters, and the second had 16 wines and 16 tasters. In both cases the tasters were asked, at the end of the tasting, to put the wines in their order of preference (ie. a rank order, ties allowed).

The first tasting produced results with a clear "winner", no matter how this is defined. The first graph shows how many of the 17 tasters ranked each wine in first place (vertically) compared to how often that wine was ranked in the top three places (horizontally). Each point represents one of the 18 wines.


Clearly, 15 of the 18 wines appeared in the top 3 ranks at least once, so that only 3 of the wines did not particularly impress anybody. Moreover, 6 of the wines got ranked in first place by at least one of the tasters — that is, one-third of the wines stood out to at least someone. However, by consensus, one of the wines (from Screaming Eagle, as it turns out) stood out head and shoulders above the others, and can be declared the "winner".

However, this situation might be quite rare. Indeed, the second tasting seems to be more typical. The next graph shows how many of the 16 tasters ranked each wine in first place (vertically) compared to how often that wine was ranked in the top five places (horizontally). Each point represents one of the 16 wines.


In this case, the tasters' preferences are more evenly spread among the wines. For example, every wine was ranked in the top 3 at least once, and in the top 4 at least twice, so that each of the wines was deemed worthy of recognition by at least one person. Furthermore, 10 of the 16 wines got ranked in first place by at least one of the tasters — that is, nearly two-thirds of the wines stood out to at least someone.

One of these wines, the Silver Oak (Napa Valley) cabernet, looks like it could be the winner, since it was ranked first 3 times and in the top five 7 times. However, the Flora Springs (Rutherford Reserve) wine appeared in the top five 10 times, even though it was ranked first only 2 times; so it is also a contender. Indeed, if we take all of the 16 ranks into account (not just the top few) then the latter wine is actually the "winner", and is shown in pink in the graph. Its worst ranking was tenth, so that no-one disliked it, whereas the Silver Oak wine was ranked last by 2 of the tasters.

We can conclude from this that being ranked first by a lot of people will not necessarily make a wine the top-ranked wine of the evening. "Winning" the tasting seems to be more about being the least-worst wine! That is, winning is as much about not being last for any taster as it is about being first.

This situation is not necessarily unusual. For example, on my other blog I have discussed the 10-yearly movie polls conducted by Sight & Sound magazine. In the 2012 poll Alfred Hitchcock's film Vertigo was ranked top, displacing Citizen Kane for the first time in the 50-year history of the polls; and yet, 77% of critics polled did not even list this film in their personal top 10. Nevertheless, more critics (23%) did put Vertigo on their top-10 list than did so for any other film, and so this gets Vertigo the top spot overall. From these data, we cannot conclude that Vertigo is "the best movie of all time", but merely that it is chosen more often than the other films (albeit by less than one-quarter of the people). Preferences at wine tastings seem to follow this same principle.

Finally, we can compare the seven wines that were common to the two tastings discussed above. Did these wines appear in the same rank order at both tastings?

In this case, we can calculate a consensus rank for each tasting by summing points across the participants, awarding 3 points for a first-place ranking, 2 points for second, and 1 point for third. The result of this calculation is shown in the third graph, where each point represents one of the seven wines, and the axes indicate the ranking for the two tastings.
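For anyone who wants to replicate this sort of calculation, here is a minimal sketch in Python; the rankings below are invented for illustration, not the actual tasting data.

```python
from collections import defaultdict

# Invented example: each taster's ranking is a list of wine names,
# ordered from most-preferred to least-preferred.
rankings = [
    ["Wine A", "Wine C", "Wine B", "Wine D"],
    ["Wine C", "Wine A", "Wine D", "Wine B"],
    ["Wine A", "Wine B", "Wine C", "Wine D"],
]

POINTS = {0: 3, 1: 2, 2: 1}  # 3 points for 1st place, 2 for 2nd, 1 for 3rd

totals = defaultdict(int)
for ranking in rankings:
    for position, wine in enumerate(ranking):
        totals[wine] += POINTS.get(position, 0)  # wines ranked below 3rd score nothing

# Consensus order: highest points first
for wine, points in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(wine, points)
```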


The two groups of tasters agree on the bottom three wines in their rankings. However, they do not agree on the "winning" wine among these seven. More notably, they disagree quite strongly about the Silver Oak cabernet. In the second tasting this wine received 3 firsts and 2 thirds (from the 16 tasters), while in the first tasting it received 1 third ranking only (out of 17 people). The consensus ranking of this wine thus differs quite markedly between the tastings. This may reflect differences in the type of participants at the tastings, there being a broader range of wine expertise in the second tasting.

Monday, 17 July 2017

Yellow Tail and Casella Wines

Some weeks ago I posted a discussion of whether wine imports into the USA fit the proverbial "power law". I concluded that US wine imports in 2012, in terms of number of bottles sold, did, indeed, fit a Power Law. This included the best-selling imported wine, Yellow Tail, from Casella Wines, of Australia.

However, bottle sales are not the complete picture, since ultimately it is the dollar sales value that determines profitability. Statista reports (Sales of the leading table wine brands in the United States in 2016) that Yellow Tail US sales were worth $281 million in 2016, which ranks it at no. 5 overall, behind the domestic brands Barefoot ($662 million), Sutter Home ($358 million), Woodbridge ($333 million) and Franzia ($330 million). Moreover, in July 2016, The Drinks Business placed Yellow Tail at no. 6 in its list of the Top 10 biggest-selling wine brands in the world, based on sales in 2015.


It is interesting to evaluate just how profitable Yellow Tail has been for Casella Wines. This is a family-owned company founded in 1969 (see Casella Family Brands restructures to ensure family ownership), currently ranked fourth in Australia by total revenue but second by total wine production (see Winetitles Media). This makes the Casella family members seriously rich, and even in a "bad" year they are each paid millions of dollars by the company.

Being a registered company (ABN 96 060 745 315), the Casella Wines Pty Ltd accounts must be lodged with the Australian Securities and Investments Commission (the corporate regulator) at the end of each financial year (June 30). This next graph shows (in Australian $) the reported profit/loss for each financial year since the first US shipment of Yellow Tail in June 2001. (Note: the 2015-2016 financial report has apparently not yet been submitted.)

Casella Wines profit since launching the Yellow Tail wines

The economics of Yellow Tail rely almost entirely on the exchange rate between the Australian $ and the US $. The company is reported as being "comfortable" with the A$ trading up to US85¢, and "happy" with anything below US90¢, as the cost of making the wine (in Australia, in A$) is then more than compensated by the sales price (in US$, in the USA). When the brand was first launched, the Australian dollar was trading at around US57¢, and the wine thus made a tidy profit for the winery; and also for the distributor, Deutsch Family Wine and Spirits (see The Yellow Tail story: how two families turned Australia into America’s biggest wine brand).

However, Casella then suffered badly when the A$ began to improve in value over the next few years. The A$ reached parity with the US$ in July 2010; and this is the reason for the unprofitable years shown in the graph. The increased profit in 2010-2011 was apparently due to some successful currency hedging, rather than currency improvements.

Casella refused to change the bottle price of the Yellow Tail wines during the "bad times", stating that they did not want to risk losing their sales momentum by imposing a price hike. Instead, the company used its accumulated profits and, most importantly, re-negotiated its loans, in order to wait for a better exchange rate. They reported that every 1¢ movement in the currency equated to around A$2 million in higher sales revenue.
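To see why a single cent matters so much, here is a rough sketch of the currency arithmetic in Python; the case volume and the US$ price per case are invented numbers, used only to show the shape of the calculation.

```python
# Invented figures, purely to illustrate the exchange-rate arithmetic.
cases_sold = 8_000_000          # cases shipped to the USA per year
usd_revenue_per_case = 20.0     # hypothetical US$ received per case

usd_revenue = cases_sold * usd_revenue_per_case

for aud_in_us_cents in (57, 85, 90, 100):   # A$ trading at US57c, 85c, 90c, parity
    rate = aud_in_us_cents / 100.0          # US$ per A$
    aud_revenue = usd_revenue / rate        # convert the US$ receipts into A$
    print(f"A$ at US{aud_in_us_cents}c -> revenue A${aud_revenue / 1e6:,.0f} million")
```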

However, realizing the economic risks of relying on currency exchange-rates for profits, Casella embarked on a premiumization strategy in 2014. The company has since bought a number of vineyards in premium Australian wine-making regions, mainly in South Australia, as well as acquiring some top-notch wine companies, including Peter Lehmann Wines, Brands Laira, and Morris Wines. This strategy is continuing to this day (see Bloomberg).

Finally, for those of you who might be concerned about these things, while the winery does have some vegan wines, the three Casella brothers are reported to all be keen shooters, one of them has actually owned an ammunition factory, and the winery is the largest corporate sponsor of the Sporting Shooters Association. Moreover, Marcello Casella has made a number of court appearances concerning his ammunition factory (Bronze Wing Ammunition factory to remain closed after WorkCover court win) and alleged involvement in drug running (see NSW South Coast drug kingpin Luigi Fato jailed for 18 years).

The recent embarrassment at the Super Bowl is best left undiscussed!

Monday, 10 July 2017

Napa cabernet grapes are greatly over-priced, even for Napa

There have been a number of recent comments on the Web about the increasing cost of cabernet sauvignon grapes from the Napa viticultural district (eg. Napa Cabernet prices at worryingly high levels). These comments are based on the outrageously high prices of those grapes compared to similar grapes from elsewhere in California. On the other hand, some people seem to accept these prices, based on the idea that Napa is the premier cabernet region in the USA.

However, it is easy to show that the Napa cabernet grape prices are way out of line even given Napa's reputation for high-quality cabernet wines.


The data I will use to show this come from the AAWE Facebook page: Average price of cabernet sauvignon grapes in California 2016. This shows the prices from last year's Grape Crush Report for each of 17 Grape Pricing Districts and Counties in California. The idea here is to use these data to derive an "expected" price for the Napa district based on the prices in the other 16 districts, so that we can compare this to the actual Napa price.

As for my previous modeling of prices (eg. The relationship of wine quality to price), the best-fitting economic model is an Exponential model, in this case relating the grape prices to the rank order of those prices. This is shown in the first graph. The graph is plotted with the logarithm of the prices, which means that the Exponential model can be represented by a straight line. Only the top five ranked districts are labeled.

Prices of California cabernet sauvignon grapes in 2016

As shown, the exponential model accounts for 98% of the variation in price across the 16 ranked grape districts, which means that this economic model fits the data extremely well. For example, if the Sonoma & Marin district really does produce better cabernet grapes than the Mendocino district, then the model indicates that their grapes are priced appropriately.

Clearly the Napa district does not fit this economic model at all. The model (based on the other 16 districts) predicts that the average price of cabernet grapes in 2016 should have been $3,409 per ton for the top ranked district. The Napa grapes, on the other hand, actually cost an average of $6,846, which is almost precisely double the expected price. This is what we mean when we say that something is "completely out of line"!
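For those who want to see how such an expected price can be derived, here is a sketch of the fitting procedure in Python; the district prices are placeholders rather than the real Grape Crush Report figures, which you would substitute in.

```python
import numpy as np

# Placeholder prices (US$ per ton) for the 16 non-Napa districts, in rank
# order; Napa would occupy rank 1, so these are ranks 2 to 17.
prices = np.array([2900, 2300, 1900, 1600, 1300, 1100, 950, 820,
                   700, 610, 530, 460, 400, 350, 300, 260], dtype=float)
ranks = np.arange(2, 18)

# Exponential model: price = a * exp(b * rank), i.e. log(price) is linear in rank
b, log_a = np.polyfit(ranks, np.log(prices), 1)

predicted_rank1 = np.exp(log_a + b * 1)      # "expected" price for the top district
napa_actual = 6846.0
print(f"expected top-rank price: ${predicted_rank1:,.0f} per ton")
print(f"Napa actual / expected: {napa_actual / predicted_rank1:.2f}")
```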

In conclusion, 16/17 districts have what appear to be fair average prices for their cabernet sauvignon grapes, given the current rank ordering of their apparent quality. Only one district is massively over-pricing itself. Even given the claim that Napa produces the highest quality cabernet wines in California, the prices of the grapes are much higher than we expect them to be. Something really has gotten out of hand.

Part of the issue here is the identification of prime vineyard land, for whose grapes higher prices are charged (see As the Grand Crus are identified, prices will go even higher). The obvious example in Napa is the To Kalon vineyard (see The true story of To-Kalon vineyard). Here, the Beckstoffer "pricing formula calls for the price of a ton of To Kalon Cabernet grapes to equal 100 times the current retail price of a bottle" of wine made from those grapes (The most powerful grower in Napa). This is a long-standing rule of thumb, and it explains why your average Napa cabernet tends to cost at least $70 per bottle instead of $35.

Anyway, those people who are recommending that we look to Sonoma for value-for-money cabernet wines seem to be offering good advice.

Vineyard area

While we are on the topic of California cabernets, we can also briefly look at the vineyard area of the grapes. I have noted before that concern has been expressed about the potential domination of Napa by this grape variety (see Napa versus Bordeaux red-wine prices), but here we are looking at California as a whole.

A couple of other AAWE Facebook pages provide us with the area data for the most commonly planted red (Top 25 red grape varieties in California 2015) and white (White wine grapes in California 2015) grape varieties in 2015. I have plotted these data in the next two graphs. Note that the graphs are plotted with the logarithm of both axes. Only the top four ranked varieties are labeled.

Area of red grape varieties in California in 2015
Area of white grape varieties in California in 2015

On the two graphs I have also shown a Power Law model, as explained in previous posts (eg. Do sales by US wine companies fit the proverbial "power law"?). This Power model is represented by a straight line on the log-log graphs. As shown, in both cases the model fits the data extremely well (accounting for 97% and 98% of the variation, respectively), but only if we exclude the three most widespread grape varieties. Note, incidentally, that there is slightly more chardonnay state-wide than there is cabernet sauvignon.

The model thus implies that there is a practical limit to how much area can be devoted readily to any one grape variety — we cannot simply keep increasing the area indefinitely, as implied by the expectation from the simple Power model. The data shown suggest that this limit appears to be c. 40,000 acres, at least for red grape varieties (ie. increase in vineyard area slows once this limit is reached).
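The fitting procedure is easy to sketch in Python; the acreages below are placeholders rather than the real 2015 figures, but they show how excluding the top-ranked varieties and then extrapolating the model back to them reveals the shortfall relative to the simple Power Law.

```python
import numpy as np

# Placeholder planted areas (acres), in rank order; the real figures are in
# the AAWE compilation of the 2015 California acreage data.
areas = np.array([78000, 75000, 45000, 38000, 30000, 24000, 19000, 15000,
                  12000, 10000, 8300, 7000, 5900, 5000, 4300, 3700], dtype=float)
ranks = np.arange(1, len(areas) + 1)

# Fit the Power Law (a straight line on log-log axes) to ranks 4 onwards only
fit_ranks, fit_areas = ranks[3:], areas[3:]
slope, intercept = np.polyfit(np.log(fit_ranks), np.log(fit_areas), 1)

r2 = np.corrcoef(np.log(fit_ranks), np.log(fit_areas))[0, 1] ** 2
print(f"variation explained (ranks 4 and below): {r2:.0%}")

# Extrapolate the model back to the three most widespread varieties
for r in (1, 2, 3):
    expected = np.exp(intercept + slope * np.log(r))
    print(f"rank {r}: actual {areas[r - 1]:,.0f} acres, model predicts {expected:,.0f}")
```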

Both chardonnay and cabernet sauvignon have twice this "limit" area, which emphasizes their importance in the California grape-growing economy. However, the Power-law model indicates that we cannot yet claim that the domination by these grapes is anything unexpected.

Monday, 3 July 2017

Awarding 90 quality points instead of 89

I have written before about the over-representation, by most wine commentators, of certain wine-quality scores compared to others. For example, I have discussed this for certain wine professionals (Biases in wine quality scores) and for certain semi-professionals (Are there biases in wine quality scores from semi-professionals?); and I have discussed it for the pooled scores from many amateurs (Are there biases in community wine-quality scores?). It still remains for me to analyze some data for the pooled scores of professionals as a group. This is what I will do here.


The data I will look at is the compilation provided by Suneal Chaudhary and Jeff Siegel in their report entitled Expert Scores and Red Wine Bias: a Visual Exploration of a Large Dataset. I have discussed these data in a previous post (What's all this fuss about red versus white wine quality scores?). The data are described this way:
We obtained 14,885 white wine scores and 46,924 red wine scores dating from the 1970s that appeared in the major wine magazines. They were given to us on the condition of anonymity. The scores do not include every wine that the magazines reviewed, so the data may not be complete, and the data was not originally collected with any goal of being a representative sample.
This is as big a compilation of wine scores as is readily available, and presumably represents a wide range of professional wine commentators. It is likely to represent widespread patterns of wine-quality scores among the critics, even today.

In my previous analyses, and those of Alex Hunt, who has also commented on this (What's in a number? Part the second), the most obvious and widespread bias when assigning quality scores on a 100-point scale is the over-representation of the score 90 and under-representation of 89. That is, the critics are more likely to award 90 than 89, when given a choice between the two scores. A similar thing often happens for the score 100 versus 99. In an unbiased world, some of the "90" scores should actually have been 89, and some of the "100" scores should actually have been 99. However, assigning wine-quality scores is not an unbiased procedure — wine assessors often have subconscious biases about what scores to assign.

It would be interesting to estimate just how many scores are involved, as this would quantify the magnitude of these two biases. Since we have at hand a dataset that represents a wide range of commentators, analyzing this particular set would tell us about general biases, not just those specific to each individual commentator.

Estimating the biases

As in my earlier posts, the analysis involves frequency distributions. The first two graphs show the quality-score data for the red wines and the white wines, arranged as two frequency distributions. The height of each vertical bar in the graphs represents the proportion of wines receiving the score indicated.

Frequency histogram of red wine scores

Frequency histogram of white wine scores

The biases involving 90 versus 89 are clear in both graphs; and the bias involving 100 is clear in the graph for the red wines (we all know that white wines usually do not get scores as high as for red wines — see What's all this fuss about red versus white wine quality scores?).

For these data, the "expectation" is that, in an unbiased world, the quality scores would show a relatively smooth frequency distribution, rather than having dips and spikes in the frequency at certain score values (such as 90 or 100). Mathematically, the expected scores would come from an "expected frequency distribution", also known as a probability distribution (see Wikipedia).

In my earlier post (Biases in wine quality scores), I used a Weibull distribution (see Wikipedia) as being a suitable probability distribution for wine-score data. In that post I also described how to use this as an expectation to estimate the degree of bias in our red- and white-wine frequency distributions.

The resulting frequency distributions are shown in the next two graphs. In these graphs, the blue bars represent the (possibly biased) scores from the critics, and the maroon bars are the unbiased expectations (from the model). Note that the mathematical expectations both form nice smooth distributions, with no dips or spikes. Those quality scores where the heights of the paired bars differ greatly are the ones where bias is indicated.

Frequency histogram of modeled red wine scores

Frequency histogram of modeled white wine scores

We can now estimate the degree of bias by comparing the observed scores to their expectations. For the red wines, a score of "90" occurs 1.53 times as often as expected, and for the white wines 1.44 times as often. So, we can now say that there is a consistent bias among the critics, whereby a score of "90" occurs c.50% more often than it should. This is not a small bias!
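For those who want to try this for themselves, here is a rough sketch of the procedure in Python. The scores are simulated (and unbiased, so the ratio should come out near 1, unlike the 1.53 found for the real data), and the choice of how to shift the scores onto the Weibull's support is my own assumption, since the exact fitting details are not spelled out above.

```python
import numpy as np
from scipy import stats

# Simulated score list standing in for the 46,924 red-wine scores.
rng = np.random.default_rng(1)
scores = np.clip(np.round(rng.normal(90, 3, 5000)), 80, 100).astype(int)

# Fit a Weibull distribution to the scores, shifted so the support starts at 0
# (one plausible choice; the original analysis may have done this differently).
shift = 79.5
params = stats.weibull_min.fit(scores - shift, floc=0)

def expected_freq(score):
    """Model probability that a wine receives exactly this integer score."""
    lo, hi = score - 0.5 - shift, score + 0.5 - shift
    return stats.weibull_min.cdf(hi, *params) - stats.weibull_min.cdf(lo, *params)

observed_90 = np.mean(scores == 90)
print(f"score 90: observed / expected = {observed_90 / expected_freq(90):.2f}")
```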

For a score of "100" we can only refer to the red-wine data. These data indicate that this score occurs more than 8 times as often as expected from the model. This is what people are referring to when they talk about "score inflation" — the increasing presence of 100-point scores. It might therefore be an interesting future analysis to see whether we can estimate any change in 100-point bias through recent time, and thereby quantify this phenomenon.

Finally, having produced unbiased expectations for the red and white wines, we can now compare their average scores. These are c.91.7 and c.90.3 for the reds and whites, respectively. That is, on average, red wines get 1⅓ more points than do the whites. This is much less of a difference than has been claimed by some wine commentators.

Conclusion

Personal wine-score biases are easy to demonstrate for individual commentators, whether professional or semi-professional. We now know that there are also general biases shared among commentators, whether they are professional or amateur. The most obvious of these is a preference for over-using a score of 90 points, instead of 89 points. I have shown here that one in every three 90-point wines from the professional critics is actually an 89-point wine with an inflated score. Moreover, the majority of the 100-point wines of the world are actually 99-point wines that are receiving a bit of emotional support from the critics.

Monday, 26 June 2017

What happened to Decanter when it changed its points scoring scheme

In a previous post (How many wine-quality scales are there?), I noted that at the end of June 2012 Decanter magazine changed from using a 20-point ratings scale to a 100-point scale for its wine reviews (see New Decanter panel tasting system). In order to do this, they had to convert their old scores to the new scores (see How to convert Decanter wine scores and ratings to and from the 100 point scale).

It turns out that there were some unexpected consequences associated with making this change, which was therefore not as simple as it might seem. I think that this issue has not been appreciated by the wine public, or probably even by the people at Decanter; and so I will point out some of the consequences here.


We do expect that a 20-point scale and a 100-point scale should be inter-changeable in some simple way, when assessing wine quality. However, there is actually no intrinsic reason why this should be so. Indeed, Wendy Parr, James Green and Geoffrey White (Revue Européenne de Psychologie Appliquée 56:231-238. 2006) actually tested this idea, by asking wine assessors to use both a 20-point scale and a 100-point scale to evaluate the same set of wines. Fortunately, they found no large differences between the use of the two schemes, for the wines they tested.

This makes it quite interesting that when Decanter swapped between its two scoring systems it did seem to change the way it evaluated wines. This fact was discovered by Jean-Marie Cardebat and Emmanuel Paroissien (American Association of Wine Economists Working Paper No. 180), in 2015, when they looked at the scores for the red wines of Bordeaux.

Cardebat & Paroissien looked at how similar the quality scores were for a wide range of critics, and then compared them pairwise using correlation analysis. If all of the scores between any given pair of critics were closely related then their correlation value would be 1, and if they were completely unrelated then the value would be 0; otherwise, the values vary somewhere in between these two extremes. Cardebat & Paroissien provide their results in Table 3 of their publication.

Of interest to us here, Cardebat & Paroissien treated the Decanter scores in two groups, one for the scores before June 2012, which used the old 20-point system, and one for the scores after that date, which used the new 100-point system. We can thus directly compare the Decanter scores to those of the other critics both before and after the change.
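The mechanics of that comparison are straightforward; here is a sketch in Python, using a handful of invented scores in place of the real Bordeaux dataset.

```python
import pandas as pd

# Invented scores for a few wines; the real data are the Bordeaux scores
# compiled by Cardebat & Paroissien.
df = pd.DataFrame({
    "date":     pd.to_datetime(["2010-04-01", "2011-04-01", "2012-04-01",
                                "2013-04-01", "2014-04-01", "2015-04-01"]),
    "decanter": [88, 92, 95, 90, 94, 97],
    "robinson": [87, 93, 94, 96, 89, 92],
    "galloni":  [89, 91, 96, 91, 95, 98],
})

cutoff = pd.Timestamp("2012-06-30")   # when Decanter switched scoring systems
for label, subset in (("before", df[df["date"] < cutoff]),
                      ("after",  df[df["date"] >= cutoff])):
    corr = subset[["decanter", "robinson", "galloni"]].corr()
    print(label)
    print(corr["decanter"].round(2))   # Decanter's correlation with each critic
```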

I have plotted the correlation values in the graph below. Each point represents the correlation between Decanter and a particular critic — four of the critics have their point labeled in the graph. The correlation before June 2012 is plotted horizontally, and the correlation after June 2012 is plotted vertically. If there was no change in the correlations at that date, then the points would all lie along the pink line.

Change in relationship to other critics when the scoring system was revised

For two of the critics (Jeff Leve and Jean-Marc Quarin), there was indeed no change at all, exactly as we would expect if the 20-point system and 100-point system are directly inter-changeable. For seven other critics the points are near the line rather than on it (Tim Atkin, Bettane & Desseauve, Jacques Dupont, René Gabriel, Neal Martin, La Revue du Vin de France, Wine Spectator), and this small difference we might expect by random chance (depending, for example, on which wines were included in the dataset).

For the next two critics (Robert Parker, James Suckling), the points seem to be getting a bit too far from the line. At this juncture, it is interesting to note that the majority of the points lie to the right of the line. This indicates that the correlations between Decanter and the other critics were greater before June 2012 than afterwards. That is, Decanter started disagreeing with the other critics to a greater extent after they adopted 100 points than before; and they started disagreeing with Parker and Suckling even more than the others.

However, what happens with the remaining two critics is quite unbelievable. In the case of Jancis Robinson, before June 2012 Decanter agreed quite well with her wine-quality evaluations (correlation = 0.63), although slightly less than for the other critics (range 0.63-0.75). But afterwards, the agreement between Robinson and Decanter plummeted (correlation = 0.36). The situation for Antonio Galloni is the reverse of this — the correlation value went up, instead (from 0.32 to 0.56). In the latter case, this may be an artifact of the data, because only 13 of Galloni's wine evaluations before June 2012 could be compared to those of Decanter (and so the estimate of 0.32 may be subject to great variation).

What has happened here? Barring errors in the data or analyses provided by Cardebat & Paroissien, this change is quite difficult to explain. Mind you, I have shown repeatedly that the wine-quality scores provided by Jancis Robinson are usually at variance with those of most other critics (see Poor correlation among critics' quality scores; and How large is between-critic variation in quality scores?), but this particular example does seem to be extreme.

For the Cardebat & Paroissien analyses, both Jancis Robinson and Antonio Galloni have the lowest average correlations with all of the other critics, with 0.46 and 0.45, respectively, compared to a range of 0.58-0.68 for the others. So, in this dataset there is a general disagreement between these two people and the other critics, and also a strong disagreement with each other (correlation = 0.17). It is thus not something that is unique to Decanter, but it is interesting that the situation changed so dramatically when Decanter swapped scoring schemes.

References

Jean-Marie Cardebat, Emmanuel Paroissien (2015) Reducing quality uncertainty for Bordeaux en primeur wines: a uniform wine score. American Association of Wine Economists Working Paper No. 180.

Wendy V. Parr, James A. Green, K. Geoffrey White (2006) Wine judging, context and New Zealand sauvignon blanc. Revue Européenne de Psychologie Appliquée 56:231-238.

Monday, 19 June 2017

Yellow Tail — wine imports into the USA do fit a "power law"

Some weeks ago I posted a discussion of whether sales by US wine companies fit the proverbial "power law". The Power Law is used to describe phenomena where large events are rare but small ones are quite common. I concluded that US wine sales in 2016 did, indeed, fit a Power Law, with the exception of the largest company, E&J Gallo Winery. To fit in with the rest of the wine companies, E&J Gallo should have sold c. 3.5 times as much wine as it actually did sell. Apparently, it is rather hard to dominate US domestic wine sales in the way predicted by a simple Power Law.


Power Laws are of interest because of their practical consequences. For example, the 80:20 Rule (or Pareto Principle) is one example of a Power Law, which says that for many events, roughly 80% of the effects come from 20% of the causes.

Power Laws are considered to be universal, and so there is no reason why they should not exist in the wine industry. One of the more obvious places that we might expect to find them is in wine sales — there are likely to be a few wines that sell very well and lots of smaller sales. As I showed in the earlier post, this appears to be generally true for domestic wine production in the USA; and so it is of interest to see whether it also applies to imported wines.

Yellow Tail and the Power Law

Currently, the biggest-selling imported wine in the USA is Yellow Tail (from Casella Wines, in Australia), with more than 8 million cases shipped to the US per year. This would place it at no. 9 in the current Wine Business Monthly top-30 list of wine companies in the USA. In July 2016, The Drinks Business placed Yellow Tail at no. 6 in its list of the Top 10 biggest-selling wine brands in the world, based on sales in 2015.

Unfortunately, I do not have a list of the sales of imported wine in the USA for any of the most recent years. However, in a presentation at the U.S. Beverage Alcohol Forum, which is part of the Wine & Spirits Wholesalers of America annual convention, Mike Ginley provided the US sales data for the top 25 imported table-wine brands in 2012. So, I will use this dataset for the analysis.

As I noted in the previous analysis, one special case of the Power Law is known as Zipf's Law, which refers to the "size" of each event relative to its rank order of size. This is what we are looking at here. For each wine brand, the "size" is the number of cases of wine sold during 2012, and the brands are listed in rank order of their sizes (largest to smallest). The standard way to evaluate the Zipf pattern is to plot the data with both axes of the graph converted to logarithms. Under these circumstances, the data should form a straight line.

Here is the graph of the 2012 sales data for the top 25 imported wine brands. Only the best-selling wine is labeled.

A Power Law fitted to the sales of wines imported to the USA

As you can see, all of the data lie roughly along a straight line, and thus do indeed fit a Power Law. That is what we would expect.

However, it is worth noting here that all of the wine brands do fit the same Power Law, including Yellow Tail. This is different from what we found for the domestic wines (where the no. 1 winery under-performed relative to the Power Law model). Indeed, the Power Law indicates that Yellow Tail actually sold 28% more cases than would be expected from the sales of the other wine brands. So, in 2012 Yellow Tail slightly out-performed the expectation from the mathematical model, whereas E&J Gallo greatly under-performed the expectation in 2016.
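Here is a sketch of that calculation in Python; the case sales are placeholder numbers standing in for the real 2012 figures.

```python
import numpy as np

# Placeholder case sales (millions), in rank order, standing in for the
# top 25 imported brands in Mike Ginley's 2012 data.
cases = np.array([8.5, 4.8, 3.9, 3.1, 2.6, 2.1, 1.8, 1.5, 1.3, 1.1,
                  0.95, 0.85, 0.75, 0.66, 0.58, 0.52, 0.47, 0.42, 0.38,
                  0.34, 0.31, 0.28, 0.26, 0.24, 0.22])
ranks = np.arange(1, len(cases) + 1)

# Zipf / Power Law: log(size) is a straight-line function of log(rank).
# Fit it to ranks 2-25, then see how the no. 1 brand compares to the fit.
slope, intercept = np.polyfit(np.log(ranks[1:]), np.log(cases[1:]), 1)
expected_rank1 = np.exp(intercept)            # log(rank) = 0 at rank 1
print(f"rank 1 actual / expected: {cases[0] / expected_rank1:.2f}")
```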

It is also worth noting the presence in the 2012 top-25 list of some of the best-selling wines from 30 years earlier. The data for 1980 and 1981 are provided in an article from the New York Times (Lambrusco rates high with U.S. consumers). The imported wine brands that have managed to hang on over the decades are: Riunite (no. 5 in 2012, but no. 1 back in 1980 & 1981), Folonari (12 now vs. 4 then), Bolla (18 vs. 3) and Cella (20 vs. 2). In 2012, these brands sold only 20-50% of their 1981 case sales, which is why they have dropped down the ranking.

Previous top-10 imported wine brands that have fallen by the wayside include: Zonin, Giacobazzi, Blue Nun, Mateus, Yago, and Lancers. Perhaps you remember some of them?

Monday, 12 June 2017

Why is wine often cheaper in Sweden than elsewhere?

In spite of considerable complaining by certain Swedes, a lot of wines are cheaper in Sweden than elsewhere in the European Union (EU), particularly European wines. Furthermore, Australian wines are sometimes cheaper in Sweden than they are in Australia; and occasionally even US wines can be cheaper than in the USA. This happens as a direct result of wine retail economics, and the fact that Sweden has a single government-owned retail chain for alcohol sales.

Not all wine is cheaper in Sweden, of course, but the ones I am interested in usually are cheaper; and so I thought that I would write about it.


The bottle shop / liquor store / off-licence (depending on your English idiom) is called Systembolaget (which translates as The System Company), and is wholly owned by and operated on behalf of the Swedish government. It has a monopoly on retail sales in Sweden, but not trade sales (for which there are several hundred importers), nor private imports from elsewhere within the EU.

Since I live in Sweden, I principally get my wine from Systembolaget, but I also get wine sent to me from elsewhere in the EU. I often read reviews written by people in the USA, and check up on their recommendations; and I am interested in Australian wine, since that is what I learned first. It is for these reasons that I am familiar with the prices of wines both inside and outside Sweden, and I can thus make direct comparisons of the prices of the same wine in several countries.

I therefore make the categorical statement that fine wine is cheaper in Sweden than most other places into which it is imported (see the example at the end of the post). But not cheap wine — that is often less expensive elsewhere.

Wine economics

Several people have looked at the economics of wine retail in the United Kingdom, but not so many in the USA. The latter is possibly because bottle prices can vary from state to state, due to differences in taxes, plus the economics of the three-tier distribution system. Economics in the USA is not always a simple thing!

So, as my example of the economics of wine retail sales, I will use the UK, because the situation is simpler. As far as I can tell, the basic economics are no different in most other places, although the actual percentages will vary somewhat. [Note: The UK government has recently announced an increase in excise duty on alcohol; and the Average price of bottle of wine in UK has reached a new high thanks to Brexit. Neither of these facts affects my analysis.]

The economic breakdown of the price of a bottle of wine in the UK has been dissected independently on several blogs. The Bibendum analysis has been updated yearly, and so I will use their data for March 2017.
Their analysis breaks down the bottle cost into these components: retailer margin, excise duty, value added tax (VAT), packaging, logistics, and the wine itself. They do this for bottles with four different retail prices.

In the first graph I have plotted the percentage of the UK final bottle price that goes to the retailer and to the winery. For comparison, £10 ≈ $12 ≈ 110 kronor.

Retailer and manufacturer margins for a bottle of wine in the UK

As you can see, the margins for the retailer and manufacturer increase as the bottle price increases — neither of them makes as much money on a cheap bottle of wine as they do on an expensive wine, both in straight money terms and as a percentage profit. Furthermore, the retailer is the one making the most money on wines less than about £15 ($20).

The same economics may not apply directly to large supermarket chains, which frequently market their own-label wines. In these cases, the relationship between the manufacturer and the retailer is blurred. This also applies in the USA, where it has been noted (Reverse Wine Snob, by Jon Thorsen. 2015):
Costco's average margin (per their financial filings) is about 12 percent. Costco has stated that the highest margin they will take on a non-Costco brand is 13 percent and they strive to keep it closer to 10 percent. On private label items (Kirkland Signature) they will go up to 15 percent margin, but of course the price is still lower than other brands because they cut out the middleman.
Sweden

We can now compare the UK economic model to that used by Systembolaget in Sweden. Their mark-up has been a fixed amount per bottle, which differs among products (beer/cider, wine, spirits), plus a fixed percentage. Up to 1 March 2017, the fixed amount for wine was 3.5 kr (£0.30) + 19%; from that date it has been 5.2 kr (£0.45) + 17.5%. I have plotted both of these models onto the next graph.
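A small sketch in Python makes the comparison concrete; it assumes the percentage is applied to the purchase price and ignores the excise duty, which is a simplification of the actual rules.

```python
def systembolaget_price(purchase_price_kr, fixed_kr, percent, vat=0.25):
    """Retail price under a 'fixed amount + percentage' mark-up, plus Swedish VAT.

    Assumes the percentage applies to the purchase price and that VAT is added
    on top of everything; excise duty is left out to keep the sketch simple.
    """
    margin = fixed_kr + percent * purchase_price_kr
    return (purchase_price_kr + margin) * (1 + vat)

for purchase in (40, 80, 160, 320):   # supplier price in kronor
    old = systembolaget_price(purchase, 3.5, 0.19)    # model up to 1 March 2017
    new = systembolaget_price(purchase, 5.2, 0.175)   # model from that date
    print(f"purchase {purchase:>3} kr: old model {old:6.1f} kr, new model {new:6.1f} kr")
```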

Retailer and manufacturer margins for a bottle of wine in the UK and Sweden

It is now easy to see why wine is cheaper in Sweden, except for the most inexpensive wines. If we define "good wine" as anything above £10 ($12), then Swedes are doing very well, indeed; and the more expensive the wine, the better off they are. The reason for this is quite straightforward — Systembolaget's stated goal is: "To minimize alcohol-related problems by selling alcohol in a responsible way, without profit motive." Needless to say, I am quite pleased with this situation, as a buyer of fine wines.

However, it is also easy to see why a lot of Swedes might complain. They are no different to wine drinkers anywhere else, and therefore a lot of wine purchases are at the inexpensive end of the market. For example, according to Systembolaget, in the first 3 months of this year 35% of wine sales were less than 80 kr (£7, $9) per bottle. At this price, wine in Sweden is not as cheap as elsewhere, and Swedes know it; and as you can see in the graph, it recently got noticeably more expensive, as well.

Systembolaget addresses this issue by virtue of being one of the largest alcohol retail chains in the world (reportedly third, behind Tesco, in the UK, and the Liquor Control Board of Ontario, in Canada). This position gives it a lot of bargaining power with the manufacturers and importers. In fact, Systembolaget puts a lot of the most inexpensive wines directly out to tender (as do their equivalents, ALKO, in Finland, and Vinmonopolet, in Norway) — you can see the current list of tenders here. (Note that not everyone is necessarily impressed with this idea.)

Finally, it is worth noting that most of the other bottle costs are similar in Sweden and the UK. For example, the excise duty that is imposed on alcohol in the UK is currently a fixed £2.16 per bottle of wine, while the Swedish alcohol tax is currently 26 kr (£2.30). However, the UK goods and services tax (VAT) is 20%, compared to the Swedish VAT (moms) of 25% — this government tax significantly offsets the reduced retailer margin in Sweden. Sigh.

Note: The excise rates for alcohol in Sweden and the UK are among the highest in the EU, along with Ireland and Finland (see AAWE). On the other hand, EU goods and services taxes generally vary between 20 and 25%.

Example

The next graph shows the advertized price (on April 14, 2017) of a single bottle of Seghesio Family Vineyards Cortina Zinfandel 2013 (from California), for eight US stores, three UK stores, and Systembolaget. The Swedish price includes delivery to the nearest service point in Sweden (438 shops plus c. 500 drop-off locations), but the others exclude delivery.


The US price depends on the store location, with the highest price being 25% greater than the lowest price. The Swedish price is equal to the maximum US price, while being 5-10% less than the UK prices.

Monday, 5 June 2017

How many 100-point wine-quality scales are there?

In the previous post (How many wine-quality scales are there?) I discussed the range of ratings systems for describing wine quality that use 20 points. However, perhaps of more direct practical relevance to most wine drinkers in the USA is the range of systems that use 100 points (or, more correctly, 50-100 points).

The 100-point scale is used by the most popular sources of wine-quality scores, including the Wine Spectator, Wine Advocate and Wine Enthusiast; and so wine purchasers encounter their scores almost every time they try to purchase a bottle of wine. But how do these scores relate to each other? Using the metaphor introduced in the previous post, how similar are their languages? And what do we have to do to translate between languages?


All three of these popular scoring systems have been publicly described, although I contend that it might be a bit tricky for any of the rest of us to duplicate the scores for ourselves. However, there are plenty of other wine commentators who provide scores without any explicit indication of how they derive those scores. This means that some simple comparison of a few of the different systems is in order.

As explained in the last post, in order to standardize the various scales for direct comparison, we need to translate the different languages into a common language. I will do this in the same manner as last time, by converting the different scales to a single 100-point scale, as used by the Wine Advocate. I will also compare the quality scales based on their scores for the five First Growth red wines of the Left Bank of Bordeaux, as I did last time.

The scales for nine different scoring systems are shown in the graph. The original scores are shown on the horizontal axis, while the standardized score is shown vertically. The vertical axis represents the score that the Wine Advocate would give a wine of the same quality. If the critics were all speaking the same language to express their opinions about wine quality, then the lines would be sitting on top of each other; and the further apart they are, the more different are the languages.

Nine different 100-point wine-scoring systems

There are lots of different lines here, which indicates that each source of scores uses a different scheme, and thus is speaking a different language. Many of the lines are fairly close, however, and thus many of the languages are not all that different from each other. Fortunately for us, they are most similar to each other in the range 85-95 points.

First, note that the line for the Wine Spectator lies exactly along the diagonal of the graph. This indicates that the Wine Advocate and the Wine Spectator are using exactly the same scoring system — they are speaking the same language. In other words, a 95-point wine from either source means exactly the same thing. If they give different scores to a particular wine, then they are disagreeing only about the quality of the wine — this is not true for any other pair of commentators, because in their case a different score may simply reflect the difference in language.

It is worth noting that almost all of the Wine Advocate scores came from Robert Parker, while most of the Wine Spectator's were from James Suckling, along with a few from Thomas Matthews, James Molesworth and Harvey Steiman (who have all reviewed the red wines of Bordeaux for that magazine), plus some that were unattributed.

Second, the line for the Wine Enthusiast always lies below the diagonal of the graph. This indicates that the Wine Enthusiast scores are slightly greater than those of the Wine Advocate (and Wine Spectator) for an equivalent wine. For example, if the Enthusiast gives a score of 80 then Parker would give (in the Advocate) 78-79 points for a wine of the same quality. This situation has been noted in Steve De Long's comparison of wine scoring systems, although it is nowhere near as extreme as he suggests.

Third, the line for Stephen Tanzer always lies above the diagonal of the graph, indicating that his scores are usually slightly less than those of the Wine Advocate (and Wine Spectator). Indeed, a 100-point Parker wine would get only 98-99 points from Tanzer.

All of the other lines cross the diagonal at some point. This indicates that sometimes their scores are above those of the Advocate and sometimes they are below. Interestingly, most of these systems converge at roughly 91 points, as indicated by the dashed line on the graph. So, a 91-point wine means more-or-less the same thing for most of these commentators (except Tanzer and the Enthusiast) — it is the only common "word" in most of the languages!

The most different of the scoring schemes is that of James Suckling, followed by those of Jeannie Cho Lee and Richard Jennings (which are surprisingly similar). Suckling is a former editor of Wine Spectator, and he actually provided most of the scores used here for that magazine — this makes his strong difference in scoring system on his own web site particularly notable, as it implies that he has changed language since departing from the Spectator.

Finally, it is important to recognize that all I have done here is evaluate the similarity of the different scoring systems. Whether the scores actually represent wine quality in any way is not something I can test, although I presume that the scores do represent something about the characteristics of the wines. Nor can I evaluate whether the scores reflect wines that any particular consumer might like to drink, or whether they can be used to make purchasing decisions. Nor can I be sure exactly what would happen if I chose a different set of wines for my comparisons.

Conclusions

The short answer to the question posed in the title is: pretty much one for each commentator, although some of them are quite similar. Indeed, the Wine Spectator and the Wine Advocate seem to use their scores to mean almost the same thing as each other, while the Wine Enthusiast gives a slightly higher score for a wine of equivalent quality.

While there are not as many wine-quality rating systems as there are languages, the idea of translating among them is just as necessary in both cases, if we are to get any meaning. That is, every time a wine retailer plies us with a combination of critics' scores, we have to translate those scores into a common language, in order to work out whether the critics are agreeing with each other or not. Different scores may simply reflect differences in scoring systems not differences in wine quality; and similarity of scores does not necessarily represent agreement on quality.

Averaging the scores from the different critics, as is sometimes done, notably by Wine-Searcher and 90plus Wines, is unlikely to be a valid thing, mathematically. Given the results from this and the previous post (How many wine-quality scales are there?), calculating a mathematical average score would be like trying to calculate a mathematically average language. Jean-Marie Cardebat and Emmanuel Paroissien (American Association of Wine Economists Working Paper No. 180. 2015) have correctly pointed out that the different scoring systems need to be converted to a common score (ie. a common language) before any mathematics can be validly applied to them.

Monday, 29 May 2017

How many wine-quality scales are there?

There are a number of ratings systems for describing wine quality, which use 100 points, 20 points, 5 stars, 3 glasses, etc. Unfortunately, there is usually no "gold standard" for these systems, and so no two wine commentators use these systems in quite the same way.

That is, when critics differ in their wine scores for a particular wine, it can be for one of two reasons: (i) their opinions on the wine's quality differ, or (ii) they are expressing their opinion using different numbers. Consequently, when the critics produce the same score, they may or may not be assessing the wine as having the same quality, and similarly when they produce different scores. Each critic has their own personal version of the "100-point scale" or the "20-point scale".


This situation is similar to people speaking different languages. Simply looking at a word does not necessarily tell you what language is being used, because the same combination of letters can occur in different languages, with or without the same meaning. For example, the word "December" appears in both Swedish and English, and in this case it has the same meaning in both languages. However, the word "sex" also appears in both languages, but in Swedish it usually refers to the number 6, which is not necessarily related to any of the word's possible meanings in English.

So, if the Wine Spectator gives a wine 90 points, does that mean the same thing as when the Wine Advocate gives that same wine 90 points? Probably not. Just for variety, instead of using the 100-point scale to illustrate this topic, I will use the 20-point scale for wine quality — this emphasizes the need to translate the ratings systems to a common one.

20-point ratings systems

Many American wine drinkers are familiar with the 20-point scale developed in the 1950s by Maynard Amerine and his colleagues at the University of California, Davis, intended as a teaching tool for identifying faulty wines. This was, indeed, an attempt to produce a "gold standard" wine rating system. Each organoleptic characteristic of the wine is assigned a number of points based on its perceived quality, and these points are summed to produce the final score. In both theory and practice, everyone who uses the UCDavis scale should be "speaking the same language"; and therefore any differences in wine scores should represent differences in wine quality, not differences in language.

Sadly, not everyone has agreed with or used the UCDavis scale, especially as a general tool for wine tastings; this topic is discussed in detail in recommended books such as those by Clive S. Michelsen (Tasting and Grading Wine. 2005) and Andrew Sharp (Winetaster's Secrets. 2005). So, there are innumerable 20-point scales in use around the world, and they all seem to represent different languages. To illustrate the range of scales in use, we can compare the scores given to the same wines by different critics.

In order to standardize the scales for direct comparison, we need to translate the different languages into a common language. Jean-Marie Cardebat and Emmanuel Paroissien (American Association of Wine Economists Working Paper No. 180. 2015) have suggested doing this by converting the different scales to a single 100-point scale. The one they chose was the scale used by the Wine Advocate (which is not necessarily the same as that used by the Wine Spectator, or the Wine Enthusiast, etc), and I will do the same here. Furthermore, I will compare the quality scales based on their scores for the five First Growth red wines of the Left Bank of Bordeaux (as described in the post How large is between-critic variation in quality scores?).
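In practice, the translation can be estimated by regressing one critic's scores on the reference scores for the wines that both have rated. Here is a minimal sketch in Python, with invented scores standing in for the real data.

```python
import numpy as np

# Invented scores for the same five wines from two critics: one on a 20-point
# scale, the other (the reference) on the Wine Advocate's 100-point scale.
critic_20    = np.array([16.0, 17.5, 18.0, 18.5, 19.5])
advocate_100 = np.array([90, 93, 94, 96, 99])

# Assume a linear "translation" between the two languages, and estimate it
# from the commonly scored wines.
slope, intercept = np.polyfit(critic_20, advocate_100, 1)

def to_advocate(score_20):
    """Translate a 20-point score into its Wine Advocate equivalent."""
    return slope * score_20 + intercept

print(f"18 points corresponds to roughly {to_advocate(18.0):.1f} Advocate points")
```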

The scales for five different commentators are shown in the first graph. The original scores are shown on the horizontal axis, while the standardized score is shown vertically. The vertical axis represents the score that the Wine Advocate would give a wine of the same quality. If the critics were all speaking the same language to express their opinions about wine quality, then the lines would be sitting on top of each other; and the further apart they are, the more different are the languages.

Five different 20-point wine-quality ratings systems

Also shown is the difference in meaning for a wine that gets a score of 18 from each of the critics. If we see a wine score of 18, then La Revue du Vin de France, Jean-Marc Quarin and Bettane et Desseauve mean a somewhat better wine than does Jancis Robinson. On the other hand, Vinum Weinmagazin is indicating a somewhat worse wine. They are, indeed, all speaking different languages; and we readers need to translate between these languages in order to get their meaning.

As another example, at the end of June 2012 Decanter magazine changed from using a 20-point ratings scale to a 100-point scale (see New Decanter panel tasting system). In order to do this, they had to convert their old scores to the new scores. They used a conversion that is precisely halfway between the scoring systems of Jancis Robinson and Bettane & Desseauve, as shown in the next graph (see How to convert Decanter wine scores and ratings to and from the 100 point scale). So, this is yet another different 20-point language.

Seven different 20-point wine-quality ratings systems

So far, I have assumed that there is a linear relationship between the scores from the different critics (ie. the graph lines are straight). However, in an earlier post (Two centuries of Bordeaux vintages) I suggested that the relationship between the Bordeaux scores from Tastet & Lawton and from Jeff Leve (the Wine Cellar Insider) is curved, instead. Indeed, The World of Fine Wine magazine explicitly indicates that their 20-point scoring system is non-linear, as shown in the second graph above. This makes for a very complex language translation, indeed.

As we shall see in the next post (How many 100-point wine-quality scales are there?), translating between 20- and 100-point scales is not straightforward, either.

Conclusions

The short answer to the question posed in the title is: pretty much one for each commentator. Fortunately, there are not quite as many wine-quality rating systems as there are languages. Nevertheless, the idea of translating among them is just as necessary in both cases, if we are to get any meaning.

Does all of this matter in practice? Quite definitely. Indeed, every time a wine retailer plies us with a combination of critics' scores, we have to translate those scores into a common language, in order to work out whether the critics are agreeing with each other or not. Since most of us are not doing this, we may well be fooling ourselves into seeing a false sense of agreement among those critics. The world of fine wine is more complex than most people realize, or would like.

Furthermore, this issue is at the heart of the objections that mathematicians have to simply averaging wine scores across different critics. If the critics are all using different ratings scales, then the average score has no mathematical meaning. That is, if the critics are speaking different languages, then what would the "average" of those languages mean? It would be gibberish, unintelligible to anyone, even if the combination of letters looks like it might be a real word. A classic example of this is the Judgment of Paris, from 1976, in which the "official" summed scores are meaningless, because the tasters were all using different versions of the 20-point scale (see A Mathematical Analysis of The Judgment of Paris). Note also, that the scores using the UCDavis scale are much higher than are the scores for the Judgment (see Was the Judgment of Paris repeatable?).

Monday, 22 May 2017

Lazy journalism

This week marks the first anniversary of this blog. This is an important milestone for most blogs; and I have averaged more than one substantial post per week during that time, as I approach 60 posts. By way of celebration, this post is a bit different to most of the others.

This blog usually deals with wine data in the form of numbers, but there are other forms of data that could be used instead. One of these is industry information, as presented by the media. Sometimes, this is more opinion than properly checked information.

Consider this example from The Fabulous Ladies' Wine Society:
Cumulus Wines is a true child of the 80’s. We reckon the owners must have been listening to UB40's Red, Red Wine on repeat on their walkman when they planted out over 500 hectares of vineyard in the barely known Orange region nearly 30 years ago and built a 10,000 tonne winery with storage capacity for 8 million litres of wine. But obviously they were on to something as it has definitely paid off!
Almost everything written there is nonsense, as also is much of what is said about the same winery at Just Wines.

Cumulus Estate Wines is itself very coy about the company's history, with its description giving the impression of one continuous flow of time. However, this is far from the truth, as indicated by media reports at the time of the various events, such as those from the Newcastle Herald, Chris Shanahan (of the Canberra Times), the Pierpont column (of the Australian Financial Review), the Wine Spectator and Wine Genius. The story is long and convoluted, so here goes.


The company's main vineyard area is south of a small town called Molong, which lies just inside the Orange viticultural area of eastern Australia (the vineyard actually straddles the region's border). The vineyard, called Little Boomey, was established by Peter Poolman in 1995, increasing in size to 508 hectares over the next three years. Capital for the development was raised from hundreds of small investors, with the intention that ownership would revert to Poolman’s company after the investors had leased the vines for 15 years. What was then called Southcorp Wines (Australia's biggest wine company) bought and vinified the majority of Little Boomey’s grape harvests.

The Central Highlands Wine Grape Project, as it was officially called, was actually a tax-driven investment scheme with several vineyard areas. It was merged into a new investment company called Cabonne Limited in 1998, which was publicly listed on the Australian Stock Exchange. In 2001, Cabonne took over Reynolds Wine Company (owned by Jon and Jane Reynolds), and changed its name to Reynolds Wines Limited in 2002. It set up its wine making at Cudal, south of Molong, where a high-tech 10,000-tonne capacity winery had been built.

Reynolds Wines soon went bankrupt, slipping into voluntary administration in August 2003. The problem seems to have been what is euphemistically called an "awkward corporate structure", rather than problems with either the winery or the wine business. The company owed AU$18 million in taxes — presumably, the Australian Tax Office wasn't convinced that the original vineyard schemes were truly tax-deductible (a decision that they also applied to other vineyard small-investor schemes).

At the time, the subscribers to the original tax-minimization schemes apparently still owned, as license holders, the grapevines on Reynolds' three properties (reverting to Reynolds between 2012 and 2018), and also had rights to the wine made from those vines. The wine was concurrently being sold through a joint venture with the Trinchero group, from the Napa Valley in California (currently the fourth biggest winery in the USA, by case sales). As a result, Trinchero Family Estates acquired the Reynolds and Little Boomey brand names early in 2004, but had no interest in buying either the winery's production facility or its 900 hectares of vineyards.

The bankruptcy receiver (appointed by the ANZ Bank) sold the Cudal winery, the adjacent 508 hectares of vineyard and other assets to Cumulus Wines Proprietary Limited for AU$30 million — much less than the AU$130 million that Cabonne is reported to have invested in developing the property. Cumulus agreed to underwrite the bank loan only, which means that none of the investors got their money back, neither the original vine lessees nor those who later invested via stock-exchange shares (ie. the bank came out of this okay, but no-one else did!).

The Cumulus Wine company had been set up in 2004 by an underwriter and insurer called Assetinsure Proprietary Limited (50% owned by investment bank Babcock & Brown), based in Sydney. Philip Shaw (former Southcorp head of production) was appointed as the winemaker to develop the new wine company, focusing on cool-climate grapes from Orange and elsewhere in the Central Ranges viticultural area. The Little Boomey vineyard was re-named Rolling. In 2005, Keith Lambert (another former Southcorp chief executive) acquired a 51% stake in the company.

A worldwide distribution network was established. However, this proved to be overly ambitious, in spite of grants from the Export Market Development Grants Scheme, from the Australian government. So, in 2007 the Berardo wine family, of Portugal, bought the 51% share-holding. The Berardo Group has extensive wine investments in Portugal, via the Bacalhôa Vinhos de Portugal group, a 33% stake in Sogrape (Portugal’s largest wine company), 25% of Henriques & Henriques Lda (of Madeira), and joint ownership of Quinta do Carmo (with Eric de Rothschild, of Château Lafite), as well as owning 50% of Colio Estate Wines (one of Canada’s major wine producers).

This partnership between Assetinsure and the Berardo Group lasted for some time; and in 2013 they launched a new wine sales and distribution company, Epoch Wine Group. [Don't worry, you are now well over half-way through the saga.]


However, in 2015 Cumulus Wines was involved in a scrip-for-scrip merger (ie. shares were exchanged instead of cash) with the trading company Wine Insights Proprietary Limited. This company owns Beelgara Estate, from the Riverina viticultural area (south-west of Orange), as well as making wine from the viticultural areas of Margaret River (Moss Brothers label), Coonawarra (Riddoch Run), Mudgee (Frog Rock), Adelaide Hills (Em’s Table), Clare Valley, McLaren Vale and Yarra Valley, among others. Beelgara Estate was formed in 2001, when a group of shareholders bought the 70-year old Rossetto Wines company, including its winery at Beelbangera, just outside Griffith. This company then merged with Australian Wine Supply in 2004, and the Wine Insights company was created in 2012, following further acquisitions and partnerships (including contract wine-making and bottling, and bulk wine supply).

The Cumulus merger is reported to have created a joint venture producing, per year, more than 400,000 cases of wine and with a gross revenue of AU$20 million. Winetitles Media now ranks Wine Insights as the 15th largest Australian wine company by revenue (and sales of branded wine) and 20th by wine-grape intake.

However, the venture also put the Rossetto winery, at Beelbangera, up for sale, because the merged group chose to centralize its wine production at the Cumulus winery, at Cudal. This seems to mean that the Riverina grapes are now going to be transported 350 km to be processed (at Cudal), rather than being processed locally (at Beelbangera). Much worse, the Margaret River grapes would be transported 4,000 km for processing, the Coonawarra and Adelaide Hills grapes would be trucked 1,000 km, etc. Environmentally friendly this would not be (with a large carbon footprint), although the accountants must love it.

That's it, for the moment. Nothing stays the same for long in the world of Australia's large wine companies. But the next time you read a media report about some wonderful winery, you should wonder about the reality behind it.

Monday, 15 May 2017

Opus One, and the argument for varietal diversity

The red wines from Bordeaux contain one or more of several grape varieties: Cabernet sauvignon, Cabernet franc, Merlot, Malbec and Petit verdot. (They used also to contain Carménère, but that grape is now rare in Bordeaux.) When Robert Mondavi and Philippe de Rothschild decided to make a Bordeaux-style wine from Napa-grown grapes, they naturally used these same varieties.

This wine has been known as Opus One, with its first vintage in 1979. It was the first ultra-premium wine from the USA, the California equivalent of a Bordeaux first growth, intended as a benchmark for the wines produced from cabernet grapes in the Napa Valley. It has struggled to maintain that reputation, as it has been persistently criticized for inconsistency from vintage to vintage. Certainly, other wines have surpassed it in price and/or reputation (e.g. Ridge Monte Bello has a similar Bordeaux-style aim), although they all sell considerably fewer than the 25,000 annual cases of Opus One.

This inconsistency bears looking into. I contend that it has at least something to do with the variation in grape varieties.


The wine started out as a blend of mainly cabernet sauvignon, along with some cabernet franc and merlot. Then, malbec was added to the blend in 1994, and petit verdot was added from 1997 onwards. The proportion of these grape varieties in the wine has varied from year to year, as determined by the winemakers. The winemakers were Tim Mondavi and Lucien Sionneau from 1979–1984, and Tim Mondavi and Patrick Léon from 1985–2000, with Genevieve Janssens assisting from 1991–1997. Michael Silacci has been the chief winemaker since 2001, early on with either Tim Mondavi or Philippe Dhalluin, but alone since 2004.

In this blog post I wish to look at the variation through time in the diversity of the grape varieties within the wine. A number of mathematical measurements of diversity have been developed in science, for making precisely this sort of comparison. The idea is to reduce the proportions of the various grape varieties down to a single number (for each vintage) that quantifies their diversity, from a single grape variety at one mathematical extreme to equal amounts of each grape variety at the other extreme.

The one I will use here is called the Shannon Diversity Index (see Wikipedia). This Index will be a number between 0 (for a single grape variety) and the natural logarithm of 5 (for equal amounts of each of the 5 varieties). The data for each vintage come from the Opus One web site. The variation in Shannon diversity is shown in the first graph, with the vintages plotted horizontally and the diversity plotted vertically.

Grape diversity through time for the Opus One wine

This graph shows that there was a lot of variability between the first few vintages, while the winemakers worked out what wine style they were aiming for. Furthermore, from the early 1990s onwards the diversity has steadily increased. This has been partly the result of using five grape varieties, as opposed to the original three, but it is mainly a result of using greater proportions of the minor varieties. In the early years, there were vintages composed of >95% cabernet sauvignon, but over the past 10 years it has been closer to 80%. For the rest of the grapes, it has been c.7% merlot, c.6% cabernet franc, c.6% petit verdot, and c.1% malbec.
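For anyone who wants to reproduce the calculation, here is a minimal Python sketch of the Shannon Diversity Index, applied to the approximate recent blend just quoted (the proportions are the rounded figures from the text, not the exact ones from the Opus One web site):

```python
import math

def shannon_index(proportions):
    """Shannon Diversity Index: H = -sum(p * ln(p)), ignoring zero entries."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

# Approximate recent Opus One blend (rounded figures quoted above):
# cabernet sauvignon, merlot, cabernet franc, petit verdot, malbec
recent_blend = [0.80, 0.07, 0.06, 0.06, 0.01]

# A single-variety wine and a perfectly even 5-way blend, for comparison
single_variety = [1.0]
even_blend = [0.2] * 5

print(shannon_index(single_variety))  # 0.0
print(shannon_index(recent_blend))    # roughly 0.75
print(shannon_index(even_blend))      # ln(5), roughly 1.61
```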

Having established that the winemakers have been moving towards a greater diversity of grape varieties in Opus One, we can now ask whether this has improved the wine quality in the eyes of the drinkers. There have, of course, been a number of retrospective tastings of the vintages of Opus One, which is getting closer to its 40th vintage. It therefore seems worthwhile to see whether the quality scores given to these wines are associated in any way with the particular mixture of grape varieties that have been included in the wine over the years.

The most complete vertical tasting that I have been able to find is that of Antonio Galloni, from 2013, which included all of the vintages from 1979–2010. Sadly, the best vintage of all has been suggested to be the 2013, which misses out. In the next graph I have plotted Galloni's quality scores (vertically) against the Shannon diversity (horizontally), with each point representing a single vintage.

Wine quality and grape diversity for the Opus One wine

The graph shows a general increase in quality score with increasing diversity, with four exceptions (as labeled in the graph). Excluding these four vintages for the moment, a correlation analysis shows that 42% of the variation in the wine quality score is associated with the grape diversity score. That is, increasing the diversity of the grape varieties in the wine has generally improved the quality, which is presumably what the winemakers have intended.
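For readers who want to see how a figure like that 42% arises, it is simply the square of the Pearson correlation coefficient between the two sets of numbers. The sketch below uses invented values only; the real calculation would pair Galloni's scores with the per-vintage diversities computed above:

```python
import numpy as np

# Hypothetical example values only -- the real analysis pairs Galloni's
# quality scores with the per-vintage Shannon diversities.
diversity = np.array([0.30, 0.45, 0.55, 0.60, 0.70, 0.75, 0.80])
quality   = np.array([88,   90,   91,   92,   93,   92,   95  ])

# Pearson correlation; its square is the proportion of the variation in
# the quality scores that is associated with the diversity scores.
r = np.corrcoef(diversity, quality)[0, 1]
print(f"r = {r:.2f}, r-squared = {r**2:.2f}")
```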

The four exceptions are instructive. The 1980, 1984 and 1987 vintages consisted almost entirely of cabernet sauvignon (>95%), and this has obviously been a very erratic strategy in terms of wine quality (sometimes it worked and sometimes it didn't). Furthermore, the 1980 and 1984 vintages consisted solely of cabernet sauvignon and cabernet franc, with no merlot at all. On the other hand, the 2006 vintage had the highest proportion of merlot yet, at 12%, which is double the usual amount. This created a high diversity value but obviously not a high quality score from Galloni. The winemakers again tried such a high proportion of merlot for the 2011 vintage (11%), and Galloni's preliminary score for the resulting wine (not yet released) indicated that he didn't think it had worked then, either.

This pattern could, of course, be unique to Antonio Galloni — I have repeatedly pointed out that wine critics rarely agree much with each other about wine quality (see How large is between-critic variation in quality scores?). However, few of the other professional commentators have conducted extensive vertical tastings of Opus One. So, by way of comparison, let's look at the opinions of a group of non-professionals.

In 2002, Bob Henry, a wine marketer from California, conducted a group tasting of the first 20 vintages of Opus One (1979–1998). Each of the 23 tasters was asked to rank their top three wines, with 3 points being assigned to the top wine, 2 points to the second wine, and 1 point to the third wine. These scores were then summed across the tasters, in order to rank the quality of the vintages. These results are compared to those of Galloni in the next graph, with each point representing one of the 20 wines.
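This is essentially a truncated Borda count. Here is a minimal sketch of the tallying, using invented ballots rather than the actual 2002 results:

```python
from collections import Counter

# Each taster lists their top three vintages, best first.
# These ballots are invented purely to illustrate the tallying;
# they are not the actual results from the 2002 tasting.
ballots = [
    ["1991", "1987", "1996"],
    ["1996", "1991", "1985"],
    ["1991", "1996", "1987"],
]

points = Counter()
for ballot in ballots:
    for rank, vintage in enumerate(ballot):   # rank 0, 1, 2
        points[vintage] += 3 - rank           # 3, 2, 1 points

# Vintages ranked by total points
for vintage, total in points.most_common():
    print(vintage, total)
```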

Quality scores for vertical tastings of the Opus One wine

As you can see, only nine of the wines scored any points (ie. were ranked in the top 3 by at least one taster). Most of these wines were also high-scoring wines for Galloni, and so we can treat this as a general confirmation of his scores. However, note that the 1980 vintage, which had a low grape-diversity score but still received 95 points from Galloni anyway, was not a high-scoring wine for the tasting group. This means that only the 1987 wine scored points but had a low grape-variety diversity. Indeed, the 1991 vintage was the only high-scoring wine before the introduction of malbec and petit verdot to the mix.

Conclusion

In biology (including agriculture), diversity is considered to be a Good Thing. Here, the Opus One wine seems to support this idea, as increasing diversity of grape varieties is associated with higher quality wines. Furthermore, the winemakers have been steadily increasing this diversity with each succeeding vintage. This is a strong argument for varietal diversity in wines. If nothing else, this helps explain the wine's reputation for inconsistency — poor vintages have generally arisen from reliance on too few grape varieties.