Monday, 20 November 2017

Wine collector fraud, and wine snobbery

The New Testament gospels warn us about the danger of putting new wine into old wineskins. This was a religious parable of Jesus, with several possible interpretations; but it has taken on a very different relevance in the modern world, with increasing incidences of collector fraud involving wines.

Counterfeit wine has been much in the media in recent weeks (eg. Wine maven Kurniawan, convicted of fraud, loses bid for freedom ; Billionaire Koch brother's crusade against counterfeit wine ; Why it’s so hard to tell if a $100,000 bottle of wine is fake ; Napa wine merchant accused of fraud in client's lawsuit). We have even gotten to the stage where there is fake news about allegedly fake wines (Penfolds hit by fake wine claims).

Discussion of these topics seems to range from outrage at the fraudster, through fascination with how it's done, to wondering how much of it has been done (eg. $100 million of counterfeit wine in circulation ; 20% of all wine in the world is fake). Among all of these news stories and commentaries, there is one general point that seems not to have been emphasized: wine collector fraud and wine consumption fraud are two different things. Furthermore, wine collector fraud requires a combination of massive wealth and massive snobbery on the part of the collectors; if there were no people with this combination of characteristics, then collector frauds would not even be conceived, let alone perpetrated.

There are two types of wine fraud

Fraud directed against wine collectors is a rather different thing from most other frauds, which are usually grouped as consumption fraud rather than collector fraud. Far too much of the wine discussion has failed to distinguish these two types of fraud, which are clearly described by, for example, Lars Holmberg (2010. Wine fraud. International Journal of Wine Research 2: 105-113). The difference is very important, because consumers and collectors are very different people. The main purpose of this blog post is to call attention to this distinction.

Consumption wine fraud is usually directed at inexpensive or mid-price wines, and includes things like: misrepresenting the grape variety, grape origin or alcohol content; adulterating the wine with sugar, water, coloring, flavors, or something much worse (like glycol or lead); and running a retail Ponzi scheme. These things can be done on a large scale, and they potentially affect all consumers. Collector fraud, on the other hand, usually involves luxury wines, and is directed almost solely at individuals with more money to pay for the wine than they have technical ability to correctly identify that wine.

In the latter case, irrespective of what we may feel about the fraudster, we should recognize that the collectors who bought the wines are ultimately victims of their own snobbery, and of having the wealth to display that snobbery. Anyone who spends tens of thousands of dollars on a bottle of wine can only be doing so for the snob value of having people know that they did this (Campbell Mattinson: "the rich and powerful need something rich and powerful to spend their money on"). These are wine investors, not wine drinkers, and so we are actually talking about wine investment fraud, which is not too dissimilar to art investment fraud. This is a far cry from consumption frauds directed at wine drinkers in general.

Wine can be a good financial investment, of course, but only if you can authenticate the wine. This is a very hard and expensive thing to do. Perhaps these investors might consider some alternative means of disposing of their massive wealth? There are plenty of people besides fraudsters who would like the opportunity to make good use of the money; and many of these people actually perform publicly useful services, rather than the solely private one of enhancing investor snobbery.

Interestingly, there seems to have been no diminution of the prices of rare wines, in spite of all of the fuss about collector fraud (Q&A: François Audouze, wine collector). This illustrates the illogicality of luxury wine prices.


Wine snobbery comes in many guises. Snobs are conventionally considered to be those people who value exclusivity and status above everything else. However, there are alternative ideas about this characterization. For example, Jeany Miller (The parasitic nature of the wine fraud) has suggested that: “Wine snob is an affectionate term for people who understand and enjoy wine." This may be giving the real snobs a bit too much credibility, but it does emphasize the wide-ranging nature of the term. In particular, not all wine snobs have massive wealth, although a certain level of financial liquidity is obviously required. Snobbery on its own is usually relatively harmless, but combining it with increasing wealth is simply asking for increasing amounts of trouble.

Wine snobbery has been a topic of discussion for quite a while. For example, whole books on the topic have been around since the 1980s, varying from the humorous (The Official Guide to Wine Snobbery, by Leonard S. Bernstein, 1982) to the very serious (Wine Snobbery: an Insider's Guide to the Booze Business, by Andrew Barr, 1988).

Barr, in particular, describes how a large section of the drinks industry relies on snobbery for its profitability. Luxury wines cost an arm and a leg (see The cost of luxury wines), but they are not much better in quality than wines costing a tenth of the price (see Luxury wines and the relationship of quality to price). It takes snobbery and wealth to get involved in this segment of the refreshments business.


Fortunately for those of us who understand and enjoy wine, and therefore might conceivably be considered snobs, there is another segment of wine snobbery that requires expertise rather than wealth — knowing about little-known wines and regions requires time and effort, but not necessarily wealth. For example, few Americans know much about Australian wine, and yet Australia is a continent as well as a country, and it therefore has as wide a diversity of wine regions and wines as any other continent. Wine writers are often lazy, and they treat "Australia" as a single wine region, just as they do for any of the much smaller countries of South America or Europe, in spite of its greater vinous diversity than most of these other countries. You can get a lot of snob value out of knowing more about Australian wine than just shiraz! (Some examples: So much more than “just shiraz”! ; Why there's more to Australian wine than chardonnay ; Alternative Australian wines.)

Wine Cellar, Park Hotel

Old bottles of wine also provide snob value, of course, but they can often do this without much monetary expenditure. In Europe, old wine is available on eBay, but massive wealth is not usually to be found there — the wealthy shop elsewhere than eBay (or Amazon). Snobbery is available on eBay, like anywhere else, but it is not massive — there is little snob value to be gained from saying that you shop on eBay. But turning up to dinner with an old bottle of wine does not require that you tell anyone where you got it!

Consumer wine fraud has been detected involving some relatively inexpensive wines, as well as the more newsworthy expensive ones, and so caveat emptor always applies, on eBay as much as anywhere else. However, on eBay it is much more likely that an old bottle of wine will be undrinkable, rather than that it will be drinkable but not what the label says it is. Poor storage of old bottles is a far bigger risk than is a problematic pedigree. It is for this reason that reputable sellers on eBay emphasize that you are buying the bottle not its contents.

Perhaps that is a warning we should put on all old bottles, no matter what their price or provenance?
You are buying the snob value of the label, not the wine — pay accordingly, and don't complain.

Monday, 13 November 2017

CellarTracker wine scores are not impartial

Among other things, CellarTracker consists of a community-driven database of wines, along with comment notes on the tasting of those wines, and often also a quality score assigned at each tasting. Some time ago, Tom Cannavan commented:
The current thinking seems to be that the "wisdom of crowds" (CellarTracker, TripAdvisor, Amazon Reviews) is the most reliable way to judge something; but that thinking is deeply flawed. It's not just the anonymity of those making the judgements, and the fact that they may or may not have experience, independence, or have an agenda, but that crowd behaviour itself is far from impartial. We've all experienced people "tasting the label"; but there is also no doubt that members of the crowd allow their judgement to be skewed by others. That's why in big groups scores always converge toward the safe middle ground.
So, we can treat this as Proposition #1 against the potential impartiality of the CellarTracker wine-quality scores.

For Proposition #2, I have previously asked Are there biases in community wine-quality scores? In answering that question I showed that CellarTracker users have (for the eight wines I examined) the usual tendency to over-use quality scores of 90 at the expense of 89 scores.

For Proposition #3, Reddit user mdaquan has suggested:
Seems like the CellarTracker score is consistently lower than multiple professional reviewers on a consistent basis. I would think that the populus would trend higher, [and] not be as rigorous as "pro" reviewers. But consistently the CT scores are markedly lower than the pros.
So, we have three different suggestions for ways in which the wine-quality scores of the CellarTracker community might be biased. This means that it is about time that someone took a detailed look at the CellarTracker quality scores, to see how much bias is involved, if any.

The quality scores assigned by some (but not all) of the CellarTracker community are officially defined on the CellarTracker web site: 98-100 A+ Extraordinary; 94-97 A Outstanding; 90-93 A– Excellent; 86-89 B+ Very Good; 80-85 B Good; 70-79 C Below average; 0-69 D Avoid. However, the "wisdom of crowds" never follows any particular formal scheme, and therefore we can expect the users to each be doing their own thing.
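As a small illustration, the official bands quoted above can be written as a lookup. This is only a sketch: the function name `cellartracker_band` and the data layout are my own choices for illustration, and are not part of any CellarTracker interface.

```python
# Score bands as quoted above from the CellarTracker web site.
# The names BANDS and cellartracker_band are illustrative assumptions.
BANDS = [
    (98, "A+", "Extraordinary"),
    (94, "A", "Outstanding"),
    (90, "A-", "Excellent"),
    (86, "B+", "Very Good"),
    (80, "B", "Good"),
    (70, "C", "Below average"),
    (0, "D", "Avoid"),
]


def cellartracker_band(score):
    """Return (grade, description) for a score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Bands are listed from highest to lowest, so the first match wins.
    for lower, grade, label in BANDS:
        if score >= lower:
            return grade, label


print(cellartracker_band(89))
print(cellartracker_band(90))
```

Note that a single point, from 89 to 90, moves a wine from one band to the next.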

But what does that "thing" look like when you pool all of the data together, to look at the community as a whole? This is a daunting question to answer, because (at the time of writing) CellarTracker boasts of having "7.1 million tasting notes (community and professional)". Not all of these notes have quality scores attached to them, but that is still a serious piece of Big Data (see The dangers of over-interpreting Big Data). So, I will look at a subset of the data, only.

This subset is from the study by Julian McAuley & Jure Leskovec (2013. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. Proceedings of the 22nd International Conference on the World Wide Web, pp. 897-908). This dataset contains the 2,025,995 review notes entered and made public by the CellarTracker users up until October 2012. I stripped out those notes without associated quality scores; and I then kept those notes where the wine was stated to have been tasted between 2 October 2002 and 14 October 2012. This left me with 1,569,655 public quality scores (and their associated tasting date), which covers the first 10 years of CellarTracker but not the most recent 5 years.
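The filtering step just described might look something like the following sketch. I am assuming here that each note has been reduced to a (tasting date, score) pair, with `None` standing for a missing value, and that both end dates are included; the record layout and the name `keep_scored_notes` are hypothetical, not the format of the published dataset.

```python
from datetime import date

# Study window as stated in the text; treating both ends as inclusive
# is an assumption.
START = date(2002, 10, 2)
END = date(2012, 10, 14)


def keep_scored_notes(notes, start=START, end=END):
    """Keep notes that have a score and a tasting date within [start, end].

    `notes` is an iterable of (tasting_date, score) pairs, where either
    value may be None (hypothetical layout, for illustration only).
    """
    return [
        (tasted, score)
        for tasted, score in notes
        if score is not None and tasted is not None and start <= tasted <= end
    ]


sample = [
    (date(2001, 5, 1), 88),      # tasted before the window
    (date(2005, 12, 24), None),  # note without a score
    (date(2005, 12, 25), 91),
    (date(2012, 10, 14), 90),    # last day of the window
]
kept = keep_scored_notes(sample)
print(len(kept))
```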

Time patterns in the number of scores

The obvious first view of this dataset is to look at the time-course of the review scores. The first graph shows how many public user quality scores are represented for each month of the study period.

Time-course of CellarTracker wine-quality scores 2002-2012

CellarTracker was designed in 2003 and released in 2004; therefore, all wines before that time have been retrospectively added. So, the graph's time-course represents recorded tasting time, not time of entry into the database, although the two are obviously related. The database reached its maximum number of monthly scored wines at the beginning of 2011, after which it remained steady. The dip at the end of the graph is due to the absence of wines that were tasted before the cutoff date but had not yet been added to the database at that time.

The annual cycle of wine tasting is obvious from 2005 onwards — the peak of tasted wines is at the end of each year, with a distinct dip in the middle of the year. This presumably represents wine consumption during the different northern hemisphere seasons — wine drinking is an early winter thing.
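For anyone wanting to reproduce this kind of time-course, the monthly tally is a simple count by tasting month. The sketch below assumes the same hypothetical (tasting date, score) pairs as input, and the three sample notes are made up.

```python
from collections import Counter
from datetime import date


def scores_per_month(notes):
    """Count scored notes for each (year, month) of tasting."""
    counts = Counter((tasted.year, tasted.month) for tasted, _score in notes)
    # Sort by (year, month) so the result reads as a time-course.
    return dict(sorted(counts.items()))


# Made-up notes, for illustration only.
notes = [
    (date(2010, 12, 3), 90),
    (date(2010, 12, 20), 88),
    (date(2011, 6, 11), 91),
]
monthly = scores_per_month(notes)
print(monthly)
```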

The quality scores

The next graph shows the frequency (vertically) of the wine-quality scores (horizontally). This should be a nice smooth distribution if the quality scores are impartial; any deviations might be due to any one of the three propositions described above. Although it is somewhat smooth, this distribution shows distinct peaks and dips.

CellarTracker wine-quality scores 2002-2012

For the lower scores, there are distinct peaks at scores of 50, 55, 60, 65, 70, 75, and 80. This is not unexpected, as wine tasters are unlikely to be interested in fine-scale differences in wine quality at this level, or even be able to detect them.

For the scores above 80, 57% of the scores are in the range 88-92. If we are expecting some sort of mathematically average score for wines, then these data make it clear that it is a score of 89-90. That is, the "average" quality of wine consumed by the CellarTracker community is represented by a score of c. 90, with wines assessed as being either better or worse than this.

However, a quality score of 90 shows a very large peak compared to a score of 89, exactly as discussed under Proposition #2 above. I have previously reported this fact for both professional (Biases in wine quality scores) and semi-professional (Are there biases in wine quality scores from semi-professionals?) wine commentators, as well as the CellarTracker community. So, there is apparently nothing unusual about this, although it could be seen as questioning the general utility of wine-quality scores. If subjective factors make people use 90 in preference to 89, then what exactly is the use of a score in the first place?

Moving on, we now need to look for other possible biases in the scores. In order to evaluate whether any of the scores are biased, we need an unbiased comparison. As I explained in my first post about Biases in wine quality scores, this comes from an "expected frequency distribution", also known as a probability distribution. As before, it seems to me that a Weibull distribution is suitable for wine-score data.

This Weibull expected distribution is compared directly with the observed frequency distribution in the next graph. In this graph, the blue bars represent the (possibly biased) scores from CellarTracker, and the maroon bars are the unbiased expectations (from the probability model). Those scores where the heights of the paired bars differ greatly are the ones where bias is being indicated.

Biases in CellarTracker wine-quality scores 2002-2012

This analysis shows that quality scores of 88, 89, and 90 are all over-represented, while scores of 93, 94, and 95 are under-represented, compared to the expectation. This indicates that the CellarTracker users are not giving as many high quality scores as expected, but are tending to give too many scores of 88-90, so that scores are skewed towards values just below 90 rather than just above.

This is exactly what was discussed under Proposition #3 above, where the professionals seem to give somewhat higher scores when the same wines are compared. Furthermore, it is in line with Proposition #1, as well, where the community scores simply converge on a middle ground — a CellarTracker score is more likely to be in the small range 88-90, rather than most other numbers.

Furthermore, quality scores of 81, 83, and 86 are also under-represented, according to the analysis. This creates a clustering of the lower scores at certain values. Presumably, the tasters are not bothering to make fine distinctions among wines below their favorite scores of 88-90.
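The comparison of observed and expected frequencies can be sketched as follows. This is not the analysis used for the graph: the shape and scale values are invented rather than fitted, the observed counts are made up, and treating each integer score s as the bin from s - 0.5 to s + 0.5 is my own assumption. It only shows how an expected count comes out of a Weibull model once its two parameters are known.

```python
import math


def weibull_cdf(x, shape, scale):
    """Cumulative probability of a two-parameter Weibull distribution."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def expected_counts(scores, total, shape, scale):
    """Expected number of notes at each integer score.

    Each score s is treated as the bin [s - 0.5, s + 0.5), which is an
    assumption made for this sketch.
    """
    return {
        s: total * (weibull_cdf(s + 0.5, shape, scale)
                    - weibull_cdf(s - 0.5, shape, scale))
        for s in scores
    }


def deviations(observed, shape, scale):
    """Observed minus expected count; positive means over-represented."""
    # In a real analysis the total would cover every score, not a subset.
    total = sum(observed.values())
    expected = expected_counts(observed.keys(), total, shape, scale)
    return {s: observed[s] - expected[s] for s in observed}


# Made-up counts and made-up parameters, for illustration only.
observed = {87: 90, 88: 120, 89: 110, 90: 180, 91: 120, 92: 100}
for score, diff in deviations(observed, shape=40.0, scale=91.0).items():
    print(score, round(diff, 1))
```

Scores with a large positive difference are the ones that would show up as over-represented in a paired-bar graph of this kind.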

Time patterns in the quality scores

We can now turn to look at the time-course of the wine-quality scores. This next graph shows the average quality score for the wines tasted during each month of the study.

Average CellarTracker wine-quality scores 2002-2012

The average score was erratic until mid 2005, which is when the number of wines (with scores) reached 3,000 per month. So, that seems to be the number of wine scores required to reliably assess the community average.
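A monthly average with this kind of sample-size check might be sketched as below. The input is again assumed to be hypothetical (tasting date, score) pairs, and the 3,000 threshold is simply the figure read off the graph above, used here as a default.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_mean_scores(notes, min_count=3000):
    """Return {(year, month): (mean score, has at least min_count scores)}."""
    by_month = defaultdict(list)
    for tasted, score in notes:
        by_month[(tasted.year, tasted.month)].append(score)
    return {
        month: (mean(scores), len(scores) >= min_count)
        for month, scores in sorted(by_month.items())
    }


# Made-up notes; min_count is lowered so the flag is visible on a tiny sample.
notes = [
    (date(2004, 1, 5), 80),
    (date(2004, 1, 9), 90),
    (date(2004, 2, 1), 88),
]
averages = monthly_mean_scores(notes, min_count=2)
print(averages)
```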

From 2007 to 2009 inclusive, the average quality score was c. 88.5, although there was a clear annual cycle of variation. Notably, after 2009 the average quality score rose to >89. Does this represent the proverbial score inflation? Or perhaps it is simply the community maturing, and starting to give scores more in line with those of the professionals (which are higher)?

To try to assess this, the final graph shows the time-course of the proportion of scores of 95 or above. Many of the professional reviewers have been accused (quite rightly) of over-using these very high scores, compared to the situation 20 years ago, and so we can treat this as an indication of score inflation.

High CellarTracker wine-quality scores 2002-2012

This graph shows no post-2009 increase in the proportion of very high scores. So, the increase in the average CellarTracker quality score does not represent an increased usage of very high scores, but is instead a general tendency to assign higher scores than before. Or perhaps it represents the community drinking better wines?
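The proportion of very high scores per month is an equally simple calculation; here is a minimal sketch, again assuming hypothetical (tasting date, score) pairs and made-up sample values.

```python
from collections import defaultdict
from datetime import date


def high_score_share(notes, threshold=95):
    """Return {(year, month): fraction of scores at or above threshold}."""
    totals = defaultdict(int)
    high = defaultdict(int)
    for tasted, score in notes:
        month = (tasted.year, tasted.month)
        totals[month] += 1
        if score >= threshold:
            high[month] += 1
    return {month: high[month] / totals[month] for month in sorted(totals)}


# Made-up notes, for illustration only.
notes = [
    (date(2010, 12, 1), 96),
    (date(2010, 12, 2), 90),
    (date(2010, 12, 3), 95),
    (date(2010, 12, 4), 89),
]
shares = high_score_share(notes)
print(shares)
```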

Finally, it is worth pointing out the annual cycle in the average scores and in the proportion of very high scores. The annual peak in quality scores is in December. That is, wines get higher scores in December than at most other times of the year. I hope that this represents people buying better wines between All Hallows Day and New Year, rather than drinking too much wine and losing their sense of values!


All three predicted biases in the CellarTracker wine-quality scores are there! The community scores are generally lower than expected, they cluster in a smaller range around the average than expected, and a score of 90 is over-used compared to 89. There are also very distinct seasonal patterns, not only in the number of wines tasted but also in the scores assigned to them.

These conclusions are not necessarily unexpected. For example, Julian McAuley & Jure Leskovec (cited above) noted: "experienced users rate top products more generously than beginners, and bottom products more harshly." Furthermore, Omer Gokcekus, Miles Hewstone & Huseyin Cakal (2014. In vino veritas? Social influence on ‘private’ wine evaluations at a wine social networking site. American Association of Wine Economists Working Paper No. 153) have noted that community scores are normative (= "to conform with the positive expectations of another") rather than informational (= "to accept information obtained from another as evidence about reality").

Monday, 6 November 2017

The dangers of over-interpreting Big Data (in the wine business)

In order to understand complex sets of information, we usually summarize them down into something much simpler. We extract what appear to be the most important bits of information, and try to interpret that summary. Only the simplest pieces of information can be left alone, and grasped on their own. This creates an inherent problem — data summaries also leave information out, and that information may actually be very important. Sadly, we may never find this out, because we left the information out of the summary.

Clearly, the biggest danger with what are known in the modern world as Big Data is that, in order to understand it, we first turn it into Small Data by ignoring most of it. That is, the bigger the dataset then the more extreme is the summary process, because of our desire to reduce the complexity. Data summaries tend to be all the same size, no matter how big the original dataset was. Unfortunately, most of the discussion about Big Data has involved only the technical aspects, along with the optimistic prospects for using the data, without much consideration for the obvious limitations of data summarizing.

One of the most common ways that we have historically used to summarize data is to organize the data into a few groups. We then focus on the groups, not on the original data. In this post, I will discuss this in the context of understanding wine-buying customers.


By summarizing data, we are looking for some sort of mathematical structure in the dataset. That is, we are looking for simple patterns, which might then mean something to us, preferably in some practical sense.

Putting the data into groups is one really obvious way to do this; and we have clearly been doing it for millennia. For example, we might group plants as those that are good to eat, those that are poisonous, those that are good as building material, etc.

The biggest limitation of this approach is that we can end up treating the groups as real, rather than a mathematical summary, and thus ignore the complexity of the original data. For example, groups can overlap: a plant can be both poisonous and good for making house walls, and focusing on one group or the other can make us forget this.

Groups can also be fuzzy, which means that the boundaries between the groups are not always clear. Dog breeds are a classic example — pure-bred dogs clearly fit into very different groups, and we cannot mistake one breed for another. But dogs of mixed parentage do not fit neatly into any one group, although we often try to force them into one by emphasizing that they are mostly from one breed or another. That is, the breeds are treated as real groups, even though they overlap, and thus are not always distinct.

Examples of grouping

Let's consider two examples, one where the groups might make sense and one where they are more problematic.

When considering customers, one obvious grouping of people is gender, male versus female. In science, this is simply a genetic grouping (based on which genes you have), but elsewhere it is usually treated as also being a behavioral grouping. Businesses are therefore interested in what any gender-associated differences in behavior might mean for them.

Consider this example of using Twitter hashtags to quantify gender differences: The hard data behind how men and women drink. The data come from "half a million tweets collected over the course of a year (June 2014 - July 2015), with the gender detected from the first name of the tweeter." The first graph shows the frequency of 104 drink-related hashtags, arranged according to how often they originated from male versus female tweeters.

Note that no hashtags are used exclusively by either males or females — indeed, only two exceed 80% gender bias (homebrew, malt). Equally, no hashtags are used equally by males and females — the closest are: cachaca, patron, caipirinha. We thus might be tempted to recognize two groups, of 40 "female" words and 64 "male" words.

However, we have to be careful about simply confirming our starting point. We pre-defined two groups that represent observed differences (in genetics), and then we have demonstrated that there are other differences (in behavior). The data are essentially continuous, with some words having less than 47% vs. 53% gender distinction. In this case, gender still forms indistinct groups.
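To see how a two-group label hides the continuous data, consider this sketch. The tags `tag_a` and `tag_b` and their counts are invented, and labelling by simple majority is my assumption about how one might arrive at "female" and "male" words; it is not taken from the original study.

```python
from collections import Counter


def female_share(tweets):
    """Return {hashtag: fraction of tweets that came from female tweeters}."""
    totals = Counter()
    female = Counter()
    for hashtag, gender in tweets:
        totals[hashtag] += 1
        if gender == "F":
            female[hashtag] += 1
    return {tag: female[tag] / totals[tag] for tag in totals}


def label_by_majority(shares):
    """Collapse each share to a group label (an exact tie would count as male)."""
    return {tag: ("female" if share > 0.5 else "male")
            for tag, share in shares.items()}


# Invented data: tag_a is 85% male, tag_b is only 53% male.
tweets = (
    [("tag_a", "M")] * 17 + [("tag_a", "F")] * 3
    + [("tag_b", "M")] * 53 + [("tag_b", "F")] * 47
)
shares = female_share(tweets)
labels = label_by_majority(shares)
print(shares)
print(labels)
```

Both tags end up labelled "male", even though one is strongly skewed and the other is very nearly balanced.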

Moving on, this situation becomes even more complex when we start to consider situations with many possible groups, based simultaneously on lots of different characteristics. In an earlier post, I discussed the mathematical technique of using ordinations to summarize this type of data (Summarizing multi-dimensional wine data as graphs, Part 1: ordinations).

This next graph shows an example of the resulting data summary, called an ordination diagram. If each point represents a person, then the spatial proximity of the points would represent how similar they are. So, points close together are similar based on the measured characteristics, while points progressively further apart are progressively more different.

This ordination diagram does not contain any obvious groups of people — they are spread pretty much at random. However, that does not mean that we cannot put the people into groups! Consider this next version of the same diagram, in which the points are now colored. The five different colors represent five groups, one in each corner of the diagram and one in the center.

Clearly, these groups do not overlap. More to the point, the centers of each group are quite distinct. Thus, the groups do have meaning as a summary of the data — combining the descriptions of each group of people would create an easily interpreted summary of the whole dataset.

However, these are fuzzy groups — the boundaries are not distinct, and the groups of people are not discrete. Thus, I am also losing a lot of information, as I must in a summary of complex data; and I need to care about that lost information as well. I cannot treat the groups as being real — they are a convenience only. As a technical aside, it is worth noting that the groups are not an illusion — they are an abstraction.
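The idea of distinct centers with fuzzy boundaries can be sketched with a nearest-center rule. The five centers below (four corners and the middle of a unit square) and the group names are invented to mimic the colored diagram; they are not the coordinates of any real ordination. The "margin" is the difference between the distances to the nearest and second-nearest centers, so a small margin marks a point near a boundary.

```python
import math

# Hypothetical group centers: four corners and the middle of a unit square.
CENTERS = {
    "NW": (0.0, 1.0),
    "NE": (1.0, 1.0),
    "SW": (0.0, 0.0),
    "SE": (1.0, 0.0),
    "centre": (0.5, 0.5),
}


def assign(point, centers=CENTERS):
    """Return (nearest group, margin) for a 2-D point.

    margin = distance to second-nearest center minus distance to nearest.
    """
    dists = sorted((math.dist(point, c), name) for name, c in centers.items())
    (d1, nearest), (d2, _second) = dists[0], dists[1]
    return nearest, d2 - d1


print(assign((0.05, 0.95)))  # deep inside one group: large margin
print(assign((0.26, 0.74)))  # close to a boundary: small margin
```

Every point gets a group, but the margin shows how arbitrary that assignment is for the people near the boundaries.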

The point of this blog post is to make it clear that this problem must especially be addressed when dealing with Big Data, because that is where techniques like ordination come into play.

Big Data

Wikipedia has this to say about Big Data:
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them ... Lately, the term "big data" tends to refer to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data ... Analysis of data sets can find new correlations to spot business trends, prevent diseases, combat crime, and so on.
In business, the use of information from social media is the most obvious source of Big Data. People are often perceived as being much more honest in their online social interactions than they are in formal surveys; and so this relatively recent source of information could potentially be much more useful to modern business practices.

As this infographic indicates, the social media can generate some really big datasets. Making sense of these data involves some pretty serious summarizing of the data. Therefore, the principles that I have discussed above become particularly important — we have to be very careful about how we interpret those summaries, especially if we have summarized the data into groups.

An example from the world of wine

So, let's conclude with a real example from the world of wine buying: the 2016 Digital Wine Report: the Five Tribes of Online Wine Buyers, prepared by Paul Mabray, Ryan Flinn, James Jory and Louis Calli. (Thanks to Bob Henry for getting me a copy of the "Academic edition" of this report.)

This study was produced by a group originally called VinTank, who at the time were a subsidiary of H2O (who subsequently closed them down!). The objective of the report was to combine data about wine drinkers, based on the social media, with data about wine buyers, based on online purchases. This is a perfect example of using Big Data to help businesses understand their customers.

The social data were for 12,500 individuals, based on 183,000 Twitter posts assessed by the TMRW Engine software. The buying data were for 53,000 online wine purchases, recorded by Vin65. So, the report attempts to summarize the wine behavior of people who use both social media to discuss wine and online shopping to purchase wine, in the USA. Clearly, this does not attempt to represent all US wine drinkers and buyers — the people summarized "buy directly from wineries, they are digitally savvy and use both e-commerce and social media, and they like wine more than the casual consumer."

The crux of the report's methodology is this:
Using a methodology built upon the foundations of demographic and psychographic market research techniques, we segmented [= grouped] online wine customers according to their psychographic profiles: including hobbies, preferences, activities, and political outlooks ... We were [then] able to apply this segmentation to purchasing behavior and demographic profile at the individual customer level. As a result we've identified 5 common "tribes" of online wine buyers.
To personalize these five tribes, we've given each one a name, a theme and a personality description.
You can immediately see what I am warning you about here — these five tribes are not real, even though they have names and distinct personalities. The psychographic and demographic characteristics of the people vary continuously, and grouping them is merely a convenient mechanism for data summary.

In order to get a sense of what these groups look like, refer to the colored version of the ordination diagram shown above, where the group centers are different but the boundaries are fuzzy. I have carefully analyzed the data presented in the report, and I can assure you that the five "tribes" really do have different behavioral "centers"; but I would hate to have to assign anyone to one group or another. At a personal level, I can't see myself as being in any of these five tribes.

Part of the problem here is that categorizing people in this manner simply perpetuates cultural stereotypes. In this case we have: Anna, the sophistocrat; Graham, the info geek; Sofia (or Sophia), the digital native; Don, the southern conservative; and Kevin, the trophy hunter. If none of these people sounds like you, then you are probably right.


Big Data are useful, there is no doubt about it. However, big data can potentially have big problems, as well, and we need to guard against the consequences of this. One of the most common ways to summarize Big Data is to assign the study objects to groups, but these groups are not real — they are a conceptual convenience, nothing more. Hopefully, grouping their customers will help businesses provide services to those customers, but that does not mean that the businesses should ignore those people who do not fit neatly into any of their groups.

Monday, 30 October 2017

The anti-social media arrives on the wine-blogging scene!

I have previously been subject to professional plagiarism, as I have described on my other blog:
          Unacknowledged re-use of intellectual property

However, I was not expecting to have my wine blog plagiarized. Sadly, this has now happened, too. There is a Blogspot blog, called "artsforhealthmmu" that, as of today, consists entirely of copies of 19 of my posts from the Wine Gourd, without an iota of reference to me or my blog. Needless to say, I am not going to put a link to that blog here, but you will find it if you search for the name on the web.

From the University Libraries of the University of Tennessee Knoxville.

It is not immediately obvious what the game is, here. That other blog is well designed, and there is currently no advertising or rogue links that I can find, although that may change. Interestingly, there is a genuine site called the "Arts and health blog" that has almost exactly the same web address as the plagiarizing blog. (The latter has an extra "s" in the address.) So, there are other people who seem to be as much a victim of this situation as I am.

I have submitted a request to Blogger to have the offending blog removed, but that will take some time to implement. I am hoping that this will result in a better resolution than happened the last time I was plagiarized (mentioned above), where the offending (well-known) author had no real excuse or apology, and the (well-known) book publisher metaphorically just shrugged his shoulders. A pox on all of them!

Update 4 Nov.:

Notice from Google:
"In accordance with the Digital Millennium Copyright Act, we have completed processing your infringement notice. We are in the process of disabling access to the content in question"

Monday, 23 October 2017

Ranking America’s most popular cheap wines

In a recent article in the Washington Post, Dave McIntyre contemplated 29 of America’s favorite cheap wines, ranked. Here, he looked at some of the mass-market wines available in the USA, and tried to decide which ones might be recommendable by a serious wine writer.

To do this, he "assembled a group of tasters to sample 29 chardonnays, cabernets and sweet red blends that are among the nation’s most popular, plus a few of my favorite and widely available Chilean reds". He then ranked the wines in descending order of preference (reds and whites separately), and provided a few tasting notes.

This is a risky business, not least because the tasting notes tended to be along the lines of "smells of sewer gas" and "boiled potato skins, sauced with rendered cough drops", which might suggest that the exercise was not taken too seriously by the tasting panel. However, a more important point is that the general populace of wine drinkers might not actually agree with the panel.

Scores and ranking of chardonnay wines

The latter point can be evaluated rather easily, by looking at any of those web sites that provide facilities for community comments about wine. I chose CellarTracker for this purpose. I looked up each of the 29 wines, and found all but three of them, the missing ones being some of the bag-in-box wines. For each of the 26 wines, I simply averaged the CellarTracker quality scores for the previous five vintages for which there were scores available (in most cases there were >50 such scores).

I have plotted the results in the two graphs above, where each point represents one of the wines. McIntyre's preference ranking is plotted horizontally, and the average CellarTracker scores are shown vertically. McIntyre's Recommended wines are in green, and the Not Recommended wines are in red.

It seems to me that there is not much relationship between the McIntyre ranking and the CellarTracker users' rankings. In particular, there is no consistent difference in CellarTracker scores between those wines that McIntyre recommends and those that he rejects. In other words, the preferences of the populace and the preferences of the tasting panel have little in common.
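For readers who want to put a number on "not much relationship", one common option is a Spearman rank correlation between the panel's ranking and the community scores. The sketch below is only an illustration: the eight values are made up (they are not the Washington Post or CellarTracker data), and the function names are my own. Note that rank 1 is the panel's favorite, so close agreement would show up as a strongly negative correlation, and a value near zero indicates little relationship.

```python
def ranks(values):
    """Return 1-based ranks (smallest value = rank 1), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over a block of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical illustration data, NOT the real rankings or scores
panel_rank = [1, 2, 3, 4, 5, 6, 7, 8]
community_score = [84.1, 86.0, 83.2, 85.5, 84.8, 86.3, 83.9, 85.0]
rho = spearman(panel_rank, community_score)
print(round(rho, 2))
```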

So, what exactly was the point of the Washington Post exercise? It may be a laudable exercise for wine writers to look at those wines that people actually drink, rather than those drunk by experts (eg. Blake Gray describes it as "outstanding service journalism"). However, we already have CellarTracker to tell us what actual wine drinkers think about the wines; and we have a bunch of specialist web sites that taste and recommend a wide range of inexpensive wines (see my post on Finding inexpensive wines). These sources can be used any time we want; we don't need a bunch of sardonic tasting notes from professionals, as well.

Personally, I would go with the CellarTracker scores, as a source of Recommended wines.

Monday, 16 October 2017

What has happened to the 1983 Port vintage?

One of the most comprehensive sites about vintage port was established in 2011, by three port enthusiasts from southern Sweden. It contains details about every port brand name, and its history (with a few rare exceptions). It also contains tasting notes about nearly every vintage wine produced under those brand names, often tasted more than once; this amounts to well over a thousand tasting notes.

Many of these ports were tasted at meetings of The Danish Port Wine Club (in Copenhagen) or The Wine Society 18% (in Malmö). Among these, there have been a few Great Tastings, during which at least 20 port brands were tasted from a particular vintage. Earlier this year it was the turn of the 1983 vintage.

The 1983 vintage was highly rated in the mid 1980s, and 36 of the houses / brands released a vintage port (ie. almost all producers declared the vintage). A survey of the vintage scores from various commentators reveals this:
Commentators rating out of 100: Tom Stevenson, Wine Advocate, Wine Spectator, Cellar Notes, Into Wine, MacArthur Beverages
Commentators rating out of 10: Berry Bros & Rudd, Oz Clarke, Vintages (LCBO), Wine Society, Passionate About Port
Commentators rating out of 5: Michael Broadbent, For the Love of Port
Scores compared: 1983 vintage; Best recent vintage

So, almost all of the commentators rated the vintage as 90+, but did not rate it as among the very best of the recent Port vintages. The site notes that they also previously rated the 1983 vintage as Outstanding (a score of 18 / 20 = 95).

However, earlier this year, at their Great Tasting, the three port lovers re-tasted 31 of the 36 vintage wines from 1983. They were rather disappointed: "Many wines were defective with volatile acidity, and many wines were not what we had expected ... We have now changed this vintage to the rating Very Good" (score 15 / 20 = 88). This is quite a comedown.

Some of the 1983 vintage ports

It is now 30 years since the 1983 ports were put in their bottles. This is not usually considered to be an especially long time in the life of a vintage port, although most of this vintage has probably been consumed by now. So, what has happened to these wines? It seems unlikely that bottle variation is responsible for the poor results. Maybe the corks in use at the time were not up to the job? Or, maybe the grapes just weren't as good as people thought at the time.

The three port enthusiasts were in general agreement with each other about the scores of the 31 individual wines (although Sten and Stefan agreed more often than Jörgen), so that their group ratings were consistent.

However, it might be better if we base our overall assessment of the wines on the average score from all 16 of the participants in the Great Tasting. They rated 2 of the ports as Excellent (17 points / 20), 4 as Very fine (16 points), 17 as Very good (score 15), 4 as Good (score 14), 1 as Average (score 13), and 3 as Below average. The two best ports were from Quarles Harris and Gould Campbell, while the three worst were from Real Vinicola, Dow's and Kopke.

You can check out the full results on their site. Overall, this is quite an impressive source of information for port aficionados.

Monday, 9 October 2017

Grape harvest dates and the evidence for global warming

I mentioned in my previous post (Statistical variance and global warming) that in Europe there are long-term records of the starting dates for grape harvests, and that this can be used to study changing weather patterns. This is because grape harvest dates are highly correlated with temperature — the warmer the season then the earlier will be the grape harvest. This fact has been much in the news this year, with very hot summer temperatures followed by very early grape harvests in many northern-hemisphere regions.

Written records of harvest dates exist in western Europe because the harvest dates are usually officially decreed, based on the ripeness of the grapes. The grapes are used for wine-making, and this activity has traditionally been under some sort of official control. Thus, we have historical records for many locations over many years.

I have previously shown two long-term datasets for wine-growing regions, one for Two centuries of Bordeaux vintages and one for Three centuries of Rheingau vintages. These both show very large changes in the timing of the start of the grape harvests, especially in recent decades. In this post I will look at some more data.

Map of regions with paleoclimatology data

Much of this data has been compiled into a publicly accessible database archived at the World Data Center for Paleoclimatology (see V. Daux, I. Garcia de Cortazar-Atauri, P. Yiou, I. Chuine, E. Garnier, E. Le Roy Ladurie, O. Mestre, J. Tardaguila. 2012. An open-database of grape harvest dates for climate research: data description and quality assessment. Climate of the Past 8:1403-1418). This database comprises time series for 380 locations (see the map above), mainly from France (93% of the data) as well as from Germany, Switzerland, Italy, Spain and Luxembourg. The series have variable lengths up to 479 years, with the oldest harvest date on record being for 1354 CE in Burgundy.


I have taken the harvest-start data for Burgundy and supplemented it with data from another study (I. Chuine, P. Yiou, N. Viovy, B. Seguin, V. Daux, E. Le Roy Ladurie. 2004. Grape ripening as a past climate indicator. Nature 432:289-290). I have graphed the data below, which shows a complete record of the official start of the grape harvest for every year from 1370 to 2006 CE, inclusive.

The harvest dates are shown relative to the beginning of September (day 0); and the pink line shows the 9-year running average.
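For anyone wanting to reproduce this kind of smoothing, here is a minimal sketch of a running average. It assumes a centered window (four years either side of each year), which is my assumption rather than something stated for the graph, and the harvest days used are made up purely for illustration.

```python
def running_average(values, window=9):
    """Centered running mean; positions without a full window get None."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered average")
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        out[i] = sum(values[i - half:i + half + 1]) / window
    return out


# Hypothetical harvest start days (days after 1 September), NOT the Burgundy data
harvest_day = [28, 25, 31, 22, 27, 35, 30, 24, 26, 29, 33, 21]
smoothed = running_average(harvest_day, window=9)
print(smoothed)
```

The first and last four years have no smoothed value, because a full 9-year window is not available there.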

Harvest start dates for Burgundy 1370-2006

This graph shows some very interesting patterns. First, in spite of the ups and downs in the graph, there is no long-term change — harvest starts have pretty much remained within a 4-week period after September 10.

However, the long-term pattern does show two long cycles, with harvest dates getting progressively later through time and then moving earlier again: the first cycle was from 1370 to 1700, and the second from 1700 to now. Superimposed on these two long cycles, there were smaller 20-year cycles before 1700, and 30-year cycles after that time. For mathematicians, this might be an interesting dataset on which to perform some Fourier time-series analysis.
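As a hint of what such a Fourier analysis might look like, the sketch below computes a simple periodogram using a plain discrete Fourier transform. The series is synthetic (a 20-year cycle added to a 100-year cycle), not the Burgundy record, and the function name is my own. Real harvest data would be noisier, and the cycle lengths may change through time, so this is a starting point at best.

```python
import cmath
import math


def periodogram(series):
    """Return (period, power) pairs from a plain discrete Fourier transform."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]  # remove the average level first
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        result.append((n / k, abs(coeff) ** 2 / n))
    return result


# Synthetic series: 200 "years" with a 20-year cycle plus a slower 100-year cycle
series = [5 * math.sin(2 * math.pi * t / 20) + 8 * math.sin(2 * math.pi * t / 100)
          for t in range(200)]
spectrum = periodogram(series)
top_two = sorted(spectrum, key=lambda p: p[1], reverse=True)[:2]
print(sorted(round(period) for period, _ in top_two))  # periods of the two strongest cycles
```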

For our purposes here, there has been a dramatic change in harvest date in recent years, with the earlier and earlier harvests since 1984 being attributed to global warming. However, there was just as rapid a change in harvest dates from 1420 to 1450, although at that time the harvests became rapidly later, due to cooling of the weather (not warming).

This graph thus illustrates what the climate-change skeptics are on about. There have been recordings of previous large changes in the weather, which have affected European agriculture. In that sense, the current change in the weather is not necessarily unusual. The skeptics suggest that we should continue to "suck it and see", to find out whether the weather turns around and becomes cooler again. However, none of the earlier trends lasted longer than the current one; and, unlike the previous longest trend, there is currently no indication that our recent warming trend will reverse itself.


By way of contrast, we could also look at some shorter harvest trends from elsewhere in the world. The data I have chosen come from Australia (L.B. Webb, P.H. Whetton, E.W.R. Barlow. 2011. Observed trends in winegrape maturity in Australia. Global Change Biology 17:2707-2719).

The longest grape-maturity record presented by these authors is 115 years, for the McLaren Vale region, in South Australia. Unfortunately, the data for 1992-1997, inclusive, are missing, which reduces its utility for studying global warming.

So, instead I will show the graphs for two shorter time-series from central Victoria, one for Shiraz grapes (red) and one for Marsanne (white). Note that the harvest in Australia is in March, not September. These graphs cover 70 years; and the pink lines show the 5-year running average.


The graphs both show relatively short-term cycles in grape maturity, superimposed on longer-term cycles, the same as I noted above for Burgundy. For example, for the Marsanne grapes the shorter cycles seem to be c. 20 years long. Maybe this is a common pattern for wine grapes?

In any case, the move towards earlier harvests is just as obvious in these data as it is in the data for the Burgundy region (and also for Bordeaux and the Rheingau, as shown in my earlier posts). The recent change in agriculture patterns truly is global.

The skeptics

Sadly, Australia is one of the official political homes of climate-change denial. For instance, take this comment by renowned Australian government viticulturist John Gladstones, which is from Wine, Terroir and Climate Change (Wakefield Press, 2011):
How much warming, then, can justly be attributed to anthropogenic greenhouse gases? Taking all evidence into account, the proven amount is: none ... from a viticultural viewpoint we can conclude that any anthropogenic changes to mean temperatures will be small and, for some decades to come, unlikely to have major effects beyond those of natural climate variability.
And yet, here we are, several years later, and we have already reached the limit of climate variability that we have recorded for the past 6+ centuries. How much longer are we expected to suck it and see?

The key word in the climate debate is "prove". We cannot, in the strict sense, ever prove a causal connection between anthropogenic activities and climate change. But, by the same token, we can never prove that the sun will rise tomorrow morning, either. Both cases involve forecasts about the future, and we will only be able to evaluate them in hindsight. By then, of course, it is usually too late, if something has gone amiss.

Monday, 2 October 2017

Statistical variance and global warming

The idea of global warming is a matter of meteorological record, not personal opinion. For the past quarter-century, the world's weather has been very different to the preceding half century, and this has been noted by weather bureaus around the globe.

For example, the town where I live, Uppsala, has one of the longest continuous weather records in the world, starting in 1722. The recording has been carried out by Uppsala University, and the averaged data are available from its Institutionen för geovetenskaper. This graph shows the variation in average yearly temperature during those recordings, as calculated by the Swedish weather bureau (SMHI — Sveriges meteorologiska och hydrologiska institut) — red represents an above-average year and blue below-average. As you can see, below-average years have not been recorded since 1988, which is the longest run of red on record.

Uppsala temperatures over three centuries

The consequences of this weather change have been particularly noted in agriculture, because the growth of plants is very much dependent on sunshine and rainfall — a change in either characteristic will almost certainly lead to changes in harvest quantity and quality, as well as harvest timing. (See Climate change: field reports from leading winemakers. 2016. Journal of Wine Economics 11: 5-47.)

Grape harvests

Grape harvests have been of particular interest for economic reasons, given the importance of the wine industry to many countries. However, they have also been of interest because there are many long-term harvest records for Europe, and so the changing of the harvests in response to weather conditions over several centuries has been recorded, and can be studied. I have discussed this in a post on my other blog — Grape harvest dates as proxies for global warming; and this may be of interest to you, so check it out.

I conclude that we should be in no doubt that the recent change in the weather has had a big effect on wine production, and that we can reasonably expect that this will continue while ever the current weather patterns continue.

What seems to be more contentious, however, is assessing the causes of these weather patterns, and how the people of this planet might respond, if at all. For example, Steve Heimoff, over at the Fermentation blog, has recently discussed this issue (Inconvenient timing for a climate-change heretic).

One of the important issues here is the concept of statistical variance; and this is what I will discuss in the rest of this post.

Statistical variance

"Statistical variance" refers to the variation that occurs due to random processes and stochastic events. For example, we do not expect the average yearly temperature to be the same from year to year, which is what would happen if there was no statistical variance. Instead, we have observed that each year is somewhat different, with some years being above average and some below. In the graph above, the years varied from 8°C below the long-term average to 8°C above the long-term average — these numbers quantify the amount of statistical variance that has occurred in Uppsala's weather over the past three centuries.

We also do not expect regular patterns in the statistical variance. For instance, we should be very surprised if the years always alternated between above-average and below-average temperatures. Instead, we expect runs of several years above or below, without any necessary pattern to how long those runs will be. This is precisely what is shown in the graph above, where there are runs of anywhere from 1 to 9 consecutive years with similar weather.
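This is easy to check by simulation. The sketch below assumes, purely for illustration, that each year is independently above or below average with equal probability (real weather is unlikely to be this simple), and then counts the lengths of the runs that arise by chance alone. The function name and the seed are arbitrary choices of mine.

```python
import random


def run_lengths(flags):
    """Lengths of consecutive runs of identical values."""
    runs = []
    current = 1
    for prev, nxt in zip(flags, flags[1:]):
        if nxt == prev:
            current += 1
        else:
            runs.append(current)
            current = 1
    if flags:
        runs.append(current)
    return runs


rng = random.Random(1722)  # fixed seed so that the example is repeatable
# 300 simulated "years": True = above average, False = below average
years = [rng.random() < 0.5 for _ in range(300)]
runs = run_lengths(years)
print("longest run:", max(runs))
print("mean run length:", round(sum(runs) / len(runs), 2))
```

Under these assumptions the average run is about two years long, and a run of seven or more years somewhere in three centuries would not be surprising.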


Human beings need to understand this concept of statistical variance in order to work out whether anything unusual is happening around them. For example, a business person needs to work out whether a run of several months of poor economic performance is simply statistical variance, or whether it indicates that something has gone wrong (and needs to be corrected). Alternatively, a run of several months of good economic performance may also be simply statistical variance, and not at all an indication that the company is being well run!

This seems to be a very hard thing for people to grasp. Runs of events, whether good or bad, are often interpreted as being non-random; and runs of apparent bad luck can be very depressing, while runs of good luck can lead to over-confidence (which is well known to come before a fall).

This topic has been discussed in a number of books. One of the better known of these is Leonard Mlodinow's 2008 book The Drunkard's Walk: How Randomness Rules Our Lives. This has a detailed discussion of "the role of randomness in everyday events, and the cognitive biases that lead people to misinterpret random events and stochastic processes." His particular message is that people who don't grasp the idea of statistical variance can be led astray by randomness: a run of bad luck does not necessarily make you a failure, nor does a run of good luck necessarily make you a success. You can watch a video presentation by him on YouTube (thanks to Bob Henry for alerting me to this).

Why we should address statistical variance

Like Mlodinow, I am particularly interested in how people respond to statistical variation.

In this context I will mention a simple example, called the Gambler's Fallacy, which is very relevant. Gamblers often think that in a game of 50:50 chance they will break even in the long term, because they will eventually win and lose the same amounts of money. However, mathematicians have shown that, due to statistical variance, this can only be guaranteed in practice if the gambler has the resources to allow for infinite gains and infinite losses (ie. they can sustain an infinitely long winning or losing streak). Such long runs of wins and losses do not result from expertise or lack of it; they will happen anyway, just by random chance. However, in practice, the gambler will stop playing when their bankroll reaches zero (from too many consecutive losses) or when they bust the casino (from too many consecutive wins). So, there is no way to guarantee to break even in the long term, because either you or the casino may go bust before that happens.
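The point about going bust can be illustrated with a small simulation. The sketch below assumes a perfectly fair game with one-unit bets, a player starting with 10 units and a casino with 90 units; these numbers, the function name and the cap on the number of rounds are all arbitrary choices for illustration.

```python
import random


def play_until_bust(player_bankroll, casino_bankroll, rng, max_rounds=1_000_000):
    """Play fair 1-unit bets until one side has no money left.

    Returns "player" or "casino" for the side that went bust,
    or None if the (arbitrary) cap on the number of rounds is reached first.
    """
    for _ in range(max_rounds):
        if player_bankroll == 0:
            return "player"
        if casino_bankroll == 0:
            return "casino"
        if rng.random() < 0.5:  # player wins this bet
            player_bankroll += 1
            casino_bankroll -= 1
        else:                   # casino wins this bet
            player_bankroll -= 1
            casino_bankroll += 1
    return None


rng = random.Random(2017)  # fixed seed so that the example is repeatable
outcomes = [play_until_bust(10, 90, rng) for _ in range(2000)]
player_bust_share = outcomes.count("player") / len(outcomes)
print(round(player_bust_share, 2))
```

For a fair game like this one, probability theory says that the player's chance of going bust first equals the casino's share of the total money (here 90 out of 100), and the simulated share should be close to that. Nobody "breaks even" in any of the games; every one ends with somebody broke.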

The importance of this example is that it emphasizes how we deal with statistical variance, in practice. For example, in practice it does not really matter whether current climate change is a permanent change (perhaps caused by modern industrial societies) or the result of statistical variance. It will affect us either way. Metaphorically, it is just as possible for either us or the casino to go bust due to statistical variance as it is to go bust from any other cause. The practical question is: what are we going to do about it? We are sentient beings (that's what our scientific name Homo sapiens means), and we thus have the ability to recognize what is happening, and to potentially do something about it. We need to decide whether we want to do something, or not.

Several decades ago, when it was pointed out that there was a hole in the ozone layer, a possible cause was identified (use of CFCs), and a potential response was outlined (stop using CFCs, because there are alternatives). We decided to respond, globally; and the latest reports show that the hole is now shrinking. Maybe the increase and decrease in the hole are simply the result of statistical variance; but maybe we are actually smarter than the skeptics think. We keep records (we describe the world), we think about the patterns observed in those records (we explain the world), and we work out how we might respond (we try to forecast the future).

However, being sentient doesn't necessarily make us intelligent. Some people are skeptics because that is how they are built; others are skeptics because they have their own agenda (often to do with them making money, and lots of it), and to hell with everything else. Intelligence requires more than skepticism; and this applies to global warming as much as anything else.

So, the skeptics are right when they point out that rapid changes in long-term weather have occurred before in our recorded history; and I will discuss the data in a future post. However, this fact is irrelevant. Our response cannot be determined by these past patterns, because the current effects are occurring now, irrespective of whether they also occurred back then. Furthermore, the fact that any particular climate change is "natural" doesn't mean that we will be unaffected. We are going to look like complete fools if we (metaphorically) go bankrupt while attributing it to statistical variance. This is like leaping into a deep hole while yelling "look at me, I'm falling!"

Common ways to deal with statistical variance

By way of finishing, I will mention a couple of ways that people have developed to address the effects of statistical variance. You might like to think about whether any of these can be applied to global warming.

The basic idea is to reduce the statistical variance. That is, we try to prevent long runs of positive and negative changes from happening — we reduce the extent of both the upswings and the downswings. Sadly, there is no known way to reduce the negative changes (runs of bad luck) without also reducing the positive changes (runs of good luck).

In economics and gambling, one way to do this is by hedge bets. This involves investing most of our money in one way while simultaneously investing a smaller amount in the opposite way. So, we might bet most of our money on one particular team winning the game while also placing a smaller bet on the other team. This will reduce our losses if we have put most of our money on the losing team (because we will still win the smaller bet), although it will also reduce our winnings if we have put most of our money on the winning team (because we will still lose the smaller bet). So, hedge betting reduces our possible wins and losses; that is, it reduces the statistical variance. In economics, so-called Hedge Funds operate in precisely this manner; and they seem to be quite successful financially.
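The arithmetic of the two-team example can be sketched as follows. This is a deliberately simplified illustration: it assumes even-money odds on both teams, a 50:50 chance of either team winning, and no bookmaker's margin, none of which is likely to hold for a real bet. The function names and stakes are hypothetical.

```python
def hedged_outcomes(main_stake, hedge_stake):
    """Net result of two opposing even-money bets, for each possible winner."""
    if_main_wins = main_stake - hedge_stake    # win the big bet, lose the small one
    if_hedge_wins = hedge_stake - main_stake   # lose the big bet, win the small one
    return if_main_wins, if_hedge_wins


def variance_50_50(outcomes):
    """Variance of the net result when both outcomes are equally likely."""
    mean = sum(outcomes) / len(outcomes)
    return sum((o - mean) ** 2 for o in outcomes) / len(outcomes)


unhedged = hedged_outcomes(100, 0)  # all 100 units on one team
hedged = hedged_outcomes(80, 20)    # 80 units on one team, 20 on the other
print(unhedged, variance_50_50(unhedged))
print(hedged, variance_50_50(hedged))
```

The hedged bet narrows the range of results from plus or minus 100 units to plus or minus 60 units, which is the reduction in variance described above. Under these simplified assumptions the effect is the same as simply staking less money; real hedges rely on differences in odds or prices.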

A somewhat different approach is taken in card games like poker. Professional online poker players usually play hands at multiple tables simultaneously. That is, they are placing multiple bets at the same time. Each table is potentially subject to wide statistical variance, but the average across all of the tables will usually have much less variance. Across any one betting session, the losses at one table will be counter-balanced by the wins at other tables, thus reducing the statistical variance for the poker player. This is an important component of being a professional in any field — the effect of random processes (good luck and bad luck) needs to be minimized.

It might strike you as a bit odd that I am talking about gambling in terms of dealing with statistical variance, but the same principles apply to all circumstances. In practice, a poker player betting at multiple tables is mathematically no different from an insurance company having lots of policy holders — you win some and you lose some, but you will reduce the extremes of winning and losing by being involved in multiple events. Most of our understanding of the mathematics of probability has come from studying both gambling and insurance.
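The effect of spreading the risk over many tables (or many policy holders) can be seen in a small simulation. The sketch assumes that each table is an independent, fair win-or-lose event of one unit, which is a big simplification of real poker or real insurance; the function name, the seed and the number of sessions are arbitrary.

```python
import random
import statistics


def session_result(n_tables, rng):
    """Average result per table for one session.

    Each table is treated as an independent, fair +1 / -1 outcome.
    """
    return sum(rng.choice((-1, 1)) for _ in range(n_tables)) / n_tables


rng = random.Random(42)  # fixed seed so that the example is repeatable
one_table = [session_result(1, rng) for _ in range(5000)]
eight_tables = [session_result(8, rng) for _ in range(5000)]
print(round(statistics.pvariance(one_table), 2))
print(round(statistics.pvariance(eight_tables), 2))
```

With independent tables, the variance of the average result falls in proportion to the number of tables, so eight tables should show roughly one-eighth of the variance of a single table.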

Mind you, often the optimal strategy in poker is to go all-in with a good hand, which means that you will immediately either double your money or go bankrupt. This is not a recommended strategy when dealing with the world as a whole!

Monday, 25 September 2017

Did California wine-tasters agree with the results of the Judgment of Paris?

The short answer is: "No".

In May 1976, Steven Spurrier and Patricia Gallagher organized a wine tasting that has become known as the Judgment of Paris. Here, wines from France were tasted alongside some wines from California, and the latter acquitted themselves very well in the opinions of the tasters.

These sorts of comparative tastings had been conducted before, but mostly within the USA, whereas the Judgment took place in France itself with French judges; and, more importantly, it occurred in conjunction with the US Bicentennial celebrations. It therefore attracted much more media attention than any of the previous tastings. Indeed, it may well be the third most important event in the social and economic history of wine in the USA, after the imposition and then repeal of Prohibition.

However, the results of the Judgment were very variable among the tasters. Hardly any of them agreed closely with each other about the quality scoring of the wines, and especially about which wines were the best among the 10 reds (bordeaux grapes) and the 10 whites (chardonnays). This raises the question as to what other people thought about the relative quality of those same wines, at that same time.

This question is answerable to some extent by looking at the tastings of the Vintners Club, based in San Francisco. This club was formed in 1971 to organize weekly wine tastings (usually 12 wines). This club is still extant, although tastings are now monthly, instead of weekly. For our purposes here, the early tastings are reported in the book Vintners Club: Fourteen Years of Wine Tastings 1973-1987 (edited by Mary-Ellen McNeil-Draper. 1988).

One of the Club's tastings was an attempt to evaluate the results of the Judgment tasting nearly 2 years afterwards (January 1978), as reported in a previous blog post (Was the Judgment of Paris repeatable?). However, the Club also tasted the individual wines before May 1976, usually in comparisons including other California wines (ie. those not chosen by Spurrier for the Judgment). Indeed, the success of the wines at these tastings seems to have played some part in establishing their respective reputations, leading to Spurrier choosing them to take part in the Judgment.

So, we can compare the Judgment wines quite independently of the Judgment itself, but in the same time period. This is an interesting exercise; and it emphasizes the point made at the time by Frank J. Prial (New York Times June 16 1976, p. 39), about the variability of wine assessments: "One would be foolish to take Mr Spurrier's little tasting as definitive."


Here, I will focus on the six California cabernets and six California chardonnays, each of which was tasted at the Vintners Club at least once before and once after the Judgment. All four of the French Bordeaux reds were also tasted at least twice at the Club, but not always both before and after the Judgment; and only one of the four French Burgundy whites was ever tasted at the Club.

Immediately preceding the Judgment, four of the six California chardonnays were tasted at the Chardonnay Taste-Off (February 1976), and four of the six California cabernets were tasted at the Cabernet Sauvignon Taste-Off (March 1976). These comparative Taste-Offs at the Vintners Club are explained in an earlier post (Wine tastings: should we assess wines by quality points or rank order of preference?).

The results of the various tastings are shown in the two graphs. The scores for each wine are the average from those tasters present on each occasion, based on the standard UC Davis scoring system, as used by the Vintners Club. The dates of the tastings are shown relative to the Judgment of Paris (May 24 1976).

In order to facilitate comparisons, the wines are listed in the graph legends in the order of their results at the Judgment itself (ie. highest score to lowest).

Vintners Club tastings of the Judgment cabernets

Among the cabernets, the Stag's Leap, Ridge, and Heitz wines were pretty much equal at every tasting. These wines were consistently rated as superior to the other three red wines, but we should not see any one of these three as being better than the other two. Interestingly, the only occasions on which the Stag's Leap wine was judged to be "best" were at the Judgment itself and again at its re-enactment. Also, note that the results for the Mayacamas wine were rather erratic, especially given its third-place result at the Judgment.

Vintners Club tastings of the Judgment chardonnays

Among the chardonnays, the Chateau Montelena wine did not do well. The Chalone wine consistently scored better than the Montelena, except at the Judgment of Paris itself. Indeed, most of the white wines scored better than the Montelena except at the Judgment and its re-enactment. Also, for the whites it was the David Bruce wine that received particularly erratic results, on at least one occasion performing very well.

Finally, it is worth noting that 9 of the 12 wines received much higher scores at the 1978 re-enactment of the Judgment than they had before the Judgment tasting. It is hard not to see a subjective post-hoc bias in this result.


Clearly, the unique accolades heaped on the Judgment's two "winning" wines were not justified by the Vintners Club comparisons of the 12 wines. The Montelena white, in particular, was usually bested by the Chalone wine; and the Stag's Leap red was never better than those from Ridge or Heitz. This emphasizes the unreliability of single tastings for assessing wines — the outcomes depend too strongly on the circumstances, particularly the tasters present. Furthermore, some wines obviously received very variable assessments, sometimes being rated much more highly than on other occasions — either these wines were showing bottle variation or they were in an unusual style (as has been noted for the Bruce wine).

Monday, 18 September 2017

Getting the question right

I have written quite a few posts in which I analyzed a dataset using some particular mathematical model. Obviously, the model chosen is of some importance here — different models might give different outcomes (although, hopefully not). However, the choice of model is actually determined by the original question being asked of the data — we need to match the question and the appropriate model.

This raises the important issue of getting the question right. This is especially true if we are trying to relate causes and effects. For example: is the causal factor the presence of something, or the absence of something else? Sherlock Holmes is famous for drawing Inspector Gregory's attention to "the curious incident of the dog in the night-time." It turned out that the important thing was that the dog did nothing, under circumstances when a guard dog should clearly have done something. Holmes solved the crime by asking a question about an absence, not a presence.

As an example from the wine world, consider the following graph. It shows the recent time-course of the percentage each of five countries has had of the global wine export market. The data are taken from Kym Anderson & Nanda R. Aryal (2015) Growth and Cycles in Australia’s Wine Industry: a Statistical Compendium, 1843 to 2013, with additions listed by the AAWE.

Global export percentages for the top five countries

We could ask any number of questions about these data. For example, we could ask about the general increase across the five countries since 1990, and whether it can be sustained. However, the most obvious question is likely to be about the time-course pattern for Australia, which seems to be dramatically different to the other four countries. But should that question be about the sudden increase that occurs from 2000 onwards, or the sudden decrease that occurs after 2005? Which pattern do we try to explain?

The second question (which seems to be the one that the Australian wine industry has been asking) asks why the "good times" suddenly crashed in 2005, and what the industry might do about it. The first question, on the other hand, asks why the increase occurred in the first place, assuming that the subsequent decrease is simply a "return to normal" after a short-term aberration.

Let's look at how we might analyze Question 1. This next graph shows the Australian data compared to the average time-course of the other four wine-exporting countries (ie. excluding Australia).

The red line shows a very straightforward increase in export percentage through time. We might treat this line as a possible model of the "expected" pattern of growth, and then try to explain why the pattern for Australia does not fit in with it. This would be one way of answering Question 1. We would apply a mathematical model to the red line, and then see how that model compares to the Australian data.

The next graph shows the fit of a simple polynomial model to the Average data, as indicated by the red dashed line. This model fits the data extremely well, as it accounts for 98% of the variation in the Average data.

We can, of course, now use this model to explore possible forecasts for future export growth. For example, the model forecasts that the Average export percentage will peak at 4.2%, which will occur in c. 2024. This might be a reasonable goal for an exporting country, to capture 4-5% of the market, and to consider themselves to have done well if they exceed this level.
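For readers who want to try this sort of fit themselves, here is a minimal sketch in Python. It makes two loud assumptions: the "simple polynomial" is taken to be a quadratic (the degree is not stated above), and the numbers are a made-up placeholder series, not the Anderson & Aryal data. The names average_pct, peak_year and peak_pct are mine, chosen only for illustration.

```python
import numpy as np

# Hypothetical placeholder series (NOT the real export data): a smooth
# curve plus a small deterministic wobble, used only so the sketch runs.
years = np.arange(1990, 2015)
wobble = 0.03 * np.sin(years)
average_pct = 4.2 - 0.0025 * (years - 2024) ** 2 + wobble

# Measure time from 1990 so the fit stays numerically well conditioned.
t = years - 1990

# Assumption: "simple polynomial" means a quadratic (degree 2).
coeffs = np.polyfit(t, average_pct, deg=2)
fitted = np.polyval(coeffs, t)

# Proportion of variation accounted for by the model (R squared).
ss_res = np.sum((average_pct - fitted) ** 2)
ss_tot = np.sum((average_pct - average_pct.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# A downward-opening quadratic a*t^2 + b*t + c peaks at t = -b / (2a).
a, b, c = coeffs
peak_t = -b / (2 * a)
peak_year = 1990 + peak_t
peak_pct = float(np.polyval(coeffs, peak_t))

print(f"R squared: {r_squared:.3f}")
print(f"Forecast peak: {peak_pct:.2f}% in about {peak_year:.0f}")
```

Note that the peak is an extrapolation beyond the data, so it is much more sensitive to small changes in the fit than the R squared value is.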

More to the point, we can compare this model to the Australia data, as shown in the next graph. The blue dashed line is simply the red dashed line raised by 1.2 percentage points (which is the best fit to the Australia data). This reveals that from 2013 onwards the Australian exports were exactly where we would forecast them to be, based on the 1990-1995 data.
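The vertical shift can be sketched in the same way. When a curve is moved up or down by a constant, the least-squares best shift is simply the mean of the differences between the observed data and the curve. The sketch below assumes that the shift is estimated from an early baseline window only (here 1990-1995, which is my reading, not something stated explicitly), and it again uses invented placeholder numbers, with a temporary "bump" standing in for the short-term aberration.

```python
import numpy as np

years = np.arange(1990, 2015)

# Hypothetical stand-in for the fitted average curve (not the real model).
model_pct = 4.2 - 0.0025 * (years - 2024) ** 2

# Hypothetical Australian series: the model raised by 1.2 points,
# plus a temporary bump centred on 2004.
bump = 3.0 * np.exp(-((years - 2004) / 3.0) ** 2)
australia_pct = model_pct + 1.2 + bump


def best_offset(observed, model, mask):
    """Least-squares constant shift: the mean residual over the masked years."""
    return float(np.mean(observed[mask] - model[mask]))


# Assumed fitting window: only the early years, before the aberration.
baseline = years <= 1995
offset = best_offset(australia_pct, model_pct, baseline)

# Compare the shifted model to the data in every year.
shifted = model_pct + offset
gap = australia_pct - shifted

print(f"Best-fit shift: {offset:.2f} percentage points")
print(f"Gap in 2004: {gap[years == 2004][0]:.2f}")
print(f"Gap in 2014: {gap[years == 2014][0]:.2f}")
```

A large gap in the middle years combined with a gap near zero at both ends is the pattern that would be read as a "return to normal".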

So, answering Question 1 would be quite a reasonable way to tackle these data. The data do support the idea that the decrease in Australian export percentage may well be simply a return to "normal" after a short-term aberration. The downwards trend can be seen, not as a crisis, but merely as a correction. These are two quite different interpretations.

Getting the question right is crucial. Data analysis often suffers from what is called confirmation bias, in which we simply try to confirm the assumed answer to our initial question. That is, we look for what the dog did in the night-time, instead of looking for what it did not do — and we often find something that the dog did, no matter how irrelevant it may be!

Monday, 11 September 2017

Why lionize winemakers but not viticulturists?

It is widely noted that viticulturists can have as much influence on the quality of the final wine as do winemakers, and yet it is still the winemakers whose names are most widely known, because they are the ones who most commonly appear in the wine press. So, the people in the winery get the media attention more than those in the vineyard, even though the location of that vineyard is acknowledged to be of prime importance.

To counteract this trend, in this post I discuss one example, from Australia, where the viticulturist often gets almost as much press as the winemaker.

Wynns Coonawarra Estate is by far the biggest winery in the Coonawarra region of Australia, a region that has an international reputation for the quality of its cabernet sauvignon wines (although the shiraz wines are not too shabby, either). Wynns consistently project three people as being their "team", as listed in the first photo below. [Note: Ben Harris, the Vineyard Manager, tends to go missing from most of the press; see the photo at the bottom of the post.]

What is more important for our purposes here, the media actively go along with Wynns' attitude. I have listed a few press reports at the end of this post, as a small sample of what the wine media have to say. The two people titled "winemaker" do get more press than the viticulturist, although much of their personal press does tend to emphasize them as females in a male-dominated profession. Indeed, Sue Hodder and Sarah Pidgeon were jointly named the 'Winemaker of the Year' at the 2016 Australian Society of Viticulture and Oenology (ASVO) Awards for Excellence.

However, back in 2010, when Sue Hodder was named Australian Gourmet Traveller WINE’s 'Winemaker of the Year', a new award was introduced for Allen Jenkins: 'Viticulturist of the Year'. Part of the reason for acknowledging the importance of the viticulturist at Wynns has been his role in rejuvenating the vineyards over the past 15 years, and the clear effect that this has had on the quality of the wines.

L to R: Sarah Pidgeon (Winemaker), Sue Hodder (Chief Winemaker), Allen Jenkins (Regional Vineyard Manager)

The rejuvenation program

Sue Hodder joined Wynns just prior to the 1993 vintage; and she was then appointed Chief Winemaker in 1998, at which point Sarah Pidgeon became Winemaker. Allen Jenkins arrived as the viticulturist in 2001-2, at least partly because Hodder and Pidgeon had realized that the vineyards needed extensive treatment, if the wines were to be improved.

For example, during the 1990s it was noted that the vines were building up too much dead wood, as a result of 20 years of (minimal) mechanical pruning. Indeed, the vines were reported to be so low yielding that they were hard to pick. The rejuvenation started in 2000, and was accelerated in 2002. It was expected to take eight years to complete; and the change in the wines was reported widely in the media starting from 2010.

The process involved large-scale vine regeneration by heavy chainsaw pruning of very old vines (shiraz up to 120 years old, cabernet sauvignon up to 60 years old), removing the dense clusters of dead wood, and thus bringing the vines back to a new physiological balance. Tired or diseased vines were grubbed out, along with the removal of lesser varieties. These were all replaced by new clones and rootstocks of cabernet and shiraz, for which the winery developed a heritage nursery, based on cuttings from time-proven vines. There was re-trellising, along with changed canopy management and new pruning techniques. The vineyards were also converted from sprinkler to drip irrigation.

Along with all of this, the winery was also modified to focus more on small-batch vinification, from 2008 onwards. This allows the grapes to be picked at perfect physiological ripeness, as even a large vineyard block can now be processed in many small batches instead of a few large ones. This takes advantage of the increased grape quality in the vineyard. The oak maturation of the wines has also been re-visited, resulting in a lighter handling, which now produces softer, more elegant wines. Indeed, the latter approach is a return to the style from the 1960s, rather than the heavier style favored in the 1980s and 1990s.

The flagship Wynns wines are the John Riddoch Cabernet Sauvignon and the Michael Shiraz, which are made only in years when grapes of very high quality are available. Production was stopped on both of these wines during the 2000-2002 part of the rejuvenation period. So, to see the effects of the rejuvenation on the quality of the Wynns wines, we need to look at a different product from the winery.

Black Label Cabernet Sauvignon

Within the Wynns range, the Black Label Cabernet Sauvignon holds a special place, even though it is marketed as the "basic" wine from the winery, with an average annual production of roughly 40,000 cases. The wine is currently blended from about 20 different small parcels of grapes, out of up to 80 that are contenders each year. The vines were planted mainly in the 1960s, 70s and 80s.

The flagship John Riddoch cabernet is always denser, more powerful and oaky than the cheaper Black Label, but the latter is always better value for money, selling for less than one third of the top wine's price (and often being aggressively discounted by retailers). Indeed, it has been repeatedly shown that the Black Label can age for decades, making it "possibly the most important cellaring wine in Australia", and forming "the backbone of many Australian cellars for over 50 years". This makes it "one of the most important wines in Australia’s wine history". Myself, I think that it is the best value-for-money cabernet wine that Australia produces.

There have been a number of retrospective tastings of this wine organized by Wynns, which go all the way back to the first vintage, in 1954. For example, there was an important vertical tasting covering the 50 years from 1954-2004, which Hodder has described as the catalyst for the winemakers changing the style away from the heavier style of the 1990s. There was also a 60-year vertical tasting earlier this year.

Average Wine-Searcher scores for Wynns Black Label Cabernet

However, the published reports from these tastings are somewhat sporadic. So, for an evaluation of the effects of the vineyard rejuvenation it will be simpler to cover a shorter period. The graph above shows the weighted average scores from the Wine-Searcher database, covering the vintages from 1990 (ten years before the rejuvenation started) to 2014, inclusive.

Note that for almost every year since 2004 the wine has been scored 91 or higher, whereas before that 91 was the rare top score. There is no doubt that the wine is in the best form it’s been in for years. And the viticultural team can take most of the credit.

Buy yourself a bottle. Put it away for ten years. Then drink it. You will see what I mean about value for money.


Who says New World wines don't develop? — Michael Apstein

Gourmet Traveller | Viticulturist of the year — Susanne Bell

Who dares Wynns — James Halliday

Wynns Coonawarra: a revolution many years in the making — James Halliday

Wynns wine legend turns 60 — Huon Hooke

A 17-year winemaking partnership — Cathy Howard

Interview with Sue Hodder — Jeannie Cho Lee

Wynns unleashes Coonawarra’s diversity — Chris Shanahan

Wynns Coonawarra — great winemaking but the marketing sucks — Chris Shanahan

How Sue Hodder’s history lesson improved Wynns’ Coonawarra reds — Chris Shanahan

Profile: Sue Hodder — Tyson Stelzer