Monday, November 30, 2020

Are CellarTracker wine scores meaningful?

I have written before about the quality points awarded to wine by many critics (Quality scores changed the wine industry, and created confusion). This sort of thing is not unusual in the modern world, where just about every human endeavor is rated by someone, somewhere; everything from restaurants and hotels to books, music and holidays. The only things that are not rated are the customers, which may explain a lot of the problems.

All of these scores are assigned by individuals, and thus represent a single opinion, whether by an expert or not. However, particularly in the modern world, there are now commercial groups that aggregate what might loosely be called user ratings, to provide some sort of consensus score. The important word there is "consensus", which is a fairly nebulous concept, but which needs to be made concrete if a combined score is going to be reported.

This is the topic of this post. CellarTracker, as but one example, aggregates scores provided by wine drinkers, and provides a consensus score. So, the blog post title translates as: Does the concept of a consensus rating make much sense for wines?


There are many types of ratings, and also many types of consensuses. Ratings can be quite detailed, such as scores out of 100 or 20, or relatively simple, such as 3 or 5 stars, or even binary, such as approve vs. disapprove. How to then get a quantitative consensus out of these ratings is the step being examined here.

For the third type of rating, the usual consensus is simply to report the percentage of raters who approve (or disapprove). For example, in the world of politics, elections are reported as the percentage of voters who chose each candidate, and a "winner" is then declared. Does this approach actually make sense? After all, in a three-candidate race, a winner can be elected even if more than a half of the people disapprove of them. Indeed, in situations where voting is not compulsory, it is commonly reported that only c. 40% of the people actually vote, which does suggest that 60% of people think that the consensus approach is nonsense. It is tempting to see this majority of the people as being the sensible group.

For more complex situations, the consensus is usually reported as some sort of mathematical "average" score. How do we go about combining ratings into a consensus? Does it even make sense to combine user opinions, or expert judgements, in this way? What on earth does such a mathematical consensus represent?

Would I really proceed to work out what color is the sky by trying to combine people's opinions on the matter? Would that really make the sky the same as the calculated consensus color? Or would the consensus be some sort of wishy-washy middle result that does not represent any sky ever observed?

Obviously, I could work out the light's wavelength on the standard color scale, and make lots of such measurements, and then calculate their average. That might make some sense. But that is not what we are doing when we average people's opinions about wines, or about almost anything else.

I think that a reasonable case can be made that individual opinions about wines do matter, but a consensus of those opinions does not mean much at all. These aggregation sites are likely to be wasting our time, as well as their own.


Leaving aside these philosophical questions, we could, of course, proceed by simply calculating an average rating of whatever product we are interested in. Mathematically, there are actually three quite different ways to calculate an average: the mean, the median, and the mode. You are all familiar with the mean, since it is far and away the most common calculation method when people refer to an average.

However, the median (which is the middle value when the scores are arranged in increasing order) makes much more sense as a consensus. This situation is explicitly acknowledged in economics, where an average salary, for example, is calculated as the median. This takes into account the obvious bias introduced by billionaires and the like. If Bill Gates walks into the room, then the mean salary in the room would make us all millionaires, but the median would not change much, if at all.
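The effect of a single extreme value is easy to check numerically. Here is a minimal sketch in Python, using made-up salaries (none of these numbers come from any real dataset) plus one hypothetical billionaire-scale income:

```python
from statistics import mean, median

# Hypothetical annual salaries (in dollars) for ten people in a room.
salaries = [42_000, 48_000, 51_000, 55_000, 58_000,
            61_000, 64_000, 70_000, 75_000, 90_000]

# The same room after one hypothetical billionaire-scale income walks in.
with_outlier = salaries + [1_000_000_000]

print("Before:", mean(salaries), median(salaries))
print("After: ", mean(with_outlier), median(with_outlier))
```

In this invented example the mean jumps from tens of thousands to tens of millions, while the median moves only from the midpoint of the fifth and sixth salaries to the sixth salary itself.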

Is the median used by sites like Amazon and CellarTracker, who provide us with consensus ratings? Not usually, although CellarTracker actually does (as illustrated above). We usually get the mean, instead, which has many odd mathematical characteristics.

The mode can also make much more sense than the mean, as a consensus score. This is defined as the most commonly reported score among the ratings being collated. The problem here is that you need a lot of data to actually calculate the mode; and so we rarely see it, in practice.
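To see how the three averages can disagree, here is a small sketch with a hypothetical, deliberately polarized set of star ratings (the numbers are invented for illustration). It uses `multimode` from Python's standard `statistics` module, which returns every value tied for most common:

```python
from collections import Counter
from statistics import mean, median, multimode

# Hypothetical star ratings: half the raters loved it, half hated it.
ratings = [5, 5, 5, 5, 5, 1, 1, 1, 1, 1]

counts = Counter(ratings)        # how many raters gave each score
avg_mean = mean(ratings)         # arithmetic mean
avg_median = median(ratings)     # middle of the sorted scores
modes = multimode(ratings)       # all of the most common scores

print("Counts:", dict(counts))
print("Mean:", avg_mean, "Median:", avg_median, "Modes:", modes)
```

With these invented ratings, both the mean and the median land on 3 stars, a score that nobody actually gave, while the modes are 1 and 5.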


For ratings using a small number of stars or the like, a better approach would be to simply report all of the individual ratings, and that is what many aggregators actually do. This focuses attention on the individual ratings, not on their consensus.

However, the individual ratings only make much sense if they are accompanied by written comments. Sadly, even then we may not get much information. After all, we all know what a 5-star review is going to say — the person loved the holiday, or whatever else they are rating. We also know what a 1-star review is going to say, although, rather bizarrely, this will combine the true 1-star ratings with the 0-star ratings — why do we live in a world that pools these two quite different things together?

So, the real information resides in the comments from the 3-star and 4-star reviews, where something was not quite right, and we need to work out whether this particular thing would affect us or not. Someone else's dislike might not be our own, after all.

I well remember once trying to find a place for my wife and me to stay in Sicily. One place I looked at had an average rating (3 stars), but when I looked at the comments I realized that this was a perfect example of why a consensus rating means nothing. Half the reviews said that the place was great and the other half said to avoid it completely. So, the average of "good" and "poor" is (mathematically) "okay". Not likely, I say.

After reading the user comments, I realized that the issue was that the good reviews all occurred when the manager was on site, and the poor reviews occurred when the manager was absent — when the cat is away ...

Anyway, we did not stay there, because I could hardly ring up and ask if the manager was going to be there, just for my holiday. So, we chose to stay somewhere else (shown above); and it turned out to be one of the most memorable experiences of our trip. We were the only people there that night; and, in Sicily, mamma's home cooking leaves every Michelin starred restaurant for dead. I can still remember it all.

So, what is the problem here? Well, the words in most short wine reviews are piffle. Indeed, winespeak is frequently laughable. User comments are more useful than points, except in the wine industry, which is rather a sad thing to say. The most useful information the reviews could provide is what food to drink the wine with, and when — and we do not get even that information, very often. Whether the wine is value for money would also be useful, but that information is even rarer (Calculating value for money wines).

So, what practical use, then, is CellarTracker's score?

Monday, November 23, 2020

France's changing vineyards

In a recent post, I discussed Germany's changing vineyards, as an example of an Old World wine-producing area that has changed considerably since 1960, in terms of both vineyard area and grape varietal composition. Previously, I had also discussed Australia's changing vineyards, as an example of a New World wine-producing area. In this third part of the series, I briefly look at France, as another example of an old-established vineyard area.

As before, the data come from the Global megafile, National 0920 spreadsheet file of:
Anderson, K. and S. Nelgen (2020) Which Winegrape Varieties are Grown Where? A Global Empirical Picture (revised edition). University of Adelaide Press: Adelaide.
These data consist of the estimated growing areas for each of 263 grape varieties, for each decade or so from 1960 to now — 1958, 1968, 1979, 1988, 2000, 2010, 2016. Sadly, the data for the first four years are severely limited. The data totals appear to be complete, but there are individual data for only the same 31 varieties for each of these years.

That is, most of the varieties are simply lumped together under "Other red varieties" and "Other white varieties", which account for 25–65% of the vineyard area (this varies between the four years). This clearly affects my ability to analyze at least 23 other varieties (all of these have high values in the final three years); and it also seems to affect at least another 25 varieties, but to a lesser extent.

France's changing vineyard area

So, I cannot perform all of the analyses that appear in my previous two posts. In particular, there can be no analysis (except for the three final years) of the Diversity Index, or of the Network of grape varieties. This is why I presented the analysis of Germany first — I could at least do it properly.

However, we can look at two aspects of the time-series data, as shown by the two graphs.

First, note (in the first graph) that there has been a continual decrease in total vineyard area in France over the past 60 years. This amounts to 635,000 hectares, which is 44% of the area covered in 1958. Most of this decrease occurred before 1990 (37 of the 44 percentage points), but the decline has slowly continued since then. This slow recent decline matches the slow decline of the German vineyard area over the same period.
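The figures in that paragraph imply the starting and current areas, which can be back-calculated. This is only rough arithmetic from the rounded percentages quoted above, not values read from the spreadsheet, and it assumes that "37" refers to 37% of the 1958 area:

```python
# Figures quoted in the text (the percentages are rounded).
decrease_ha = 635_000        # total decrease, in hectares
decrease_fraction = 0.44     # decrease as a fraction of the 1958 area
pre_1990_fraction = 0.37     # assumed: part of the 1958 area lost before 1990

# Implied areas, approximate because of the rounding above.
area_1958 = decrease_ha / decrease_fraction
area_now = area_1958 - decrease_ha
lost_before_1990 = area_1958 * pre_1990_fraction
lost_since_1990 = decrease_ha - lost_before_1990

print(f"Implied 1958 area:    {area_1958:,.0f} ha")
print(f"Implied current area: {area_now:,.0f} ha")
print(f"Lost before 1990:     {lost_before_1990:,.0f} ha")
print(f"Lost since 1990:      {lost_since_1990:,.0f} ha")
```

On these rounded inputs the 1958 area comes out at roughly 1.44 million hectares, with about 0.8 million remaining.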

Overall, this is a massive decrease, especially compared to other countries in Europe, and I have not seen it commented upon much in the media. This decrease has, as far as I know, been mainly among the so-called "country wines", rather than occurring in the prestige vineyard areas. There is plenty of bulk wine in the world, but the French staunchly maintain that their premium areas are unmatched elsewhere.

France's changing vineyard composition

Second, we can note (in the second graph) a change in the composition of the varieties through time. The national vineyard area is dominated by red varieties (the red line), as we might expect, although this can vary a lot locally. However, there was a slow increase in the reds until 1990 (and a corresponding decrease among the whites), followed by an on-going reversal since then. Indeed, the whites now occupy a greater percentage of the area than they have at any time in the past 60 years (34%). Perhaps the French are slowly coming to value their white wines, as has the rest of the world?

It is sad that the data do not allow me to analyze this latter trend in any more detail.

Monday, November 16, 2020

How close are repeated wine-quality scores?

This year has been awkward for the purchase of wine, as many people have commented in the media. We can all trust a global pandemic to disrupt international, national and local events.

Here in Sweden, our national liquor chain (Systembolaget) has had to change the way it releases new wines, in order to maintain social distancing among their staff in the main warehouse. This has meant that the wine release schedule has been drastically changed from the previous steady state.

In turn, this change has affected the wine commentators, as well as the wine customers. In particular, the commentators have almost all had problems at some time during this year, tasting new wines and publishing their subsequent quality assessments. For example, neither BKWine nor Vinbanken published their usual wine-quality scores for the new releases during May and June (I have used these score sources in previous posts; eg. Are wine scores from different reviewers correlated with each other?).


While compiling this year's new-release scores for an upcoming post, I noticed that for the BKWine reports a few of the wines appeared more than once (ie. in the reports for different months). Moreover, some of these repeated wines did not receive the same scores. This situation allows us to comment quantitatively on the repeatability of wine scores from the same person.

This is a topic that I have commented on before, notably in my post on: The sources of wine quality-score variation. In that post, I briefly discussed a dataset from Rusty Gaffney, who re-tasted 21 Pinot noir wines 16-26 months after their first tasting.

Well, in the current case, we have a much shorter period of time than that; and this gives us a much better insight into the process of scoring wines, which is, after all, rather subjective. The August and September BKWine commentaries were published much later than usual, in the middle of the month rather than at the beginning. This presumably reflects pandemic-induced problems, which led to what is presumably an unintended mix-up. The same person was responsible for all of the actual wine scores (Jack Jakobsson).

This graph shows us the scores for those 16 wines that were repeated in both the August and September wine commentaries. Note that the scores have a maximum of 20 points (not 100).
Repeated wine-quality scores from BKWine


Only 4 of the 16 wines have the same score on both occasions; but 12 of them are within half a point (the smallest possible difference). However, 3 of the wines have a difference of 1 point; and 1 wine differs by 1.5 points. Nine of the wines had an increased score on the second occasion, while only 3 decreased. Circa 39% of the variation in scores is shared between the two occasions.
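For anyone wanting to run this sort of comparison on their own paired scores, here is a minimal sketch. The 16 score pairs are invented (they are not the BKWine data, though they are arranged to give the same counts as above), and the sketch assumes that "shared variation" means the squared Pearson correlation:

```python
from math import sqrt

# Hypothetical 20-point scores for 16 wines tasted twice (NOT the BKWine data).
august = [14.0, 15.0, 15.5, 16.0, 14.0, 14.5, 15.0, 15.5,
          16.0, 14.5, 15.0, 16.0, 14.0, 15.0, 16.5, 14.5]
september = [14.0, 15.0, 15.5, 16.0, 14.5, 15.0, 15.5, 16.0,
             16.5, 15.0, 14.5, 15.5, 15.0, 16.0, 15.5, 16.0]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Second score minus first score, for each wine.
diffs = [s - a for a, s in zip(august, september)]

same = sum(d == 0 for d in diffs)
within_half = sum(abs(d) <= 0.5 for d in diffs)
increased = sum(d > 0 for d in diffs)
decreased = sum(d < 0 for d in diffs)

# Assumed definition of "shared variation": r squared.
shared = pearson_r(august, september) ** 2

print(f"Same: {same}, within half a point: {within_half}")
print(f"Increased: {increased}, decreased: {decreased}")
print(f"Shared variation (r squared): {shared:.0%}")
```

The shared-variation figure for these invented scores will not match the 39% reported above; only the tallies were matched.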

The differences in scores are somewhat disappointing. Although the similarities are much better than would be expected from random chance (p<0.01), we are still faced with a situation where the differences are slightly bigger than the similarities.

However, this situation is vastly better than the previous one that I reported (see above), where only 6% of the variation in scores was shared between the two occasions (which were much further apart in time, of course). Tastings close together in time are expected to be more consistent; so we at least get that.

Monday, November 9, 2020

A problem for wine-makers in Sweden?

We have all heard that cellar-door tastings have been seriously affected by the current Covid-19 pandemic. This has caused especial consternation among wine-makers who rely on direct-to-customer sales. It has also, of course, annoyed those who wish to partake in wine tourism.

However, this has not been the slightest problem here in Sweden. This is because there is no such thing as cellar-door sales. As I have noted before (Why are there wine monopolies in Scandinavia?), all retail sales in Sweden must go through a single liquor chain, called Systembolaget. This means that, while a winery might be able to conduct tastings, they have to send potential customers down the road to the nearest shop.

Now, obviously, this annoys the pants off the wine-makers. As a personal perspective on the current situation, below is a translation of a recent article about this, from my local newspaper, Upsala Nya Tidning (published 2020-10-25, on p. 14). The original article is called Vinodlaren om förbudet: "Skadar landsbygden", about one of my two local grape-growers (in the province of Uppland). The grape-grower involved is pictured below next to his vineyard (but that is the neighbor's mansion in the background).

Björn Wollentz, and Lillhassla vineyard

In fact, objections to this situation have been raised for a long time. For example, back on 11 September 2012 there was a national media report headed Wine growers rebel (Vinodlarna gör uppror), about the same topic:
The Swedish wine growers are on the warpath. If there is no decision on farm sales, they plan to defy the law and sell anyway ... It was at the annual meeting during the weekend that the Swedish wine growers decided to investigate the possibility of opening their farms for sale despite the ban ... The wine growers also intend to contact microbreweries and fruit wine sellers to try to get them involved in the protest.
The question of whether Swedish wine growers should have the opportunity to sell wine on the farm has been discussed since the 1990s. The Riksdag [national parliament] has at times had the matter up for consideration and in 2007 appointed a special inquiry which came to the conclusion that the sale should be allowed. But the objections have been many ...
... the purpose of the action is to bring about a judicial review that can go to the European Court of Justice. There, there may be an examination of the entire Swedish sales monopoly that exists through Systembolaget.
This situation has not been changed, until recently. The European Union is unlikely to have an official position on the matter, as this is solely a national affair. This EU attitude differs importantly from their 2007 decision against Systembolaget, which concerned transport of alcohol among EU member countries (not sales within a single country). In the 2007 case, a single Swede (Klas Rosengren) simply tried to import some Spanish wine from a Danish retailer, knowing full well that this action would result in a court prosecution, and thus get the situation reviewed. This action was ultimately successful in getting things changed (see the final EU Court of Justice report).

A current possibility of a change in the situation for farm sales in Sweden has actually made it into a recent article in The Economist (The state's grip on grape and grain: a proposal to water down Sweden’s state monopoly on booze). Sweden currently has a minority government (which is not unusual), based on a group of several political parties having a loose agreement among themselves (none of this two-party system for Swedes, like some other countries have!). In this case, the Economist article notes that the moderate Centerpartiet (Center Party) has negotiated a consultation about the introduction of farm sales, in exchange for their parliamentary support of the minority government.

We shall see what happens with this consultation, which will take at least a couple of years. Mind you, wine-making in Sweden is not yet ready to take on the wine-makers of the rest of the world. The 500 SEK bottle of Swedish wine referred to below would cost less than half of that if it came from anywhere else.


The winegrower about the ban: "Damages the countryside"

By Jennifer Berg Eidebo.

Two years ago, Björn Wollentz started a winery in Häggeby. In the future, he wants to sell sparkling quality wines, but he is afraid that the regulations regarding alcohol sales will put a spoke in his wheel.

Lillhassla winery is located in Häggeby, next to Ekoln. There are 2,500 vines that will become sparkling wines for around 500 SEK [US$55] per bottle. There are still a few years left until the wine will be ready for sale, but already now the wine grower has encountered problems with Swedish bureaucracy.

"Everything from writing the place on the label to being able to offer farm sales takes place naturally in other countries, but not in Swedish bureaucracy," says winemaker Björn Wollentz.

It is, above all, the rules regarding farm sales that create problems.

"As I understand it, you can offer tasting, but if someone wants to buy a bottle and take it home, it's no go."

Swedish producers of alcohol who want to sell their products may do so via Systembolaget or restaurants with an alcohol license. But Wollentz wants to be able to sell directly from the vineyard, preferably in collaboration with other local producers.

"Being able to offer wine tasting with bread from local bakeries and cheese or jam from local producers, would create a synergy. It should be in the public interest, so that the countryside does not become depopulated."

Wollentz does not plan to live on his wine production, but works day-to-day at the Ministry of Defense. He has also not dared to take out a loan, but is responsible for all of the investments himself.

"I am very worried about what spokes the regulations will put in my wheel."

Wollentz thinks that farm sales should be allowed, without harming public health. Another one who is on the same track is Björn-Owe Björk (Christian Democrat political party), responsible for regional development in Uppsala County.

"I think farm sales should be allowed for wineries or breweries. It would be good for the countryside and small businesses."

The issue of farm sales has been raised several times over the years. In 2020, the issue will be investigated again.

"It is positive that the government will investigate it again, because it means that somewhere the ambition exists to bring about a change," says Björk.

Wollentz is confident that he will be able to find solutions for his wines, even if farm sales are not allowed in the future. But he believes it will harm the countryside.

"We will get sales for our wines, but there will be no synergy in the countryside. This is still Uppland's first winery, at least since the Viking Age [official end: 1100 CE]," says Björn Wollentz.

Monday, November 2, 2020

Germany's changing vineyards

In a recent post, I discussed Australia's changing vineyards, as an example of a New World wine-producing area that has changed considerably since 1960, in terms of both vineyard area and grape varietal composition. [Note: one late part of that post has since been modified, due to an error in calculations.] In this second part of the series, I look at Germany, as an example of an old-established vineyard area.

As before, the data come from the Global megafile, National 0920 spreadsheet file of:
Anderson, K. and S. Nelgen (2020) Which Winegrape Varieties are Grown Where? A Global Empirical Picture (revised edition). University of Adelaide Press: Adelaide.
These data consist of the estimated growing areas for each of 107 grape varieties, for each decade or so from 1960 to now: 1964, 1972, 1979, 1991, 2000, 2010, 2016. Sadly, some of the data are reported to be missing for the first four years. However, even more obvious is that there are no data for the Gewürztraminer variety in either 2000 or 2010, even though there are hundreds of hectares reported for the other years; I had to mathematically interpolate the missing data, in order to do my calculations.
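The interpolation method is not specified here, so the following is only a sketch of one simple possibility: straight-line interpolation between the nearest years that do have data. The hectare values are invented placeholders, not the Anderson and Nelgen figures.

```python
def interpolate(x0, y0, x1, y1, x):
    """Straight-line interpolation between (x0, y0) and (x1, y1) at x."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Hypothetical Gewürztraminer areas (ha) for the years with data either side of the gap.
known = {1991: 800.0, 2016: 1000.0}

# Fill the two missing years from the straight line joining 1991 and 2016.
filled = {
    year: interpolate(1991, known[1991], 2016, known[2016], year)
    for year in (2000, 2010)
}

for year, area in filled.items():
    print(f"{year}: {area:.0f} ha (interpolated)")
```

Other choices, such as interpolating the percentage share rather than the raw area, would be just as defensible.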

We can start to look at the time-series data by noting that there was a continual increase in total vineyard area in Germany until 1990, as shown in the first graph (click to enlarge). This coincided with the rise in global popularity of German wines, the whites in particular — if there are customers, then the producers will provide the required product. Since 1990 there has been no further increase, and even a slight (<10%) decrease in recent years.
Changes in Germany's vineyards since 1960
As expected there has been a steady increase in the number of varieties reported, as well, although this may at least partly reflect better data collection in more recent years. In 2016, four times as many varieties were reported compared to 1964 (103 versus 24).

Of more interest, though, is the varietal make-up of these grapes. As in my previous post on grape varieties, I have used a network as a convenient pictorial summary of the complex varietal data. As before, I first calculated the percent of each grape variety as a component of the viticultural area, and then calculated the similarity among pairs of years using the Bray-Curtis measure (important, because it ignores varieties that do not occur in either of the two years being compared). I then used these similarities to calculate a NeighborNet network, as shown next. Years that are closely connected in the network (along the edges) are more similar to each other, in terms of their grape varietal composition, than they are to years further apart.
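The Bray-Curtis step can be sketched in a few lines. The percentages below are invented for illustration (they are not the German data), and the function name is my own; the formula is the standard one, with similarity taken as one minus the sum of absolute differences divided by the sum of all values.

```python
def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity between two {variety: percent} compositions."""
    varieties = set(a) | set(b)
    # A variety absent from both years adds 0 to both sums, so it has no effect.
    diff = sum(abs(a.get(v, 0.0) - b.get(v, 0.0)) for v in varieties)
    total = sum(a.get(v, 0.0) + b.get(v, 0.0) for v in varieties)
    return 1.0 - diff / total

# Hypothetical percent-of-area compositions for two years (each sums to 100).
year_a = {"Riesling": 25.0, "Silvaner": 30.0, "Müller-Thurgau": 20.0, "Other": 25.0}
year_b = {"Riesling": 22.0, "Silvaner": 5.0, "Müller-Thurgau": 13.0,
          "Pinot Noir": 12.0, "Dornfelder": 8.0, "Other": 40.0}

similarity = bray_curtis_similarity(year_a, year_b)
print(f"Bray-Curtis similarity: {similarity:.2f}")

# Adding a variety that is absent (0%) in both years leaves the result unchanged.
padded_a = {**year_a, "Regent": 0.0}
padded_b = {**year_b, "Regent": 0.0}
similarity_padded = bray_curtis_similarity(padded_a, padded_b)
```

The NeighborNet step needs dedicated network software and is not attempted here.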

Network of vineyard composition changes in Germany

The basic picture is pretty clear. There was a continual change in the varietal make-up of Germany's vineyards from 1964 (at the top-right of the graph) until recently (2010 and 2016, at the bottom). The years are not especially different from each other in the network, except 1964, and their similarities to the general pattern are most prominent. The biggest change in varietal composition occurred between 2000 and 2010, although this was not due to much of an increase in the actual number of varieties (67 and 75, respectively). Between 2010 and 2016 there was the addition of c. 20 new varieties, mostly at 1-2 ha, although Cabernet Carol and Cabernet Cortis had 34 and 59 ha, respectively. Obvious candidates like Grüner Veltliner (so abundant in neighboring Austria) also arrived (14 ha in 2016).

Since Germany is famous for its white wines, we might expect white-grape varieties to dominate; and this is indeed so, as shown in the second graph above. However, even more obvious is the rise of red-grape varieties since 1980. These particularly include Dornfelder (now 8,000 ha versus nothing in the 1960s) and Pinot Noir (11,500 ha versus 1,800), but also Blaufränkisch (1,700 ha versus 400) and Regent (2,000 vs 0).

Dornfelder is a complex hybrid developed in Germany in the 1950s, with both Pinot Noir and Blauer Portugieser in its ancestry (the latter of which has actually decreased in area by 40% since the 1960s). Pinot Noir is, of course, the world's most popular cool-climate red grape variety. Actually, other members of the Pinot family have also increased in area in Germany during the past half-century, including Pinot Blanc, Pinot Gris, Pinot Meunier, and Pinot Noir Précoce.

Note that the increasing prominence of red-grape varieties does not really coincide with the increase in total vineyard area, indicating that the red-grape vineyards were mostly not added, but instead white-grape vineyards were replaced. The most notable decrease occurred in Silvaner (now 5,000 ha versus 18,000 in the 1960s). Interestingly, several other white-grape varieties increased from the 1960s to 1990 and then decreased again, notably Müller-Thurgau (from 14,000 ha in the 1960s to 25,000 in the 1980s, and then back to 12,000 in the 2010s), but also Bacchus, Huxelrebe, Kerner, Morio-Muskat, Ortega, and Scheurebe. Note, also, that the recent decrease in vineyard area (graph 1) coincides with a plateau in the varietal composition (graph 2). This is presumably not actually a coincidence.

All of these changes have led to an increase in varietal diversity in the German vineyard area, as shown in the third graph above. To illustrate this, I calculated Simpson's Diversity Index for each year, which is a formal measure of such things. Low index values indicate dominance by very few varieties, while higher values indicate much more equal abundances among the varieties. In this case, there has been a continual increase in diversity for 50 years. Notably, the varietal predominance of the triumvirate of Müller-Thurgau + Riesling + Silvaner decreased from 75% of the German vineyard area to 40%. This is still an awful lot of area for only three varieties, of course (but they do make very nice wine).
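Simpson's index comes in several forms, and which one was used here is not stated. The sketch below assumes the common "one minus the sum of squared proportions" version, which behaves as described (low when a few varieties dominate, higher when areas are more even). The areas are invented.

```python
def simpson_diversity(areas):
    """Simpson's diversity, assumed form: 1 - sum(p_i^2) over area proportions."""
    total = sum(areas)
    return 1.0 - sum((x / total) ** 2 for x in areas)

# Hypothetical vineyard areas (ha) for two contrasting situations.
dominated = [90.0, 5.0, 5.0]          # one variety covers most of the area
even = [25.0, 25.0, 25.0, 25.0]       # four varieties with equal areas

print(f"Dominated: {simpson_diversity(dominated):.3f}")
print(f"Even:      {simpson_diversity(even):.3f}")
```

The dominated example gives a value close to 0, while the even example gives 0.75, which is the maximum possible for four varieties under this form of the index.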

Presumably, things will continue to change in Germany, given the government's stated intention to be proactive about response to climate change (No wine left behind: how Germany wants to adapt to climate change).

Next time, I will look at France, as the most prominent of the Old World viticultural areas.