Monday, July 30, 2018

What are Australia's most collected wines?

It is sometimes a topic of discussion as to which wines are most likely to be represented in cellar collections. I am not referring necessarily to expensive wines or to cult wines, but to wines that are actually in the cellars of real consumers and collectors (not necessarily investors). Which wines do people actually put aside for drinking (or selling) later?

One could look at the data of a collaborative site like CellarTracker, which lists the cellar collections (number of bottles) of many of their community members. However, this would be a slow business, as the data are not readily available in bulk.

One alternative is to look at the data from commercial storage facilities, given that many people use them once their wine collection reaches a serious size. Once again, this would be tricky in a country the size of the USA, where there are many such facilities, large and small, scattered across the country.

However, in Australia there is one main company, Wine Ark, which makes such an analysis possible. Indeed, Wine Ark actually does the work for us, by releasing every few years a list of Australia’s 50 Most Collected Wines. Wine Ark was established in 1999, and stores more than two million bottles of wine in 16 cellars across Australia (for clients from more than 30 countries). So, their lists should provide a good overview of what is happening with regard to collections of Australian wine.

I have compiled the data from the four lists published to date (2006, 2009, 2013, 2016), in which the wines are simply ranked in order of the number of bottles in storage. From this, I have constructed a network of the 53 wines that appear in most of the lists.

Australia's most collected wines

The wines at the top of the Ark lists are at the top of the network, progressing down to lower ranked wines at the bottom. The network gets messy towards the bottom because the lower ranked wines can vary a lot in position from list to list — indeed, the lists have changed quite a lot across the decade over which they were compiled.

Nevertheless, there is a group of 14 wines that are at the top of every list, and I guess that we should call these the "most collected" wines. These are all widely available, as their production is relatively large. Notably, 6 of them come from Penfolds, which specifically targets this segment of the wine market. The Penfolds and Wynns companies are both owned by Treasury Wine Estates, giving this conglomerate 8 of the top 14 wines. This may justify their claim to be Australia's premier wine company.

Interestingly, while all of the listed wines are relatively expensive, Australia's most expensive wines (eg. Grange and Hill of Grace) are not necessarily at the top of the list. Indeed, Penfolds Bin 389 Cabernet Shiraz actually topped two of the lists. This is often recognized as the most collected wine in Australia, as it has the same cellaring potential as Penfolds Grange but costs only one-tenth of the price.

The listed wine with the biggest production is Wynns Coonawarra Estate Cabernet Sauvignon (often called Black Label), which I have written about before (Why lionize winemakers but not viticulturists?). At the other extreme, Australia's cult wines are produced by more low-profile winemakers, such as Chris Ringland, Drew Noon, and Phillip Jones, whose productions are too small to appear in the Ark lists.

It is also worth noting that 10 of the 53 wines (19%) in the network are white, rather than red, including 2 of the top 14. It is debatable whether this is a surprisingly large or disappointingly small percentage of the collectable wines; but it includes 5 chardonnays, 3 rieslings, and 2 semillons (one of them a botrytized version).

Finally, we can compare the network to Langton's Classification of Australian Wine. Langton's Fine Wine Auctions holds more than 250 auctions every year, and their Classification is based on the sale prices of the auctioned wines. The current Classification is from 2014, with a new one due in September this year (the 30th anniversary).

The differences between the two lists are quite revealing. All of the top 14 Ark wines are in the Langton's Classification, although only 8 are listed as Exceptional, with 2 Outstanding and 4 Excellent. Conversely, there are 3 Classification wines that are not in the Ark cellaring list: Bass Phillip Reserve Pinot Noir, Chris Ringland Dry Grown Barossa Ranges Shiraz, and Jim Barry The Armagh Shiraz — these are pretty much cult wines (ie. low production).

Of the remaining 39 Ark wines, 8 are classified as Exceptional, 11 as Outstanding, and 15 are Excellent. That leaves 5 wines in the Ark list that are not in the Classification at all: Jacobs Creek St Hugo Cabernet Sauvignon, Tyrrell's Vat 9 Shiraz, Seppelt Chalambar Shiraz, Howard Park Cabernet Merlot, and Rockford Rifle Range Cabernet Sauvignon.

There are also 29 Outstanding Classification wines that do not appear in the Ark lists. Perhaps the most unexpected of these is the Penfolds Bin 144 Yattarna Chardonnay, which was specifically created by Penfolds as the white equivalent of their red Grange (ie. a cellaring wine — "the result of one of the most comprehensive and highly publicized wine development programs ever conducted in Australia"). Apparently, Penfolds have convinced the auction market (Langtons) but not yet the cellaring public (Wine Ark).

These large differences between the two lists presumably reflect the different attitudes of people who are cellaring wines and those who are selling them at auction. Do you want mature wine to enjoy in your lifetime, or are you treating it as an investment? This may result in you choosing different wines for your storage.

Monday, July 23, 2018

Calculating value for money wines

The prices of wines rarely seem to go down. Indeed, at the top end of the market, they seem to go up rather alarmingly. It has often been noted that the Baby Boomer generation has been willing to pay much more for good wines than the Generation X and Millennial generations are currently doing. This means that the latter groups are looking for wines that are seriously good value for money.

I have previously provided a summary of various quantitative methods for assessing value for money (ie. in addition to personal judgment): Quantifying value-for-money wines - part 1, part 2, part 3 and part 4. The fact that there are seven different methods discussed in that blog series tells you just how seriously people have taken this issue. In all cases, the crucial relationship is between the quality score and the price (QPR) — for any given wine quality we need to estimate the price that is considered to be good value for money.

I thought that it might be interesting to present a real example of one way in which this value can be obtained. Indeed, it is the one that I use myself.

How I do it

Needless to say, in practice I use the method that works best for me. It is described in the post Quantifying value-for-money wines, part 2, although the most detailed discussion is in The relationship of wine quality to price. For it, I need a data set consisting of as many wines as possible, for each of which I have both a quality score and the price — the more wines the better.

I have previously noted that, due to the economics of how the government-owned liquor chain Systembolaget operates, Sweden has mid- and high-quality wines at a cheap price, but has no discount wines (Why is wine often cheaper in Sweden than elsewhere?). This means that I need to optimize the QPR — I can't just go down to a store and see what wines are on special, because there are no such wines. I have also noted the way in which new wines become available (Wine monopolies, and the availability of wine), and that the most interesting wines appear in Systembolaget's "Small quantities" assortment (små partier), which is where I focus my attention.

The wine-quality scores that I use come from Jack Jakobsson, who regularly does the Systembolaget wine tastings for BKWine Magazine. He produces a monthly report of all of the new-release wines that he has been able to taste (some wines are in too small a quantity to be made available to the media). He uses a 20-point scale, including half-points. His scores during 2017 ranged from 11 to 18.5 (see the graph at the bottom of the post). Wines that would score higher may exist, but they are not available for media tastings.

I standardize the prices to those for 750ml bottles of table wine (red, rosé, white, sparkling) — that is, excluding fortified wines, which tend to exist on a different price scale, and also other-sized bottles (halves, magnums, etc).

This first graph shows the scores for the 1,691 "Small quantities" wines that Jakobsson tasted during 2017, with the single-bottle price shown vertically and the quality score shown horizontally (each point represents one wine). You can convert the Swedish crown (krona) into US$ by dividing by c. 9 (eg. 175 kronor ≈ $20).

Wine prices in Sweden during 2017

What I need to do now is derive the QPR relationship for these wines. That is, I need to calculate the "expected" or "average" price-quality relationship. I do this by fitting some mathematical model to the data (as explained in An introduction to data modeling, and why we do it). I have to do this only once, so it is no big deal.

As I have discussed before (eg. The relationship of wine quality to price), an exponential model is usually the best fit to economic data, and this is shown in the next graph. Note that the vertical axis is logarithmic, which means that the model can be represented by a straight line. The fit of the data to the model is quite good (60%), especially compared to some other price-related data sets that I have looked at.

Fitting the exponential model to the wine data

The line on the graph may look like it is a bit too low, but that is only because there is a mass of points in the lower part of the graph — half of the points really are above and half below the line.
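
As a minimal sketch of this kind of fit (not necessarily the exact procedure behind the graph), an exponential model of the form price = a × exp(b × score) can be fitted as a straight line to the logarithm of the price. The scores and prices below are invented purely for illustration, and the function name `fit_exponential` is hypothetical.

```python
import numpy as np

def fit_exponential(scores, prices):
    """Fit price = a * exp(b * score) by least squares on log(price).

    Returns (a, b, r_squared), where r_squared is measured on the log scale.
    """
    x = np.asarray(scores, dtype=float)
    y = np.log(np.asarray(prices, dtype=float))
    b, log_a = np.polyfit(x, y, 1)      # slope and intercept of the straight line
    fitted = log_a + b * x
    ss_res = np.sum((y - fitted) ** 2)  # variation left over after the fit
    ss_tot = np.sum((y - y.mean()) ** 2)
    return np.exp(log_a), b, 1.0 - ss_res / ss_tot

# Invented data: quality scores (20-point scale) and prices in kronor.
scores = [13, 14, 14.5, 15, 15.5, 16, 16.5, 17, 18]
prices = [89, 119, 129, 159, 189, 249, 299, 399, 695]

a, b, r2 = fit_exponential(scores, prices)
print(f"price = {a:.2f} * exp({b:.3f} * score), fit = {100 * r2:.0f}%")
```

Fitting on the log scale is what turns the exponential curve into the straight line seen on a graph with a logarithmic vertical axis.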

Since the fit of the model and data is quite good, we can now proceed to identify the value-for-money wines. [If the fit is poor, then the exercise would become pointless!] The next graph shows three dashed lines, representing three different QPR criteria.

Identifying the value for money wines

The wines below the pink line are the best 5% in terms of value for money, while those below the blue line are the best 10% — these are the wines we should think about purchasing if we want to get the most for our money. The wines above the black dashed line are the worst 5% in terms of value for money — these are the rip-off wines, because I can get the same wine quality for a lot less money.
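
One way that cut-off lines of this kind could be derived is sketched below. It assumes (this is an assumption, not something stated above) that each dashed line is parallel to the fitted line on the log-price scale, and is placed at a percentile of the residuals. The data are simulated and the function name `value_bands` is hypothetical.

```python
import numpy as np

def value_bands(scores, prices, best=(5, 10), worst=95):
    """Label each wine by where its log-price residual falls.

    Assumption: the cut-off lines are parallel to the fitted line on the
    log-price scale, placed at percentiles of the residuals.
    """
    x = np.asarray(scores, dtype=float)
    y = np.log(np.asarray(prices, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)   # negative = cheaper than expected
    cut5, cut10 = np.percentile(resid, best)
    cut95 = np.percentile(resid, worst)
    labels = np.full(len(x), "typical", dtype=object)
    labels[resid <= cut10] = "best 10%"
    labels[resid <= cut5] = "best 5%"     # overrides the broader label
    labels[resid >= cut95] = "worst 5%"
    return labels

# Simulated wines: random scores, with prices scattered around an exponential trend.
rng = np.random.default_rng(0)
scores = rng.choice(np.arange(11.0, 19.0, 0.5), size=200)
prices = np.exp(2.0 + 0.25 * scores + rng.normal(0.0, 0.3, size=200))

labels = value_bands(scores, prices)
names = ("best 5%", "best 10%", "typical", "worst 5%")
print({name: list(labels).count(name) for name in names})
```

In this sketch the "best 10%" label covers only the wines between the two lower lines, since the very best ones are relabelled as "best 5%".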

You will note that the best bargains are usually in the 15-16.5 points range (which is approx. 88-91 points on the 100-point scale). This is a very nice quality range if you happen to like good wines — there is no need for me to pay more than the equivalent of US$25 for a "90-point wine".

The practical result of this analysis is that I now have a separate price noted for each quality score, which I can use to assess the value for money of any new wines that are released — new wines that are selling for less than that price are good value for money. For example, I always take a close look at any new wines that are below the prices represented by the pink line on the graph.
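
The resulting list of prices can be sketched as a simple look-up table. The coefficients below are hypothetical placeholders, chosen only for illustration; they are not the values behind the graphs above, and the function name `price_ceiling` is likewise an assumption.

```python
import math

def price_ceiling(score, log_a, b, offset):
    """Highest price at which a wine with this score still falls below a
    cut-off line that is parallel to the fitted line.

    log_a, b : intercept and slope of the fitted line for log(price)
    offset   : vertical shift of the cut-off line on the log scale
               (negative for a "good value" line below the fit)
    """
    return math.exp(log_a + b * score + offset)

# Hypothetical coefficients, for illustration only.
LOG_A, B, OFFSET = 1.0, 0.27, -0.45

# One price per half-point score from 13.0 to 18.0.
table = {s / 2: round(price_ceiling(s / 2, LOG_A, B, OFFSET)) for s in range(26, 37)}
for score, ceiling in table.items():
    print(f"{score:4.1f} points: good value below about {ceiling} kr")
```

A newly released wine can then be checked by comparing its price with the table entry for its score.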


The procedure outlined here possibly looks cumbersome, but it is quite straightforward for anyone used to a bit of quantitative analysis. It works well, in practice; and it could be applied to a compiled set of scores for any set of wines.

Jakobsson's scores

Finally, having chosen to use the scores provided by Jack Jakobsson, it is of interest to look at whether there are any biases in his scores, such as I have discussed in previous blog posts (eg. Biases in wine quality scores). The next graph compares the frequency distribution of Jakobsson's scores (in blue) with that expected from unbiased scores (in maroon).

There are few biases in the quality data

Interestingly, his scores are remarkably unbiased, compared to the situation discussed in my previous posts for other collections of wine scores. There is a slight under-representation of 15.5 compared to a score of 15 or 16, along with a small over-use of 14 and 14.5 compared to lower scores, but that is about all. Perhaps this is a result of using a 20-point scale, where there is no temptation to over-use scores like 90 on the 100-point scale.

Jakobsson also helpfully provides an indication of the likely best period during which to enjoy each of the wines. This is a topic that I will return to in a future post. On the flip-side of the coin, the most obvious downside to his reviews is his apparent disdain for rosé wines — they rarely get good scores, even when they taste pretty good to me.

Monday, July 16, 2018

Differing opinions of amateurs at the same wine tasting

I have previously written about a direct comparison of two professional wine writers tasting the same wines at the same time (Laube versus Suckling — their scores differ, but what does that mean for us?). This begs the question of what happens at wine tastings attended by several people, especially if they are interested amateurs rather than wine professionals. This is a trickier question to address, because of the paucity of published data. However, I will look at one possible dataset here.

On 20 January 2014, at the Ripple Restaurant, in Washington DC, there was a vertical tasting of wines from Château Calon Ségur (located in Bordeaux), for 16 vintages from 1982-2010. This was attended by a number of people, three of whom have put their quality score for each wine online:
  Panos Kakaviatos (Calon Segur 1982-2010: first ever promotional tasting in the US)
  Aaron Nix-Gomez (The Calon Segur vertical 2010-1982)
  Kevin Shin (dcwino on CellarTracker)
We can try to compare these scores.

Furthermore, we can also try to compare these scores to those from a professional wine writer. On 6 November 2013, there was another tasting of many of the same wines, at the Carré des Feuillants restaurant, in Paris. From this tasting, Jane Anson has also put her quality score for each wine online (Chateau Calon Ségur: retrospective 1995-2011). These can be added to our comparison, given that the tastings were less than three months apart.

As I have done in previous blog posts (eg. How large is between-critic variation in quality scores?), we can quantify the relationships among the scores using correlation analysis between pairs of tasters. This measures the percentage of the data variation that is shared in common between those tasters — the larger the percentage then the better the agreement there is among the quality scores of the wines tasted. This is shown in the table for all six possible pairs.

Correlations among the wine tasters

There is quite a reasonable degree of agreement among the three wine-interested amateurs, especially Kevin Shin and Aaron Nix-Gomez. Indeed, these percentages are higher than the level I observed among the professional critics in the post cited above (often only 10-40%). Perhaps amateurs are less determinedly different from each other than are professionals?

Notably, the comparisons between these amateurs and the professional (Jane Anson) show much lower agreement. This may reflect the bottles opened at the two different tastings; but the values are certainly in accord with those found among other professionals.
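
The pairwise percentages of the kind shown in the table can be sketched as follows, assuming that the shared variation is measured as the squared Pearson correlation, calculated over the wines that both tasters scored. The tasters and scores below are invented; they are not the Calon Ségur data.

```python
from itertools import combinations
import numpy as np

def shared_variation(scores_by_taster):
    """Percentage of variation shared by each pair of tasters.

    Computed as the squared Pearson correlation, using only the wines
    that both tasters scored (a missing score is given as None).
    """
    result = {}
    for a, b in combinations(scores_by_taster, 2):
        pairs = [(x, y)
                 for x, y in zip(scores_by_taster[a], scores_by_taster[b])
                 if x is not None and y is not None]
        x, y = np.array(pairs, dtype=float).T
        r = np.corrcoef(x, y)[0, 1]
        result[(a, b)] = 100 * r ** 2
    return result

# Invented scores for four hypothetical tasters on the same six wines.
# Taster D uses a 20-point scale, which does not affect the correlation.
scores = {
    "Taster A": [92, 88, 90, 94, 89, 91],
    "Taster B": [91, 87, 91, 95, 88, 90],
    "Taster C": [90, 89, 88, 93, 90, None],
    "Taster D": [17, 16, 17.5, 16.5, None, 18],
}

result = shared_variation(scores)
for (a, b), pct in result.items():
    print(f"{a} vs {b}: {pct:.0f}%")
```

Four tasters give six pairs, which is why the table has six entries.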

It might also be useful to look at a picture of these data, rather than a table of numbers. To do this, we can employ a network, as I used in the post on professionals cited above. This is shown in the graph.

Network of the shared wine-quality scores

The important point for interpreting the graph is that the length of the lines represents how similar are the wine-quality scores. The lines in the center represent the shared similarity among the scores, while the lines leading to the names represent the variation in the scores that is unique to that person.

What this network says is that very little of the variation in quality scores is agreed upon by the four people, and that they each have their own personal opinion, which differs notably. In this case, there is little consensus on the quality of the different vintages.

So, amateurs may be somewhat different from professionals, but they still go their own way. Wine quality is apparently not a shared experience.

Monday, July 9, 2018

The ups and downs of wine-blog posting

A couple of weeks ago I wrote a post about How long can wine bloggers keep it up?. At the time, I mentioned that I recorded the number of posts per month for all of the Australian wine-related blogs that I could locate. This allows me to look at changes in the rate of blog posting throughout the life of each blog. In this new post, I will show you some of the more obvious patterns. I will use individual blogs as my examples, but I will group them into sets with similar patterns of posts — what types of wine blogs are there?

The individual qualities of wine blogs have long interested people. For example, back in 2013, Lettie Teague searched for Five wine blogs I really click with. She searched "not just a handful of blogs here and there but hundreds and hundreds of wine blogs from all over the world." However, the fate of blogs is almost always the same — only one of her chosen blogs has posted since the middle of 2017. Unusually, one of the bloggers did actually put up a "good-bye" post (Brooklynguy's Wine and Food Blog).

In my previous post on the subject, I illustrated the coming and going of the Australian wine blogs from the beginning of 2006 until May 2018 (150 months). In all of the graphs shown here, "Time 0" is the time of the first blog post for each blog, so that the graphs illustrate what happened to the blogs through their lifetime. I have excluded the three most prolific blogs, which all started long before 2006 (these would fit into the last two graphs below).

The first graph simply shows the number of blogs (in pink), illustrating that the number of blogs decreases through time (ie. many blogs last a short time and only a few make it for a long lifetime). For the cognoscenti, this is called a Type I survivorship curve (note the logarithmic vertical axis).

Number of Australian wine blogs and their posts

The blue line shows the average number of monthly posts for those blogs still surviving at any one time. The average remains steady at 4-5 posts per month for c. 5 years, by which time the number of blogs has halved. Thereafter, the average becomes much more variable, depending on which blogs are still going. The longest-lived blogs keep up a high average monthly number of posts (eg. >10 years = >10 monthly posts) — if the blogger is still going after 6 years, then they really have something to say!
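
The two lines on such a graph can be sketched as below. This assumes that a blog counts as surviving in a given month if that month falls between its first and last posts; the blogs and their post counts are invented for illustration, and the function name `lifetime_summary` is hypothetical.

```python
def lifetime_summary(posts_by_blog):
    """For each month since a blog's first post ("Time 0"), return
    (number of blogs still surviving, average posts among them).

    posts_by_blog maps a blog name to its list of monthly post counts,
    starting at its first post and ending at its last post.
    A blog counts as surviving in month t if its list extends to month t.
    """
    longest = max(len(posts) for posts in posts_by_blog.values())
    summary = []
    for t in range(longest):
        alive = [posts[t] for posts in posts_by_blog.values() if len(posts) > t]
        summary.append((len(alive), sum(alive) / len(alive)))
    return summary

# Invented monthly post counts for three hypothetical blogs.
blogs = {
    "Blog A": [6, 4, 0, 2],
    "Blog B": [3, 5],
    "Blog C": [12, 9, 10, 11, 8, 10],
}

for month, (n_blogs, mean_posts) in enumerate(lifetime_summary(blogs)):
    print(month, n_blogs, round(mean_posts, 1))
```

Because the average is taken only over the surviving blogs, it becomes more erratic as the number of survivors shrinks.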

We can now look at the individual blogs, looking not at how long they last but at what happened along the way. The blogs are arranged in groups, although there is nothing definitive about the following groupings. They are merely examples of patterns that appear in the data. Not all of the blogs are actually shown here.

The next graph shows a few blogs that burst out of the blocks with a flurry of activity but then slowed down over the first year, followed by a slower stream of activity.

Australian wine blog postings

The next group of blogs did the opposite, starting relatively slowly but followed by a burst of activity later on. This burst could take up to 2 years to kick in. In all cases the burst was not sustained by the blogger.

Australian wine blog postings

For the next group, each blog shows a series of episodes of bigger activity, rather than a single burst. These bursts usually represent different topics of interest to the writer; for example, reporting on travels to wine regions. It is easy to see these blogs as extensions of those in the previous graph — some bloggers get a second or third wind, but some do not.

Australian wine blog postings

We now move on to a group of blogs that all have regularly had a relatively high number of posts (eg. >3 per week). Some of these bloggers decreased their activity after an initial burst, but they still maintained their prolific rate of posting. For example, at one point Full Pour simply halved the number of posts from one month to the next, but then continued at the new rate.

Australian wine blog postings

The Intrepid Wino was the most erratically posting blogger I encountered — on some occasions wine-tasting notes were uploaded in bulk, with a maximum of 171 posts in one month (off the top of the graph) — the nearest competitor was Wine Will Eat Itself, with a maximum of 98 (see below).

We now move on to those blogs that have consisted mostly of wine-tasting notes. Obviously, these notes are relatively short, and so there can be a lot of posts in any given month — here, we are talking of up to 1 per day, or even more. However, you will note that the bloggers illustrated in this next graph all decreased their activity after an initial burst.

Australian wine blog postings

The final graph shows those blogs consisting mostly of wine-tasting notes but where the number of posts increased dramatically at a particular time. You can all guess what that time was — the blogger started receiving large numbers of wine samples, for free, rather than basing their comments on their own drinking habits or on group tastings. The most blatant example is Wine Will Eat Itself — sadly, here the prolific activity was stopped by the death of the blogger.

Australian wine blog postings

This sort of activity by wine writers has long been questioned. For example, David Shaw wrote a pair of articles for the Los Angeles Times way back in August 1987 (Wine writers: squeezing the grape for news, and Wine critics: influence of writers can be heady), revealing what was then presumably unknown to much of the reading public — many if not most newspaper and magazine wine writers were paid very little money, and relied on wine producers and marketers in a way that could easily be seen as a conflict of interest.

The main issue, of course, is that the writers usually prefer to write favorable reviews, and therefore simply ignore all wines that they view unfavorably. This means that some of the Australian wine blogs simply catalog (mostly) Australia's wines, one bottle at a time, but actually ignoring most of them. This may not be of much help to the reader, who is not being warned about what to avoid.

This also produces an uncritical view of the world. We all know what a 5-star review says before we read it (as we also do for a 1-star review), so why read it? These reviews provide an unrelenting tone, which ultimately becomes tedious. The real interest lies in the 2- and 3-star reviews, because something went wrong, and we need to assess whether it would also be a deal-breaker for us. Wine bloggers, please take note.

Monday, July 2, 2018

An introduction to data modeling, and why we do it

Among all of the current hype about quantitative data analysis (eg. The arms race for quants comes to the world’s biggest asset managers), especially with regard to what are called Big Data, I have noted a few negative comments about the idea of modeling data. Not unexpectedly, non-experts are often wary of things beyond their own expertise. (I know I am!) So, I thought that I might write a post outlining just what people are trying to do when they do the sorts of things that I normally do in this blog.


If the world is a non-random place, then it is likely that we can find at least a few patterns in it that we can describe and explain in a simple way. We refer to this process of description and explanation as modeling. In this process, we try to find simple models that can be used for both describing and explaining the world around us; and, if we get it right, then these models can be used for forecasting (prediction), as well.

This is not data modeling

The main issue is that life cannot be entirely predictable — there are predictable components and unpredictable components. For example, we all know that we have a limited life-span, although we do not know where or when we will depart. Nevertheless, there is a measured average life-span (which is the predictable component), along with variation around that average (the unpredictable component). We are thus all thinking that we might live for 80-85 years, and we can plan a future for ourselves on that basis.

Think of it this way: the predictable component gives us optimism, because we can make plans, while the unpredictable component makes our plans go astray.* Models are our formal, mathematical way of trying to identify the two components. The idea is to find out whether the predictable part dominates, or not. If it does, then forecasting is a viable activity for us all.

Another way of thinking about this is the classic question as to whether the glass of water is half full or half empty. The full part is the predictable component, and the empty part is the unpredictable component. Of course, the glass is both half full and half empty; and we should actually be interested in both components — why is it half full, and why is it half empty? Each will tell us something that might be of interest.


So, models try to formalize this basic idea mathematically. If we have some quantitative data, then we can try to find an equation that tells us about the predictable component of the data, and about how much the real data deviate from the model in some (possibly unpredictable) way. For example, we anticipate that each person's life-span is not random, and we can thus model it by assuming that it deviates in some unpredictable way from the (predictable) average lifespan. Similarly, tomorrow's weather is not random, but instead it deviates from today's weather in more or less unpredictable ways.

To get a picture of what is happening, we often draw a graph. For example, our data might be shown as a series of points, and we can then fit a line to these data. This line represents the model, and the closeness of the points to the line tells us how well our model fits the data. The line is the predictable component, and the deviation of the points from the line represents the unpredictable component.

A couple of examples

Here are two wine-related examples, of a type of modeling that I have used in previous blog posts. In both cases, I will be fitting an Exponential Model to a set of data, as this seems to be the simplest model that fits these data sets well (see the discussion at the end of the post).

The first set of data comes from the EuroStat database. It lists the average size of a vineyard holding for the 18 European Union countries with the most vineyard area. Each point in the first graph represents a single country, with the countries ranked in decreasing order horizontally, and the average vineyard size shown vertically.

Average vineyard holdings in the European Union

The line represents our model (the Exponential). Note that the vertical axis is on a logarithmic scale, which means that our model will form a straight line on the graph. Also, the model fits 97% of the data, which means that our model fits very well (see the discussion later in the post).

Using our glass metaphor, the graph shows us that the glass is almost full for all of the countries — the predictable component of the data is by far the largest (ie. the points are close to the line). However, for France our glass is not at all full, and there is a large unpredictable component (the point is not particularly near the line). Both of these conclusions should interest us, when studying the data. We should be happy that such a simple model allows us to describe, explain and forecast data about vineyard sizes across Europe; and we should wonder about the explanation for the obviously different situation in France.

The second example is very similar to the first one. This time, the data set comes from the AAWE. It lists the average 2015 dollar value of wine exports for 23 countries. As above, each point in the graph represents a single country, with the countries ranked in decreasing order horizontally, and the export value shown vertically.

Wine export values per country

Everything said for the first example applies here, as well, except that this time the country with the greatest deviation from the model is the lowest-ranked one. We might ask ourselves: Is it important that Romanian wine exports do not fit? We do not know; but the model makes it clear that we might find something interesting if we look into it. This is the point of modeling — it tells us which bits of the data fit and which bits don't; and either of these things could turn out to be interesting.


There is an old adage that models should be relatively simple, because otherwise we lose generality. Indeed, this idea goes back to Aristotle, although William of Ockham is usually given the most credit (ie. Occam's razor). So, simpler is better. The alternative is called "over-fitting" the data, which is bad.

We could try to model things exactly — for example, we could think in detail about the things that cause weather to vary from today's or people's lives to deviate from the average. However, it should be obvious that this would be unproductive, because there are simply too many possibilities. So, we try to use models that are as simple and general as possible.

The main practical issue is that there are lots of mathematical models, which differ from each other in oodles of ways, and many of them might fit our data equally well. This sometimes leads to unnecessary arguments among the experts.

However, we do have various ways of helping us measure how well our data fit any given model. In my case, I usually provide the percentage fit, as shown in the above two examples. This is sometimes described as the amount of the data that is "explained" by the model, although it is better to think of it as the amount of the data that is "described" by the model. Either way, it represents the percentage of the data that the model claims is predictable, with the remainder being unpredictable by the model.
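
A percentage fit of this kind can be sketched for a straight-line model fitted by least squares, where the total variation splits exactly into a part described by the line and a remainder. The data below are invented (they are not the EuroStat or AAWE figures), and the function name `percentage_fit` is hypothetical.

```python
import numpy as np

def percentage_fit(x, y):
    """Split the variation in y into the part described by a fitted
    straight line and the remainder, both as percentages."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    fitted = intercept + slope * x
    total = np.sum((y - y.mean()) ** 2)           # all of the variation
    described = np.sum((fitted - y.mean()) ** 2)  # the predictable part
    remainder = np.sum((y - fitted) ** 2)         # the unpredictable part
    return 100 * described / total, 100 * remainder / total

# Invented data: rank order (x) and the logarithm of some measured value (y).
ranks = [1, 2, 3, 4, 5, 6, 7, 8]
log_values = [4.1, 3.6, 3.4, 2.9, 2.2, 2.1, 1.5, 1.2]

described, remainder = percentage_fit(ranks, log_values)
print(f"described: {described:.0f}%, not described: {remainder:.0f}%")
```

The two percentages add up to 100, which is the sense in which the fit separates the predictable component from the unpredictable one.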

We would, of course, do well to pick a model that provides as high a fit as possible. However, the best-fitting model might actually be over-fitting the data. To guard against this, we should also have some reasonable explanation for why we think that our chosen model is suitable for the data at hand.

Simply trying an arbitrary range of models, and then choosing the best-fitting one, is almost guaranteed to over-fit the data. At the other extreme, simply fitting a straight line to your graph can also be a very poor procedure — and I will discuss this in a later post.
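
The over-fitting problem can be illustrated with a small simulation. Here the data really do come from a straight line plus random noise, yet more complicated polynomial models always report a fit at least as high, because they start describing the noise. The data are simulated, and polynomials are used only as a convenient family of increasingly flexible models.

```python
import numpy as np

def r_squared(y, fitted):
    """Proportion of the variation in y that is described by the fitted values."""
    return 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

# Simulated data: a straight line plus random noise (seed fixed for repeatability).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 12)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5, size=x.size)

fits = {}
for degree in (1, 3, 6):
    coeffs = np.polyfit(x, y, degree)
    fits[degree] = r_squared(y, np.polyval(coeffs, x))
    print(f"polynomial degree {degree}: fit = {100 * fits[degree]:.0f}%")
```

The higher percentages say nothing about which model would forecast new data best, which is why a reasonable explanation for the chosen model matters as much as the fit itself.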

* Robert Balzer: "Life is what happens to you ... when you are planning other things."
[Quote provided by Bob Henry.]