## Monday, March 13, 2017

### How many wine prices are there?

My short answer is "three", as I will explain in this blog post.

It would be nice for the consumer if wine prices varied in some predictable way. From the viewpoint of a data analyst, this implies that there is some specifiable model of price variation that can be used to describe the variation in observed prices.

From the viewpoint of the consumer, having such a model is valuable, because it can be used to make a rational decision about whether a particular wine is a bargain or a rip-off (as explained in Choosing value-for-money wines). That is, models suggest that prices have a predictable component (an "average value") and a random component, and too much deviation from the predictable component indicates either a good deal or a bad one.

I have suggested in previous blog posts that one model that seems to fit wine prices reasonably well is what is known as the Lognormal Model (e.g. see The relationship of wine quality to price). This model indicates that prices are expected to increase exponentially in response to winemaker effort. That is, rather than each unit of effort adding one unit to the price, each unit of effort multiplies the price by some factor.

One consequence of this model is that prices do vary around some average value but the variation is not symmetrical — prices vary much more above the average than below it. I am sure that you have all noticed this in real life; and not just for wines, but for most consumer products. There are plenty of very expensive wines but not so many very cheap ones.
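The multiplicative idea can be illustrated with a small simulation. This is only a sketch under assumed numbers: the base price, the number of effort steps, and the range of the multiplying factors are all hypothetical, and none of them come from real wine data. It simply shows that repeatedly multiplying a price by random factors tends to produce more spread above the middle value than below it.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def simulated_price(base=50.0, steps=8):
    """Multiply a hypothetical base price by one random factor per unit of effort."""
    price = base
    for _ in range(steps):
        price *= random.uniform(0.8, 1.5)  # assumed range of multipliers
    return price

prices = sorted(simulated_price() for _ in range(10_000))
mean_price = statistics.mean(prices)
median_price = statistics.median(prices)

# Compare the spread above and below the median, using the 5th and 95th percentiles.
low, high = prices[500], prices[9_500]
spread_below = median_price - low
spread_above = high - median_price

print(f"mean {mean_price:.0f}, median {median_price:.0f}")
print(f"spread below median {spread_below:.0f}, spread above median {spread_above:.0f}")
```

If the simulated prices behave like a lognormal distribution, the mean should sit above the median and the upper spread should exceed the lower one.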

The most expensive wines are often featured in the media as being "luxury" products, beyond the purchasing power of mere mortals. On the face of it, it seems unlikely that the pricing of these wines is in any way related to the pricing of the wines bought by the rest of us. Recently, Thach, Olsen, Cogan-Marie & Charters suggested dividing these wines into the following price categories per bottle (see What Price is Luxury Wine?): Affordable Luxury (US\$ 50-100), Luxury wines (US\$ 100-500), Icon wines (US\$ 500-1,000) and Dream wines (US\$ >1,000). Below these, we might also recognize Everyday wines (US\$ <10), Better wines (US\$ 10-20) and Premium wines (US\$ 20-50).
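As a rough sketch, the seven categories can be written as a lookup by bottle price. The function name `price_category` is hypothetical, and the handling of the boundaries is an assumption: the published ranges share their end points (e.g. US\$ 100 appears in two of them), so here each category is taken to start at its lower bound.

```python
# (lower bound in US$, category name), in ascending order.
# Assumption: a price exactly on a boundary belongs to the higher category.
CATEGORY_FLOORS = [
    (0, "Everyday"),
    (10, "Better"),
    (20, "Premium"),
    (50, "Affordable Luxury"),
    (100, "Luxury"),
    (500, "Icon"),
    (1000, "Dream"),
]

def price_category(usd):
    """Return the name of the highest category whose lower bound is <= usd."""
    name = CATEGORY_FLOORS[0][1]
    for floor, label in CATEGORY_FLOORS:
        if usd >= floor:
            name = label
    return name

for example in (4.5, 11, 20, 75, 2500):
    print(example, price_category(example))
```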

Maybe there are different pricing models for each of these wine groups? I thought that it might be worthwhile to try modeling some real data, to see how all of these ideas fit together.

#### Systembolaget

The data come from the online database of the national liquor chain in Sweden, known as Systembolaget. Being government owned, the complete product information is freely available, as both an XLS file and an XML file. I have used the prices of the bottled red wines that were available in May 2016, when there were 5,487 such wines listed in the database. [Note that bag-in-box wines are not included.]

This first graph shows the frequency distribution of how many wines (vertically) fit into each of the bottle prices (horizontally). [Note the logarithmic scale for the prices.] Obviously, the prices are given in Swedish crowns (SEK). To convert to other common currencies you can divide the SEK by approximately 10 — if you want more precision, for USD divide by 9, for EUR divide by 9.5, and for GBP divide by 11.

The graph is rather spiky, because of the worldwide practice of setting prices at particular "desirable" values (99, 149, 199, etc), but there is otherwise a clear general pattern to the data. The minimum price is 40 SEK (US\$ 4.50) and the maximum is 22,500 SEK (US\$ 2,500). The most common price is 100 SEK (US\$ 11), with the median at 190 SEK (US\$ 20) — that is, half the wines cost US\$ 20 or less.
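The currency conversion is simple division, and can be sketched as follows. The divisors are the rough ones quoted above (not live exchange rates), and the helper name `from_sek` is hypothetical. The US\$ figures quoted in the text are rounded, so the computed values only approximately match them.

```python
# Rough number of SEK per unit of each currency, as quoted in the text.
SEK_PER_UNIT = {"USD": 9.0, "EUR": 9.5, "GBP": 11.0}

def from_sek(amount_sek, currency="USD"):
    """Convert a price in Swedish crowns using the rough divisors above."""
    return amount_sek / SEK_PER_UNIT[currency]

# Summary prices (in SEK) reported for the bottled red wines.
summary_sek = {"minimum": 40, "most common": 100, "median": 190, "maximum": 22_500}
summary_usd = {label: round(from_sek(sek), 2) for label, sek in summary_sek.items()}

for label, usd in summary_usd.items():
    print(f"{label}: {summary_sek[label]} SEK is about US$ {usd}")
```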

Now, we can compare these data with the pricing categories outlined above. This next graph superimposes them onto the frequency distribution.

We can then see how many wines fit into each category:

| Everyday wines | Better wines | Premium wines | Affordable Luxury | Luxury wines | Icon wines | Dream wines |
|---|---|---|---|---|---|---|
| 558 | 2,070 | 2,045 | 538 | 246 | 19 | 11 |

Not unexpectedly, there is a dearth of the cheapest wines, which mainly sell in cardboard boxes, not bottles. However, the other wine price categories are well represented in Sweden. Moreover, the wines themselves are the ones typically available in other Western countries.
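The category counts are easier to compare as percentages of the 5,487 wines. A minimal sketch, using only the counts listed above:

```python
# Number of bottled red wines in each price category (May 2016 listing).
counts = {
    "Everyday": 558,
    "Better": 2070,
    "Premium": 2045,
    "Affordable Luxury": 538,
    "Luxury": 246,
    "Icon": 19,
    "Dream": 11,
}

total = sum(counts.values())
shares = {name: 100.0 * n / total for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name:<18} {counts[name]:>5}  {share:5.1f}%")
print(f"{'Total':<18} {total:>5}")
```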

#### Modeling the price data

The frequency distribution does, indeed, look very much like what would be expected for a lognormal model. So, we can start our analysis by trying to fit the data to such a model.

This next graph shows the probability distribution of the best-fitting lognormal model in pink. [Technical note: the model was fitted using maximum likelihood, with the Regress+ program; I subtracted 35 SEK from each price before fitting the model, and then added it back for the probability distribution.]

In this sort of analysis, we interpret the pink line as representing the predictable component of the variation in wine prices, and the differences between the blue line and the pink line as representing the random component.

This lognormal model fits the price data reasonably well, and the predicted parameters are close to the observed ones (e.g. the predicted median and mode are within 7% of the observed values). However, the fit is not good enough, because the pink line (the model) deviates too far from the blue line (the data) in too many places.
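For readers who want to try this sort of fit themselves, here is a minimal sketch of a shifted lognormal fitted by maximum likelihood. It is not the Regress+ procedure used for the graph. It relies on the fact that, once the shift is fixed (35 SEK, as in the technical note), the maximum-likelihood estimates are just the mean and standard deviation of the logged prices. The prices below are synthetic, generated from hypothetical parameter values, because the Systembolaget data are not reproduced here.

```python
import math
import random
import statistics

SHIFT = 35.0  # SEK subtracted from each price before fitting, as in the technical note

def fit_shifted_lognormal(prices, shift=SHIFT):
    """Maximum-likelihood mu and sigma of log(price - shift), with the shift held fixed."""
    logs = [math.log(p - shift) for p in prices]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs, mu)  # population form (divide by n) is the ML estimate
    return mu, sigma

def model_median(mu, shift=SHIFT):
    return shift + math.exp(mu)

def model_mode(mu, sigma, shift=SHIFT):
    return shift + math.exp(mu - sigma ** 2)

# Synthetic prices in SEK; the values 5.0 and 0.9 are hypothetical, not fitted to real data.
random.seed(0)
synthetic = [SHIFT + random.lognormvariate(5.0, 0.9) for _ in range(5000)]

mu_hat, sigma_hat = fit_shifted_lognormal(synthetic)
print(f"mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}")
print(f"model median = {model_median(mu_hat):.0f} SEK")
print(f"model mode   = {model_mode(mu_hat, sigma_hat):.0f} SEK")
```

The model median and mode can then be compared with the observed median and most common price, which is the kind of check described above.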

I suggest that the basic problem here is that we are trying to fit a single price model to the data, when the data actually follow several different models.

First, the highest prices do not fit the model, notably those for the Icon wines and Dream wines. It should not surprise us that these wines have their own price structure, which comes from cloud-cuckoo land, and is tolerated only by businessmen with more money than common sense. So, we should not be trying to cram these wines into our model. Indeed, there is a big price gap at US\$ 500, and so it is easy to decide where to divide the model in two — we simply drop the Icon and Dream wines from our analysis.

Second, it looks to me like the remaining frequency distribution (US\$ <500) still cannot be modeled by a single lognormal model. That is, the shape of the pink line cannot be made to move closer to the blue line. Even these data are more complicated than our simple lognormal model.

So, let's try fitting two lognormal models to the data, instead. This is straightforward to do; and it implies that the wine prices actually live in two different worlds. This next graph shows the probability distributions of the two best-fitting lognormal models in pink, along with their combination shown in black. [Technical note: the Akaike Information Criterion for two models versus one improves by 231.7 units.]

The black line clearly fits the blue line much better than did the pink line in the previous graph. So, there is now a very good fit between the data and the overall model. We could, of course, keep fitting even more complex models, but there seems to be no good practical reason to do so. [Technical note: by definition both lognormal curves must start at the same left-hand point on the horizontal axis. This is an unfortunate inadequacy of using the lognormal, because the smaller curve would make more sense if it started further to the right. That is, a model for the more expensive wines should not start at US\$4!]
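The comparison of one lognormal model against two can be sketched in a few lines. This is an illustration only, not the procedure used for the graphs: it uses the EM algorithm (my substitution, a common way to fit mixtures) on synthetic log-prices drawn from two hypothetical groups, and then compares the Akaike Information Criterion of the two fits. Working on the log scale changes both log-likelihoods by the same constant, so the AIC difference is unaffected.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_one(logs):
    """Single lognormal: closed-form ML fit on the log scale; returns the log-likelihood."""
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / n)
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in logs)

def fit_two(logs, iterations=100):
    """Two-lognormal mixture fitted by EM; returns (log-likelihood, weight of cheaper group)."""
    n = len(logs)
    ordered = sorted(logs)
    mu1 = sum(ordered[: n // 2]) / (n // 2)       # start from the mean of the lower half
    mu2 = sum(ordered[n // 2:]) / (n - n // 2)    # and the mean of the upper half
    s1 = s2 = (ordered[-1] - ordered[0]) / 4.0    # crude common starting spread
    w = 0.5
    for _ in range(iterations):
        # E-step: probability that each wine belongs to the cheaper group
        resp = []
        for x in logs:
            a = w * normal_pdf(x, mu1, s1)
            b = (1.0 - w) * normal_pdf(x, mu2, s2)
            resp.append(a / (a + b))
        # M-step: re-estimate weight, means and spreads from those probabilities
        r1 = sum(resp)
        r2 = n - r1
        w = r1 / n
        mu1 = sum(r * x for r, x in zip(resp, logs)) / r1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, logs)) / r2
        s1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, logs)) / r1)
        s2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, logs)) / r2)
    loglik = sum(
        math.log(w * normal_pdf(x, mu1, s1) + (1.0 - w) * normal_pdf(x, mu2, s2))
        for x in logs
    )
    return loglik, w

def aic(loglik, n_params):
    return 2 * n_params - 2.0 * loglik

# Synthetic log-prices: 65% from a cheaper group, 35% from a dearer one (hypothetical values).
random.seed(0)
logs = [
    random.gauss(4.3, 0.4) if random.random() < 0.65 else random.gauss(5.8, 0.8)
    for _ in range(2000)
]

aic_one = aic(fit_one(logs), n_params=2)   # one mean, one spread
loglik_two, weight_cheaper = fit_two(logs)
aic_two = aic(loglik_two, n_params=5)      # two means, two spreads, one weight
print(f"AIC improvement from the second model: {aic_one - aic_two:.1f}")
print(f"estimated share of the cheaper group: {weight_cheaper:.2f}")
```

A positive improvement means that the extra three parameters of the second model are paying their way, which is the sense in which the improvement of 231.7 units is reported above.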

This model means that the two pink lines represent two different pricing structures for the red wines. One of the structures covers the cheaper wines (principally US\$ <20) and the other one covers somewhat more expensive wines (US\$ 20-500). Roughly speaking, there is one pricing structure for the Everyday and Better wines, and another one for the Premium, Affordable Luxury, and Luxury wines.

The majority of wine consumers are likely to be dealing with the first model, which actually covers 65% of the red wines. However, the cognoscenti will principally be interested in the second model, which covers almost all of the remaining 35% — these are the wines that the wine media tend to write about most often. The Icon and Dream categories, with their own (third) model, include only 0.5% of the wines.

This has important practical consequences for buying wine. For example, there is little point in a consumer trying to compare the quality:price ratios (QPR) of US\$10 wines and US\$50 wines, because they probably won't be the same — the pricing structures are not actually connected. You need to be wearing a different hat when you shop for Premium wines rather than Everyday wines!

#### Conclusion

For these data (bottled red wines available in Sweden), there appear to be at least three price models. One model covers the most expensive wines (US\$ >500), one principally covers the cheapest wines (US\$ <20), and the third model covers most of the wines in between. Price variation within each of these models is unrelated to price variation within the other models.

In practice, this means that price comparisons only make sense within each of these three groups — for example, the quality:price ratios (QPR) will be different between the groups. There seems to be no good reason why these conclusions would not apply to the wines sold in other countries with similar wines available, as well, although the details may be different.

2. [Preceding comment was deleted due to a typo.]

Quoting Oscar Wilde:

“What is a cynic? A man who knows the price of everything and the value of nothing.”

ON THE SURPRISINGLY LOW COMPARATIVE COST OF PRODUCING “EVEN THE [WORLD’S] BEST WINES . . .”

Excerpt from The Atlantic Magazine
(December 2000, Page Unknown):

“The Million-Dollar Nose”

By William Langewiesche

. . . For those in the business, maintaining that [elite drink] image is important not only for commercial reasons but also for reasons of personal prestige. Every stage of the trade is involved in establishing the high prices, but ultimately those prices can be sustained only through the retailers and their sales efforts. The problem for the retailers is that wine -- unlike luxurious hotel rooms and other hyperinflated products generally covered as business expenses -- is usually paid for directly out of the consumer's pocket. This makes for a scary business, especially toward the high end, where The Wine Advocate roams.

The truth is that even the best wines cost only about \$10 a bottle to produce, and they are not inherently rare. If the initial cost is tripled to allow for profits along the path of distribution, one can reasonably conclude that retail prices above \$30 are based on speculation, image, and hype. . . .

ON THE SUBJECT OF LUXURY GOODS PRICING . . .

Veblen goods – http://en.wikipedia.org/wiki/Veblen_good

[Excerpt: “Some types of luxury goods, such as high-end wines, designer handbags, and luxury cars, are Veblen goods, in that decreasing their prices decreases people's preference for buying them because they are no longer perceived as exclusive or high-status products.”]

Giffen goods – http://en.wikipedia.org/wiki/Giffen_good

[Excerpt: “Some types of premium goods (such as expensive French wines, or celebrity-endorsed perfumes) are sometimes claimed to be Giffen goods. It is claimed that lowering the price of these high status goods can decrease demand because they are no longer perceived as exclusive or high status products.”]

3. Excerpt from The Sacramento Bee [California] “Business” Section
(February 14, 2008):

“Full Bouquet on Wine Costs;
From grapes to glass, prices vary by region and quantity”

By Jim Downing
Staff Reporter

“Breaking Down a Bottle”

The value of wine grapes depends on where they’re grown. While grapes are the primary ingredient in wine, they make up only a splash of a bottle’s retail price. Here’s a breakdown of the estimated costs in a typical \$20 bottle of wine:

| Item | Amount | Notes |
|---|---|---|
| Grapes | \$1.95 | Petite Sirah (Mendocino) |
| Winemaking ops | \$3.25 | Medium-volume |
| Oaking | \$0.75 | American oak barrel |
| Bottle glass | \$0.90 | Midrange glass |
| Label | \$0.25 | Midsize order |
| Closure (cork) | \$0.30 | Midquality cork |
| Capsule | \$0.10 | Aluminum |
| Bottling | \$0.45 | |
| Subtotal | \$7.95 | |
| Winery mark-up (+75%) | +\$5.96 | |
| Subtotal | \$13.91 | |
| Wholesaler mark-up (+20%) | +\$2.78 | |
| Subtotal | \$16.70 | |
| Retailer mark-up (+20%) | +\$3.30 | Supermarket |
| Total | \$19.99 | |

Sources: Sacramento Bee [newspaper]; Robert Yeltman, UC Davis; National Agricultural Statistics Service

4. Same Sacramento Bee newspaper article. Same sidebar.

\$80 bottle

| Item | Amount | Notes |
|---|---|---|
| Grapes | \$5.75 | Cab Sauvignon (Napa) |
| Winemaking ops | \$6.25 | Small lots |
| Oaking | \$2.00 | French oak barrel |
| Bottle glass | \$2.00 | Heavy European glass |
| Label | \$0.65 | Small order, fancy label |
| Closure (cork) | \$1.00 | Highest-quality cork |
| Capsule | \$0.18 | Tin |
| Bottling | \$0.50 | |
| Subtotal | \$18.33 | |
| Winery mark-up (+150%) | +\$27.50 | Small, renowned winery |
| Subtotal | \$45.83 | |
| Wholesaler mark-up (+35%) | +\$16.04 | Low volume = high mark-up |
| Subtotal | \$61.86 | |
| Retailer mark-up (+30%) | +\$18.13 | Wine shop |
| Total | \$79.99 | |

Sources: Sacramento Bee [newspaper]; Robert Yeltman, UC Davis; National Agricultural Statistics Service
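The mark-up chain in the two sidebars is a series of multiplications, which can be sketched as below. The function name `retail_price` is hypothetical. Compounding the stated percentages gives roughly \$20.03 and \$80.42, slightly above the printed totals, so the dollar mark-ups in the sidebar were presumably rounded or adjusted to land on the \$19.99 and \$79.99 price points.

```python
def retail_price(production_cost, markups):
    """Apply each fractional mark-up in turn (winery, wholesaler, retailer)."""
    price = production_cost
    for rate in markups:
        price *= 1.0 + rate
    return price

# Production subtotals and mark-up percentages taken from the two sidebars.
twenty_dollar_bottle = retail_price(7.95, [0.75, 0.20, 0.20])
eighty_dollar_bottle = retail_price(18.33, [1.50, 0.35, 0.30])

print(f"typical bottle: ${twenty_dollar_bottle:.2f}")
print(f"premium bottle: ${eighty_dollar_bottle:.2f}")
```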