Monday, April 25, 2022

Wine recommenders are no longer people

Back when I was young, getting a recommendation for a good wine involved walking into a shop full of bottles, and being pounced upon by an eager young man, who tried to help. If you were in a restaurant, instead, then a man in a suit suggested the sort of wine you could never afford. Alternatively, you asked a knowledgeable friend; or you went to an actual winery, to see if they had something you liked.

These days, you use an app or a web page. Social interactions have moved in this direction during my lifetime — no longer do we look people in the eye, but we stare at an electronic screen, instead. How does anyone get married, under these circumstances?

Well, there seem to be several quite different ways wine recommendations can be achieved via computing, and I will look briefly at them here. They are not all equally successful (or even easily understood).


First, however, we need to consider what we are using the recommendation for. Am I just looking for a wine to add to my collection, to drink at some future (appropriate) time? If so, then a suggestion from a (suitable) wine commentator is probably what I need. Am I looking for a wine to drink with dinner tonight? In that case it is quite different, because I (presumably) would like to end up with a specific wine, that either goes with the food or impresses the pants off my host. It seems to me that we lump these two extremes together, along with everything in between; and this confuses the issue.

Moreover, there are several possible criteria for choosing a wine. We could, for example, focus on quality, or we could focus on suitability for the occasion, or we could focus simply on whether we actually like it. These criteria do not necessarily need to coincide in terms of choice, under any given circumstances, although it seems desirable that they should do so as often as possible.

The matter of pairing wine with food seems to be a particular bug-bear. At one extreme, it is perfectly simple. For example, if I am going to eat a light white fish in butter sauce, then a hearty red wine would mean that I can’t taste the fish at all. On the other hand, my wife and I have a Sicilian recipe, originally for swordfish but we use it for tuna — it has a lot of strong ingredients, like tomatoes and capers and celery, and I would never be able to taste a light white wine against it. Pairing does not have to be more complicated than this.

On the other hand, sometimes getting the tastes to pair nicely is more tricky. We also have a duck recipe, with which it is suggested that we pair a Pinot Noir. I could not find one in the house, the first time we tried it, so I chose something else. It was a nice wine; but alternating a sip of this wine with a mouthful of duck did not enhance either product. Next time, I made sure I had a Pinot Noir — and it was perfect.


Given the above considerations, there are basically only four types of information that can be used to recommend a wine:
  • Quality — scores or descriptions
  • Detailed oenological characteristics
  • Similarity to your known wine preferences
  • Similarity to other people’s preferences

All four of these fail, in one way or another. That should not surprise you, because otherwise we would already have a universal wine recommender by now. So, let’s look at them, one by one.

Quality, variously defined, has traditionally been at the heart of wine recommendations, since time immemorial. It is at the basis of all definitions of fine-wine regions, it is the basis of all wine scoring systems, and it usually is the basis of most wine writing. In the past, Baby Boomers preferred scores (It isn’t polite to point), but these days Millennials apparently prefer words (The importance of online reviews). For a range of suitable recommendation sites, consult: The top wine apps to help pick your bottles. Even putting aside the idea that most wines are never reviewed, and that many reviews might be fakes, the basic issue is still that wine commentators all have different opinions, many of which reek of unadulterated snobbery. As a result, the wine industry has a pretty atrocious reputation — trying to reach “normal people” seems to be an uphill battle (How did Australian beer slobs become wine drinkers?).

So, wine recommendations based on quality miss the basic point — most people care more about value for money than about quality per se. There are definitely web sites based on this idea of value, and they have been successful. However, they all cover geographically and temporally restricted locations. For example, I choose newly available wines based on value for money suggestions from local writers here in Sweden — if two of the three agree, then I am likely to try a bottle. This only works here, and it only works for the few weeks that the wines are available. This is not a general recommendation scheme.

The second recommendation approach is to break the thing being recommended down into each of its minute components, and then try to match those components to the recommendation required. For example, The Music Genome Project claims to be the most comprehensive analysis of music ever undertaken, treating it like a study of the genes in your own body. Your genes, inherited from your parents, act together to create yourself; and we therefore are very interested in studying them, to see how they work. The idea is to do the same thing for music, based on all of its acoustic elements; and the same can also be done for wine (eg. Tastry AI).

The basic problem here is that neither music nor wine exists for us outside of our own minds. That is, our ears detect the sound waves of music, but they then convert them into electrical signals, which they transmit to the brain. Thus, it is our brain that interprets the electrical signals, and calls some of them music and some of them cacophony. The same applies to taste. Atomizing music or wine into the basic components that exist before we hear or taste them deals only with the first part of the experience (Why wine is tasted with your brain, not your palate). The role of perception in food and wine evaluation cannot be over-estimated.

The third approach to recommendation is simply to get you to decide what you like (or want) and then find other products that seem to be similar (eg. My WineGenius). That is, you rate your experiences in some way (words or numbers), and this is used to provide you with new suggestions (these days, based on the computer techniques of artificial intelligence and machine learning). This approach suffers from the same problem as above, that our perceptions occur in our minds; and there are lots of extraneous things that can affect what our minds perceive. Different circumstances lead to different perceptions, so that time, place and prior information all play a role in what we think we like, on any given occasion (eg. The taste of wine can be influenced by music; How Brahms might make your wine taste better). What we liked last time is not necessarily the same as next time.

The fourth approach is somewhat similar to the third one, but instead is based on matching you to other people. That is, you all rate your experiences (using words or numbers), and these ratings are then used to match you to other similar people (once again, via artificial intelligence and machine learning). The recommendations then come from things that those other people rated highly but which you have not yet rated. Perhaps the most (in)famous example of this is movie recommendations from Netflix, but Amazon recommendations work the same way. So, naturally, this approach can be applied to wine (The path to Netflix-quality wine recommendations leads through the doors of perception).
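For those who like to see the machinery, here is a minimal sketch of this fourth approach, in Python. It is not how Netflix, Amazon or any wine app actually does it; the names (ana, ben, cy), the wines and the ratings are all made up for illustration, and the similarity measure (cosine similarity over the wines two people have both rated) is just one simple choice among many.

```python
import math

# Hypothetical ratings (1-5) from three made-up drinkers.
ratings = {
    "ana": {"riesling": 5, "pinot_noir": 4, "shiraz": 1},
    "ben": {"riesling": 5, "pinot_noir": 5, "shiraz": 1, "gamay": 5, "malbec": 2},
    "cy": {"riesling": 1, "pinot_noir": 2, "shiraz": 5, "gamay": 1, "malbec": 5},
}


def similarity(a, b):
    """Cosine similarity of two people's ratings, over the wines both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(a[w] ** 2 for w in shared))
    norm_b = math.sqrt(sum(b[w] ** 2 for w in shared))
    return dot / (norm_a * norm_b)


def recommend(user, all_ratings, top_n=3):
    """Rank wines the user has not rated, weighting other people by similarity."""
    mine = all_ratings[user]
    totals, weights = {}, {}
    for other, theirs in all_ratings.items():
        if other == user:
            continue
        sim = similarity(mine, theirs)
        if sim <= 0:
            continue
        for wine, score in theirs.items():
            if wine in mine:
                continue  # only suggest wines the user has not yet rated
            totals[wine] = totals.get(wine, 0.0) + sim * score
            weights[wine] = weights.get(wine, 0.0) + sim
    predicted = {wine: totals[wine] / weights[wine] for wine in totals}
    return sorted(predicted.items(), key=lambda item: item[1], reverse=True)[:top_n]


print(recommend("ana", ratings))
```

In this toy data, ana's tastes resemble ben's far more than cy's, so ben's favorites carry more weight in what she is offered. Note that the sketch knows nothing about time, place or mood, which is exactly the weakness discussed above.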

This is sort of how marriages work (as mentioned above). You find someone whom you like, and then see how many things you have in common. The more you find, the better the relationship seems to work, long term. Now, obviously, this analogy breaks down pretty quickly, because married people have to change, in order to adapt to each other’s differences; and the more you adapt, the longer things seem to last. This does not apply to drinking wine, of course, since the wine completely fails to adapt to us. Nevertheless, people are prepared to have a go at this approach, for wine (eg. Clans: The intersection of AI/machine learning with behavioral science).


There are lots of different wine recommendation systems out there; and I have mentioned only a few specific ones here. The problem with all summaries of specific recommender systems is that they have a particular agenda — to sell you their particular recommender (and not any of the others). As such, they are very vague, being based on hyperbole rather than information.

So, should we remain optimistic about the idea of a general wine-recommendation system? Well, I still maintain that if we were likely to get one, then we would already have it by now. Sommeliers have been around for a very long time, and they may slowly be getting replaced by computerized systems. However, that does not make the recommendations any better (or worse). As for myself, I still prefer interacting with people rather than computers.

Monday, April 18, 2022

The wine industry needs to say: “Cheese” (seriously)

I have previously suggested that The wine industry is asking the wrong question, when asking how to market wine to Millennials, rather than asking what Millennials actually want to drink. This is, of course, but one perspective on the general issue that: Global wine consumption has been declining for a long time.

Naturally, there are other perspectives on the global decline of wine consumption and sales. For example, the wineRamp group has noted that “there is a demand problem that needs to be addressed”, and it thus has the explicit objective to: Grow demand for wine in the US marketplace. That is, they wish not only to slow the decline but to positively increase consumption. To this end, they have suggested that It’s time for an American Wine Marketing Board, to oversee the turn-around.


So, what does any of this have to do with the title of this blog post? Well, it is easy to criticize, but the criticism goes down a lot better if something positive is presented, as well. That is, of course, precisely what the wineRamp group is trying to do. The downside of this, unfortunately, is that the people who don’t like the proposed actions will just criticize the proposed alternatives, as well. So be it.

By way of saying something positive in this post, I am going to compare vineyards to cow pastures, of all things. These pastures have cows on them, and cows produce quite a lot of different products. That makes these pastures fundamentally different from vineyards in the wine industry, which mostly just produce grapes for making wine (wine grapes are rarely edible, for example).

The upshot is that cow farmers understand this basic principle: What you lose on the swings you gain on the roundabouts. Consider the following pair of graphs, taken from: U.S. dairy consumption trends in nine charts. The first one shows declining consumption of milk in the USA over the past 40 years, which looks to me almost exactly like the decline in wine consumption in the same location over the same time. The second graph shows the exact opposite trend, with increasing cheese consumption. Both milk and cheese come from the same cows — indeed, the European Union has formally defined cheese as a "matured dairy product", because bureaucrats need such definitions in order to pass laws. *



You can see my point about swings and roundabouts — the cow farmers are doing okay; and, indeed, a third graph (not shown here) shows a general upward trend in “all dairy products consumption, milk-fat milk-equivalent basis, per capita”. So, these farmers can weather the ups and downs of any given milk product. My new question, then, is: what are the wine industry’s roundabouts? What is the wine industry’s equivalent of cheese?

The wineRamp suggestion is basically support for marketing (Wine, the wallflower: Industry honchos plan joint marketing push). This fits neatly into my point in this post: Remember ‘Got Milk?’ As demand shrinks, fine wine needs a USDA Marketing Order. Does it work (cf. “Where’s the Beef?” ; “The Incredible, Edible Egg”)? This marketing approach does not require diversification (eg. fresh milk plus cheese), but simply money. There are, of course, many ways to sell, these days (How Wine Access fled old media tactics and chased consumers into the subscription economy).

There are many future marketing suggestions, after all (Want to fuel your winery’s future growth? Engage women and younger consumers). This may not work, of course, since advertising itself is not always enough (TV wine ads: Black Box targets younger consumers — but still doesn’t get it right). After all, consumer preferences change continually (What wine lovers really want), and not just for wine alone (Wine needs to fit in with changing consumer meal prep.). In short (Wine’s biggest unanswered, market-killing question: Why?):
The currently stalled WineRAMP effort to create a consumer market promotion similar to “Got Milk?” is fundamentally flawed, and would be doomed to failure even if funded, because it lays out no plans for market definition, messaging or other statistically significant, actionable data on the “why” questions.
There are other possible responses to declining domestic consumption, of course. For example, the focus could be on diverse export markets (How wine businesses can prosper in an era of uncertainty):
Business as usual is no longer. For the global wine sector, the uncertainty of the 2020s is in stark contrast to the relative stability of the preceding three decades ... businesses with a broader sweep of export markets may benefit from a portfolio effect which hedges their exposure if one of those markets suddenly deteriorates because of tariffs, economic crises or war.
Another important alternative is to diversify the product itself; and this is, to me, the basic issue of this blog post. What else can the wine industry produce except wine? To use yet another metaphor: if all your eggs are in one basket, then that basket must have no holes. The reality seems to be that the wine industry has an ever-increasing hole in its only basket.


I suspect that there is no industry-wide response to this issue. To quote Rob McMillan (Turning the tables on Rob McMillan — wine industry analyst): “It’s not an easy task finding agreement in this industry.” So, individual producers will need to think about diversification. If, as has been suggested, market share is being lost to the spirits category, then make spirits, as well as wine. If young people are preferentially drinking seltzers, then make seltzers, as well as wine. You get the idea.



* This created problems for Ricotta, of course. This has always been called a cheese, but it is not matured in any way. My wife and I once went to the Ricotta Festival, in Vizzini (Sagra della Ricotta e del Formaggio), in Sicily, and the Ricotta was literally taken straight from the cauldrons and put into our bowls.

Monday, April 11, 2022

The French grape-growers will need to get used to April Weather

We have all read this past week about the frosty weather, particularly in France, but also elsewhere in Europe. This frost followed a period of very warm (summerish) weather, during which the grape vines started to grow. The frost has therefore threatened the new buds and shoots:
The French fruit-growers are not (yet) used to this. Part of the issue here is that the same thing happened last year, as well, causing an estimated €2 billion in damage to the French wine industry. *


Well, let me tell you, here in Sweden this happens every year. Year after year after year. It is situation normal for us. We call it Aprilväder (April Weather) — for example, our weather forecasters cannot yet make a consistent prognosis for Easter (the end of this week). Wikipedia describes the situation like this:
April weather is a changeable weather, where it can be sunshine and summer temperatures one day, and snow, rain, hail and cold the next. This type of uncertain weather is common in the spring, in the northern hemisphere, especially in April, the month when temperatures rise fastest. The sun warms the ground, the heated air rises and forms cumulus clouds. Higher up, however, it is still cold, and the upper part of the cumulus clouds become icy and thus cumulonimbus, which give off rain, snow or hail/snow (which is softer than ordinary hail, but harder than snowflakes).
The current issue for Sweden is that April Weather now turns up in March, as well. This year, in my town, "official" spring was 2 weeks earlier than average (6th March 2022), and at the end of the month (29th) the winter weather returned. We have been dealing with it since then, with the last expected snowfall yesterday and the last freezing temperature later this week (ie. a fortnight of winter, after 3 weeks of spring). **

Mind you, the Wikipedia author does not suggest much in the way of biological consequences for Swedes:
April weather can lead to problems as people have a hard time knowing how to dress best — some people are wearing shorts and a t-shirt while others are still wearing thick winter clothes. In addition, it can cause traffic difficulties if there is snowfall and ice after motorists change to summer tires.
Well, let me tell you that I have the same problems in my garden as the grape-growers do in their vineyards. New buds and shoots on the early plants may be killed, and the growth / flowering / fruiting of these plants can be affected for the rest of the year. My roses and hedge, which are among the earliest parts of my garden to start growing each year, eventually recover; but that is only because I do not expect fruit from them. Fortunately, my two grapevines are protected, because I grow them on my house and garage, where frost cannot get at them.


It is now occurring to the French that this situation might no longer be unusual for them, either:
My own conclusion is also that the conditions down south are no longer unusual, or “freak”, weather. They are the new normal, under climate change. It is not freakish here in Sweden, so the French had better get used to it happening every year, as erratic weather moves further south each spring. This is a tough thing to have to say to agriculturalists; but global climate change is real (How can you doubt global warming?). Moreover, it is the wine industry that has provided some of the best evidence for global warming (Grape harvest dates and the evidence for global warming) as well as the consequent increase in weather variability (Grape harvest dates and year-to-year climate variability).

If these recent spring frosts are, as we expect, a result of global climate change, then we have every reason to ask: Why have we left it so late to deal with climate change in the wine industry? The fires in California (Threats to biodiversity when controlling wildfires), the drought in the western USA (Another drought year in California), and the frosts in Europe ... The list will keep getting longer.

For those of you who have not noticed, the United Nations has just released the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. It notes that governments worldwide have not been meeting their pledges on limits to global warming. That means that none of the anticipated consequences are yet being avoided (Climate report offers some hope, but the need for action is urgent). In short: Get used to it, from now on, because we (globally) are not doing anything like enough to stop it.


The Americans have a thing called (politically incorrectly) Indian Summer. This is a period of warm, dry weather in late autumn (in temperate regions of the northern hemisphere), after the first frost, or more specifically the first “killing” frost. This is thus the inverse of April Weather. From now on, vineyard growing seasons are going to be book-ended by these two weather patterns, at least for the foreseeable future.

There are several possible reactions here, of course. For example, vine pruning can be adjusted to try to delay bud-break, or frost-protection netting could be used (Should frost nets be used more widely in vineyards?) — the latter works quite well in Sweden. As has been suggested on other grounds (Resistant varieties: the next step toward sustainability), some serious thinking might also be done about which varieties or clones are grown where — late-budding clones are going to be in particular demand, I guess. Otherwise, candle sales are going to boom (as shown in the top photo).



Update:

Interestingly, the day after I posted this, the chairman of the French Independent Winegrowers’ Association (Jean-Marie Fabre) announced the same thing (Adverse weather the new normal, claims chair of French wine trade body):
The series of frost episodes shows that these are not isolated, one-off weather incidents over a decade. I am convinced that weather hazards will become increasingly frequent. Freak years will no longer be those where there is adverse weather, but rather those without adverse weather.


* The current news even made it into my local newspaper:
In the champagne region, the winegrowers woke up to minus nine degrees and in Burgundy the thermometer showed -5. And with the drop in temperature came fears of a harvest like last year's, when similar weather cost the country's winegrowers the equivalent of 20 billion kronor.
** Yes, this has created havoc with changing to summer tires — studded (winter) tires are officially supposed to be changed by April 15, but since most people in my town have (quite rightly) not changed to summer tires yet, there are soon going to be very long last-minute queues at the changing stations.

Monday, April 4, 2022

How bad are wine scores, really?

Wine-scoring systems are many and various (see Wine by the numbers); and there is often a lot of cynicism about wine scores, which are the end-result of applying a number to a personal wine-tasting experience. Indeed, some wine writers have specialized in deriding them (eg. Jeff Siegel, the Wine Curmudgeon), and satirizing those wine publications that regularly employ numbers as a means of communication.

Now, there is nothing wrong with numbers per se, as any tax accountant will tell you. After all, finance is based on numbers, although for wine consumers the only financial number is actually the price of the container and its contents.


So, as consumers, we can see things the following way. The wine supplier (whether winery or retailer) applies their own mathematical assessment of the wine quality by deciding on its price. The wine commentator (or critic) then applies their own mathematical assessment of the wine quality by deciding on a score. We get two numbers, not one; and we have to make something of this situation, before purchasing tonight’s dinner wine.

The main issue, then, for us, regarding the application of numbers to an assessment of wine quality is potential bias in the choice of any given number, either by the initial wine supplier or by the subsequent commentator tasting the wine.

For the suppliers, on the one hand, some biases are trivially obvious. For example, the wine will be sold for a number like $9.99 not $10, which apparently really does increase sales. Other biases are less obvious but equally to be expected. For example, since all mass-produced wines taste pretty much the same, I would need to take into account my competitors’ prices when setting my own price, otherwise I am potentially short-changing myself. Mathematically, this is called Regression to the Mean, where a set of variable numbers change through time to all end up being the same as the average.

For wine commentators, on the other hand, the most common accusation regarding bias is having a small range of scores, all of which are quite high. So, the common 100-point scale is no such thing. First of all, it starts at 50, not 0, and scores below 80 rarely exist in practice, making it a 20-point scale. Moreover, scores in the range 90–95 are depressingly common, as are scores of 95–98, these days. Apparently high-quality wines all taste the same, just like the mass-produced ones! This is also a form of Regression to the Mean — my scores as commentator need to match those of my competitors, if I am to look credible.

These issues have concerned me before, and I have written about them several times:

This time around, I am going to tackle the professionals. Doing this is tricky, because it requires a fair bit of closely related data in order to detect biases (maybe >500 scores?). However, the kind people at JamesSuckling.com have taken the risk of giving me complimentary access to their site, for which I do thank them. In return, I have exploited their kindness by taking a look at some of their ongoing series of monthly tasting reports for particular wine regions. *



In order to detect bias, we need to compare our data to some sort of mathematical "expectation". In this case, in an unbiased world, the point scores would show a relatively smooth frequency distribution, rather than having dips and spikes in the frequency at certain score values. Mathematically, the expected scores would come from an "expected frequency distribution", also known as a probability distribution (see Wikipedia). In my earliest post on the subject (Biases in wine quality scores), I used a Weibull distribution (see Wikipedia) as being a suitable probability distribution for wine-score data — this simply models the idea that the tasters assign to each wine the highest score that they believe it deserves, and that there is an upper limit to those scores.
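To make the idea of an "expectation" concrete, here is a minimal Python sketch of how expected counts can be derived from a Weibull distribution and compared to observed counts. It is not the fitting procedure used for the graphs in this post. I am assuming, for illustration only, that the distribution is applied to the distance below an upper limit of 100 points, and the shape and scale values (3.0 and 9.5) and the observed counts are all made up rather than fitted to or taken from any real dataset.

```python
import math


def weibull_cdf(x, shape, scale):
    """Cumulative probability of a Weibull distribution at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def expected_counts(scores, n_wines, shape, scale, upper=100):
    """Expected number of wines at each integer score.

    Assumption: the Weibull describes the distance below the upper limit,
    so a score s covers the interval (upper - s) +/- 0.5 on that axis.
    """
    expected = {}
    for s in scores:
        distance = upper - s
        low = max(distance - 0.5, 0.0)
        high = distance + 0.5
        prob = weibull_cdf(high, shape, scale) - weibull_cdf(low, shape, scale)
        expected[s] = n_wines * prob
    return expected


# Made-up observed counts per score, NOT taken from any tasting report.
observed = {88: 40, 89: 45, 90: 80, 91: 95, 92: 90,
            93: 70, 94: 40, 95: 25, 96: 10, 97: 5}
n_wines = sum(observed.values())

# Hypothetical parameter values, chosen by hand rather than estimated.
expected = expected_counts(observed, n_wines, shape=3.0, scale=9.5)

for score in sorted(observed):
    diff = observed[score] - expected[score]
    print(f"{score}: observed {observed[score]:3d}, "
          f"expected {expected[score]:6.1f}, difference {diff:+6.1f}")
```

A positive difference marks a score that is used more often than the smooth distribution suggests, and a negative one marks a score that is avoided. In a real analysis the shape and scale would be estimated from the data first.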

In my analysis here, I have included nine reported datasets, with at least 500 wine-quality scores each: Australia (908 wines), Austria (776 wines), Bordeaux (1,447 wines), California (507 wines), Italy (622 wines), New Zealand (760 wines), Sonoma (505 wines), South Africa (570 wines), and Spain (1,470 wines). The frequency-distribution graphs of the scores from each of these nine datasets are included at the bottom of this post.

So, what are we to make of the results? First, this is nowhere near as bad as many critics have suggested for the wine industry as a whole.

However, the most consistent pattern among the graphs is an over-abundance of scores in the center of the distribution; that is, the purple bars are higher than the red ones for scores of 90–93. This applies to all of the graphs except perhaps the one for Austria. I am tempted to interpret this as an example of Regression to the Mean.

However, note that this bias usually results in a lack of higher scores, rather than a lack of lower scores. That is, the upper scores are generally lower than would be expected for a Weibull distribution. We cannot, under these circumstances, accuse the wine-tasters of the bias of "score creep" towards the upper end of the scale.

An exception to this pattern occurs in the Sonoma dataset, where there is an apparent bias towards a score of 97. For Australia, there is a very distinct lack of scores of 95. Speaking as an Australian, I am tempted to see cultural bias here!

Finally, in five of the nine cases there is an over-abundance of 90 scores compared to 89. I have noted before that this is quite common, and perhaps to be expected (Awarding 90 quality points instead of 89). Interestingly, in two cases this bias is combined with a preference for the alternate even number (88), suggesting a distinct bias against 89.


A wine-quality score is a single number, often produced by a single person, although several different people may be involved in any given report, and different reports are usually produced by different people working for the same publication. In each case, all quality ratings are personal, although they are intended to give the impression of objectivity (see The enduring mythos of wine). However, we do (sort of) hope that the ratings at least reflect a consistent rank order; if not, then we have quite a serious level of bias, indeed.

However, that level is not what I have been studying here. Nor have I been studying the oft-cited idea that the score is an adjective modifying a written review of the wine (Where wine ratings and masking meet). In some ways this is an odd point-of-view, because numbers are not used in this way in any other context (tax accounting included).

No, instead I have been looking at the wine scores on their own, and assessing whether they can reasonably be interpreted as a sample from a population of numbers produced by some underlying process (such as actual wine quality). The scores studied here certainly deviate somewhat from that idea. There is apparently a lot of what mathematicians call Regression to the Mean, where scores are closer to the average than would be expected. This is not, in itself, a bad thing, although anyone using the scores needs to be aware that it is happening.

There are persistent accusations of inflationary creep in wine scores, so that the 100-point scale is now effectively a 10-point scale (Don't look up! Inflated scores are attacking the wine industry). There seems to be little evidence of that in the datasets studied here. Obviously, I cannot comment on behind-the-scenes shenanigans, which have also been reported (Are wine scores trustworthy?).

Personal preferences can clearly play a role (Why do we want objectivity in wine criticism?), although I hope that most professionals can deal with that (Confronting bias in wine judging). Perhaps the real problem, though, is that a wine-quality score is one number only, even though it represents a cumulation of different tasting characteristics (for alternatives, see: Are we ready for more complex wine scoring?).

Does any of this matter? Maybe not as much as some commentators would like. For example, when making a decision on which wine to buy, apparently only one-quarter of US wine consumers usually consider a rating (A snapshot of the American wine consumer in 2018). If this is true, then wine critics may well be missing the boat. Wine economists, on the other hand, are safe, because they focus on a different set of wine numbers, eschewing the scores pretty much entirely (There should be a statistical approach to the wine industry).



* You would be wise (not cynical) to wonder about my own bias, if I have been given free access to the very thing I am commenting upon. All I can reply is that, for my own part, I really do value my public reputation as a commentator, not only for this blog, but also for my previous one (The Genealogical World of Phylogenetic Networks), as well as for all of my professional scientific publications.



Frequency histograms for each of the nine datasets


The purple bars show the number of wines (vertically) reported for each wine-quality score (horizontally). The red bars are the expected number of wines, based on the Weibull distribution.

[Figure: Australia wine scores]

[Figure: Austria wine scores]

[Figure: Bordeaux wine scores]

[Figure: California wine scores]

[Figure: Italy wine scores]

[Figure: New Zealand wine scores]

[Figure: Sonoma wine scores]

[Figure: South Africa wine scores]

[Figure: Spain wine scores]