One does not have to spend more than a few seconds reading wine writing to realize that this is not normal human discourse. The writing style used for descriptions of wine seems to cover the full gamut of human expression with the sole exception of normalcy. That is, it can be hyperbolic, flowery, literary, allusive, obscure, flamboyant, and sometimes (sadly) even pretentious.
If you want a good laugh, plenty of people have ridiculed the style of wine descriptions, in everything from books to cartoons. For a well-written and educational introduction to the topic, you could try The Complete Guide to Wine Snobbery (by a New York children's-wear manufacturer coincidentally named Leonard Bernstein); or for a rough-around-the-edges look at the topic, try The Illustrated Winespeak (by cartoonist Ronald Searle).
Way back in 2007, Mike Steinberger wrote about Why wine writers talk that way. He started by quoting Fran Lebowitz (well-known for “her sardonic social commentary on American life”), to the effect that: “Great people talk about ideas, average people talk about things, and small people talk about wine.” Steinberger was not impressed with where this must leave people who talk about wine writing.
Well, I have long wanted to write a data-analysis blog post about wine words, and wondered how I might do it while simultaneously avoiding Ms. Lebowitz's sharp tongue. I stopped worrying about it when I realized that someone had already done it for me, and so I could hide behind him instead.
The basic idea is that commentators often suggest that the modern practice of giving wines a quality score may not be the best way to discuss wine, and that the word descriptions are better. This leads to the obvious question of whether the points and the words are actually related.
This has been looked at by Olivier Goutay (2018): Wine ratings prediction using machine learning. He studied the online wine reviews from Wine Enthusiast for the years 1999–2017, totaling c. 92,400 unique wine reviews. The data of interest, for each wine, concern the quality score reported plus the word description.
The first thing he notes about the relationship of points and descriptions is that the higher the points the longer the description. This is shown in the graph here, reproduced from the original.
The quality points are shown horizontally, and each score has a separate box-and-whisker plot above it, showing how long the descriptions are (i.e. the number of words). The boxed area shows the range for the middle 50% of the description lengths, with the horizontal center-line indicating the median (50% of the lengths are above the median and 50% below). The vertical line (whisker) on each side of the box indicates the range of most of the rest of the lengths. However, unusual (outlying) values are shown by individual symbols.
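For those who like to see how one of these boxes is put together, here is a minimal Python sketch. The lengths in it are made up for illustration (they are not from the Wine Enthusiast data), the function names are my own, and I am assuming the common convention that whiskers reach to the most extreme values within 1.5 times the box height; the original graph may use a different rule.

```python
from statistics import quantiles


def word_count(description):
    """Length of a description, measured in whitespace-separated words."""
    return len(description.split())


def box_summary(lengths):
    """Summarize a list of description lengths the way one box in the plot does."""
    # The box spans q1..q3 (the middle 50%); the center-line is the median.
    q1, med, q3 = quantiles(lengths, n=4, method="inclusive")
    # Assumption: whiskers follow the common 1.5 x IQR rule.
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [n for n in lengths if low_fence <= n <= high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Values beyond the fences are the ones drawn as individual symbols.
        "outliers": sorted(n for n in lengths if n < low_fence or n > high_fence),
    }


# Hypothetical description lengths (in words) for a single score.
lengths_at_one_score = [10, 12, 14, 16, 18, 60]
summary = box_summary(lengths_at_one_score)
```

In this toy example the box runs from 12.5 to 17.5 words, and the one 60-word description ends up as an outlier rather than stretching the whisker.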
The lengths of the descriptions increase continuously from 80 to 95 points, more than doubling from 15 words to 35. This is somewhat reassuring — higher points should indicate a better quality wine, which is then worth a few more words, showing appreciation. The lengths plateau after 95 points, which may also be reassuring — editors must realize that there should be a maximum length to wine descriptions! The sudden increase in length at 99 points (40 words) and 100 points (45 words) is not unexpected, I guess, as the writers wax lyrical about perfection.
Olivier Goutay's purpose in his data analysis was somewhat more ambitious than this: Is it possible, through the computer technique called machine learning, to predict a wine rating (in points) based on its word description? Technically, this is a type of text analysis, sometimes called sentiment analysis.
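To give a flavor of what predicting a rating from words can look like, here is a deliberately tiny sketch in Python. It is not Goutay's method; it is a plain bag-of-words naive Bayes classifier that I have swapped in because it fits in a few lines, and the training descriptions and the "high"/"low" rating bands are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Split a description into lower-case words (no punctuation handling)."""
    return text.lower().split()


def train(examples):
    """Count words per rating band from (band, description) pairs."""
    word_counts = defaultdict(Counter)
    band_counts = Counter()
    vocab = set()
    for band, text in examples:
        tokens = tokenize(text)
        band_counts[band] += 1
        word_counts[band].update(tokens)
        vocab.update(tokens)
    return word_counts, band_counts, vocab


def predict(model, text):
    """Return the rating band with the highest naive Bayes log-score."""
    word_counts, band_counts, vocab = model
    total_docs = sum(band_counts.values())
    best_band, best_score = None, -math.inf
    for band in band_counts:
        score = math.log(band_counts[band] / total_docs)
        total_words = sum(word_counts[band].values())
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                score += math.log(
                    (word_counts[band][token] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_band, best_score = band, score
    return best_band


# Hypothetical training data: rating band plus a made-up description.
examples = [
    ("high", "complex layered elegant long finish"),
    ("high", "profound concentrated elegant structure"),
    ("low", "simple thin short finish"),
    ("low", "simple light dull"),
]
model = train(examples)
guess = predict(model, "elegant complex wine")
```

With these four made-up reviews, the description "elegant complex wine" is assigned to the "high" band. A real analysis would need thousands of reviews and a proper held-out test set.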
He concludes that it is possible to produce a very good prediction, at about 97% precision. I will not bore you with the details, which you can read for yourself by checking out the original. However, it should be obvious from the above graph that simply counting the number of words in a Wine Enthusiast description will allow you to work out the number of points pretty accurately, at least up to a score of 95.
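The word-counting shortcut can be sketched in a few lines of Python. I am assuming here that the typical length rises in a straight line from 15 words at 80 points to 35 words at 95 points; the graph shows a steady increase, but the exact straight-line shape is my simplification, and the function name is hypothetical.

```python
def estimate_points(description):
    """Guess a score from description length alone (80 to 95 points only)."""
    words = len(description.split())
    # Outside the 15..35 word range the rule of thumb no longer applies,
    # so clamp to the ends of the range rather than extrapolate.
    clamped = min(max(words, 15), 35)
    # Assumed straight line: 15 words -> 80 points, 35 words -> 95 points.
    return 80 + (clamped - 15) * (95 - 80) / (35 - 15)


# A hypothetical 25-word description, halfway between 15 and 35 words.
halfway = estimate_points(" ".join(["word"] * 25))
```

Under this assumption each extra word is worth three-quarters of a point, so a 25-word description comes out at 87.5 points.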
As a last point, we might leave the final word to Jon Cohen, who, when describing wine descriptions (Jabberwiney, 2000), noted: “Why do these people write this way? Is this what happens when your job requires you to drink before noon?”