Monday, April 2, 2018

Artificial intelligence in the wine industry? Not yet, please!

The world is changing rapidly, and the wine industry needs to keep up. However, we should not trip over our own feet in a mad rush to do this. We need to think carefully about just which bits of the modern world will be beneficial, and in what capacities. To this end, I have already written about Big Data, and about the use of social media, and about the vagaries of community wine-quality scores, along with some cautionary tales.

At the risk of becoming a perpetual nay-sayer, I must now say something about Artificial Intelligence (AI). Computing is something I know about, and the potential problems with AI are just a bit too obvious for me to let them pass by unnoticed. Once again, I feel that the enthusiasts are being a bit too enthusiastic, and not quite critical enough for clear thinking. The wine industry deserves better than this.

AI is just what it says — artificial. Whether it is also intelligent I will let you decide for yourselves, below. Human intelligence is sometimes called into question, usually for a good reason, but we should always call artificial intelligence into question.

What is artificial intelligence?

Humans learn by example. Given suitable examples, we can learn to do some pretty impressive things. This is what we mean when we say that human beings are intelligent — we interact with the examples, using trial and error to work out how to do whatever it is that we are trying to do. Sadly, if we are presented with bad examples, we can also learn some pretty bad habits — that is the trade-off, which we have happily accepted.

On the other hand, when we have previously devised machines to aid us in our endeavors, we have designed them to function in very specific ways. The machine does not interact with the world to learn new functions, but instead we have to devise these new functions ourselves and then re-design the machine. Pens dispense ink but cannot learn to compose text; knives cut food but cannot learn to cook that food; and cars cannot learn to fly, even if we add wings to help them do so.

This situation is now changing with the advent of Artificial Intelligence. Computer programs based on AI are not told by humans what to do — they learn by example, not by instruction. That is, they are presented with a collection of examples, plus a programming system that allows them to devise their own behavior from whatever patterns they detect in those examples. This is an example of what is called Machine Learning. It is a probabilistic system — the AI system may not make the same decision each time it meets a new situation, but instead it will have a probability associated with each of several possible behaviors. This is unlike our previous machines, where each machine should repeatedly do the same thing under the same circumstances.

We have very little control over what it is that the AI systems learn — we can only control the actual examples, not what patterns the AI system finds in those examples. If a system learns bad habits, for example, all we can do is keep giving it more and more good examples, and hope that it eventually re-learns. Just like people, right? Indeed, just like any Complex System, the outcome can be unpredictable, as well as uncontrollable.

Let's first look at a few successful examples of AI usage; and then we will look at what sorts of things can go wrong.

Some examples of AI

Perhaps the best-known early application of Artificial Intelligence has been in the matter of designing computer programs to play competitive games, such as chess or poker. Here, the process is relatively straightforward, because the program input is a series of game situations plus their outcome under particular future plays, from which the AI program can deduce the probabilities of success when following any given strategy. The most recent, and most successful, chess example is the AlphaZero program. At the moment, the AI successes are restricted to 2-person games.

Other commonly used examples of AI include the digital "personal assistant" apps, such as Apple's Siri and Amazon's Alexa, along with the predictive film-choosing technology from Netflix and the music-choosing technology from Pandora. In a more modern but less-common vein, predictive self-driving features of Tesla cars are all based on AI. A bit of the history of AI and some other examples are included in The WIRED guide to artificial intelligence.

A not-so-good example (from the wine world)

A classical use of AI is in the Google Translate system, which allows us to translate online text between a wide range of languages. Here, I present a simple example taken from my own experience, in which some text from a Swedish wine site, describing three wines, is allegedly being translated into English.

Original text:
Translated text:

The titles alone tell you that something is wrong, because the translated title makes no sense — it should say "Less than SEK 70". Note that the word "kronor" has successfully been translated in the title — this is the Swedish currency, which would translate literally as "crowns", but SEK is the accepted financial abbreviation.

However, look at the way the other three occurrences of "kronor" have been translated! The text actually has four different translations of this one word, even though the format of the text is unvarying, and all four occurrences should be translated the same way — we have: "SEK", "billion", "$" and "crowns". The first and last are correct translations, but the other two are complete nonsense. Note, especially, the direct translation of Swedish currency to dollars without using an exchange rate — this is not unusual for Google Translate, which is also known to translate "meter" to "foot" without a conversion, for example.

The issue that I am highlighting here is that we cannot ask why the AI system has done this. There is nothing in the programming that tells the system to use any given translation. The system is simply given a large body of text (original text plus a translation), and the algorithm tries to find repeated patterns connecting them. From this deduced information, it makes its probabilistic decisions with each new piece of untranslated text. In this case, Google Translate has learned four different possible translations, and decides which one to use on each occasion.

The only way to correct this problem is to keep providing more and more text (original plus translation), until the system starts to get its decisions right (by finding the correct patterns). We cannot tell it what to do — it is "intelligent", and therefore must work it out for itself.
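The behavior described above can be sketched in a few lines of Python. This is a toy illustration only, assuming a system that does nothing more than count how often each translation appeared alongside "kronor" in its training examples. It is not how Google Translate is actually built, and the example counts and the function name learn_probabilities are invented for the purpose.

```python
from collections import Counter


def learn_probabilities(examples):
    """Turn raw counts of observed translations into probabilities."""
    counts = Counter(examples)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Hypothetical training data: translations seen alongside "kronor".
initial_examples = ["SEK", "crowns", "$", "billion"]
before = learn_probabilities(initial_examples)

# We cannot delete the bad habit; we can only add more good examples.
more_examples = initial_examples + ["SEK"] * 16
after = learn_probabilities(more_examples)

print(before)
print(after)
```

With four examples, the nonsense translation "$" is as likely as the correct "SEK". After sixteen more good examples its probability has shrunk to 5%, but nothing in the procedure ever forces it to zero.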

This solution will eventually work. For example, a couple of years ago the Google translation of Swedish text always ignored the Swedish word "inte". This was a problem because the word translates as "not", which creates the negative of the sentence (see Wikipedia). You can imagine how silly the translations were, when they said impossible things could happen! Fortunately, Google Translate has now corrected itself (through 2 years' worth of more examples), and "inte" is currently translated correctly.

Along the same lines, if you really would like to see some bizarre translations, try getting Google Translate to convert some Latin text into English (or any other non-Romance language of your choice).

The take-home message

The issue with Artificial Intelligence is this. The old-style approach to computing and machines involved specialization — each machine did one thing only, and did it well. The AI approach to computing and machines involves them being generalists — each of them can do a lot of things, but this risks that they do none of them well. So, in my example, traditional translation systems involve only one pair of languages at a time, and these are translated properly. Google Translate is a system that tries to do all pairwise languages, and at the moment it doesn't do any of them particularly well.

We need to make a choice — we can't have it both ways.

The wine world of AI

So, what are we getting ourselves into, if we bring AI into the wine world? What are people suggesting that we use it for?

Perhaps the most widely touted use of AI in the wine industry is the sort of predictive technology mentioned above for Netflix and Pandora — given certain basic pieces of information about the customer, a computerized assistant should be able to make sensible suggestions regarding wine purchases or food/wine pairing.

This idea is based on having a database of wine information, which is connected by expert knowledge to some sort of consumer "profile". In short, both the wines and the consumers are "profiled" in some way, and the two datasets are connected by an AI system.
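To make the idea of connecting two sets of profiles concrete, here is a minimal sketch in Python. It assumes that wines and consumers are both scored on the same four hypothetical traits (sweetness, acidity, tannin, body), and it ranks wines by cosine similarity to the consumer. This is a simple similarity match standing in for a learned system; the trait scores and the function names are invented, and none of the apps named in this post is claimed to work this way.

```python
import math

# Hypothetical wine profiles: scores from 0 to 1 for
# (sweetness, acidity, tannin, body).
wines = {
    "Riesling (off-dry)": (0.6, 0.8, 0.0, 0.3),
    "Cabernet Sauvignon": (0.1, 0.5, 0.9, 0.9),
    "Pinot Noir": (0.1, 0.6, 0.4, 0.5),
}


def cosine_similarity(a, b):
    """Similarity of two profiles: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(consumer_profile, wine_profiles):
    """Rank wine names from most to least similar to the consumer."""
    scored = [
        (cosine_similarity(consumer_profile, profile), name)
        for name, profile in wine_profiles.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical consumer who prefers dry, tannic, full-bodied wines.
consumer = (0.1, 0.5, 0.8, 0.8)
ranking = recommend(consumer, wines)
print(ranking)
```

The quality of the ranking depends entirely on how well the two sets of profiles are built, which is the same dependence on the underlying database that applies to the real systems.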

This general sort of idea is being (or has been) pushed by a number of companies, producing mobile apps or online sites, such as Next Glass, WineFriend, Hello Vino, Wine Ring, and WineStein. These AI systems usually ask the user a set of questions, and then suggest new wines based on the answers, and possibly also on previous wine consumption.* Wine Ring, for example, has even made it into reports on CNBC and Go-Wine.

This AI approach has also been pushed by some of the social networks, which started out as ways to record what you drink and whether you like it, but have recently morphed into general-purpose wine sites. So, sites such as Vivino now use AI to provide new wine recommendations according to the wines already rated or bought by the consumer. Even Wine-Searcher, which mainly connects consumers with wine prices from an array of retail shops, is testing a recommendation chatbot, called Casey.

This idea may be the least problematic use of AI in the wine industry. It can work well, depending on the quality and quantity of the database containing the wine-related information, and how well it is connected to the customer information. Novices, in particular, can benefit greatly from this use of AI, if it is implemented effectively — but don't be surprised by unpredictable or unexpected wine suggestions, since the AI system itself is dealing with probabilities only. Moreover, speaking as a biologist, the oft-used biological metaphor of the AI database functioning like a "genome" is utter nonsense (see the most popular blog post I have ever written: The Music Genome Project is no such thing).

However, the computational scientists are keen to push these ideas much further. The Google internet search engine is a pretty straightforward implementation of a database search strategy (with a lot of bells and whistles). However, Wolfram Alpha touts itself as a "computational knowledge engine", based on AI: instead of finding a web resource that might contain the answer to a given question (as Google does), it tries to compute the answer from the knowledge in its own databases. It can certainly do some pretty fancy things (see 32 tricks you can do with Wolfram Alpha, the most useful site in the history of the internet). However, if we compare a query for "climate zones" (see last week's post) in each technology, Google returns links to a series of web pages about climate and climate zones (prominently including Köppen's climate classification), whereas Wolfram Alpha returns nothing more than some data about the climate in the town of Zone, Italy. Artificial Intelligence is alright in its place, but we need to understand what that place is, if we are to use it effectively. Horses for courses, as the saying goes.

At the other extreme from simple predictive technology, it has been pointed out that one likely consequence of AI technology is the automation of many tasks currently employing millions of people (Google Chief Economist Hal Varian argues automation is essential). The only real question is whether this will occur sooner or later, not whether it will occur. The point is that, in the past, only repetitive jobs could be automated by machines, but with AI a much wider range of jobs can now be learned by newly designed machines. Self-driving cars are an obvious example, following on from the long-standing use of autopilots in aeroplanes. The issue here is that flying a plane is actually easier to automate than is driving a car!

In the wine industry, as far as autonomous vehicles are concerned, we already have the WineRobot, which wanders the vineyards gathering information about the state of the vines (such as vegetative development, water status, production, and grape composition), just like vineyard managers used to do. We also have the Wall-Ye V.I.N. robot, which carries out the labor-intensive vineyard tasks of pruning and de-suckering; and TED, a robot that neatly weeds between the vineyard rows (for those people who don't use sheep to keep their weeds under control). Other ideas about what is now called Robotic Farming are covered in a short video from the Australian Centre for Field Robotics, at the University of Sydney — a farm is a much safer place for autonomous vehicles than is a public road. [Aside: I learned both my biology and my computing at this university.]

In between these two extremes, the most obvious use of AI systems is likely to involve computerized forecasts, such as early-season vintage forecasts in a vineyard, or sales and price forecasts in a shop. [Note: a forecast is different from a prediction, as I will discuss sometime in a future post.] In these cases, the forecasts are expected to improve through time, as more and more data are gathered, and the AI system continually adjusts itself based on newly found patterns in the data. These forecasts are, thus, adaptive.
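One very simple way to make a forecast "adaptive" is exponential smoothing, where each new observation pulls the forecast part of the way toward itself. The Python sketch below is a deliberately simple stand-in and not an AI system; the yield figures, the function name update_forecast and the smoothing weight alpha are all assumptions made for illustration.

```python
def update_forecast(previous_forecast, observed, alpha=0.5):
    """Move the forecast part of the way toward the latest observation."""
    return previous_forecast + alpha * (observed - previous_forecast)


# Hypothetical yields (tonnes per hectare) for successive vintages.
observed_yields = [8.0, 10.0, 9.0, 12.0]

forecast = observed_yields[0]  # start from the first observation
history = []                   # forecasts made before each later vintage
for observed in observed_yields[1:]:
    history.append(forecast)
    forecast = update_forecast(forecast, observed)

print(history)   # what was forecast ahead of vintages 2, 3 and 4
print(forecast)  # the forecast for the next, unseen vintage
```

Note that the forecast always lags the data: the final vintage came in at 12.0, but the forecast made just before it was only 9.0.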

It is here that I am most skeptical about the benefits of Artificial Intelligence. My example above of the issues with Google Translate seems to be all too pertinent here. Forecasts are problematic no matter how they are implemented; and AI will not necessarily help. The issues with forecasts lie much deeper than mere "intelligence", with the fact that the future is often so disconnected from the present and the past. The old finance "40% Rule" seems all too apt — one can look like a good forecaster simply by following any proposition with a 40% probability (see How do pundits never get it wrong? Call a 40% chance).

For a selection of other, rather enthusiastic, discussions of AI and wine, see:

* Have you ever noticed that the only two groups who refer to their customers as "users" are the computer industry and the illicit drug industry? I think that this is very revealing.


  1. In computer science, there is this oft-cited saying: "Garbage In, Garbage Out." (GIGO.)


    David writes:

    "If a system learns bad habits, for example, all we can do is keep giving it more and more good examples, and hope that it eventually re-learns. Just like people, right?"

    I am reminded of this egregious example:

    "When It Comes to Gorillas, Google Photos Remains Blind"
    Wired magazine - January 11, 2018


    [Full text appears as the next comment.]

  2. [There is a limitation of 4,096 characters for a comment, so I will excerpt pertinent sections of the above-cited article.]

    "When It Comes to Gorillas, Google Photos Remains Blind"
    Wired magazine - January 11, 2018

    "In 2015, A black software developer embarrassed Google by tweeting that the company’s Photos service had labeled photos of him with a black friend as “gorillas.” Google declared itself “appalled and genuinely sorry.” An engineer who became the public face of the clean-up operation said the label gorilla would no longer be applied to groups of images, and that Google was “working on longer-term fixes.”

    "More than two years later, one of those fixes is erasing gorillas, and some other primates, from the service’s lexicon. The awkward workaround illustrates the difficulties Google and other tech companies face in advancing image-recognition technology, which the companies hope to use in self-driving cars, personal assistants, and other products.

    "WIRED tested Google Photos using a collection of 40,000 images well-stocked with animals. It performed impressively at finding many creatures, including pandas and poodles. But the service reported 'no results' for the search terms 'gorilla,' 'chimp,' 'chimpanzee,' and 'monkey.'

    "Google Photos, offered as a mobile app and website, provides 500 million users a place to manage and back up their personal snaps. It uses machine-learning technology to automatically group photos with similar content, say lakes or lattes. The same technology allows users to search their personal collections.

    "In WIRED’s tests, Google Photos did identify some primates. Searches for 'baboon,' 'gibbon,' 'marmoset,' and 'orangutan' functioned well. Capuchin and colobus monkeys could be found as long as a search used those terms without appending the M-word.

    "In another test, WIRED uploaded 20 photos of chimps and gorillas sourced from nonprofits Chimp Haven and the Dian Fossey Institute. Some of the apes could be found using the search terms 'forest,' 'jungle,' or 'zoo,' but the remainder proved difficult to surface.

    "The upshot: Inside Google Photos, a baboon is a baboon, but a monkey is not a monkey. Gorillas and chimpanzees are invisible."

    "In a third test attempting to assess Google Photos’ view of people, WIRED also uploaded a collection of more than 10,000 images used in facial-recognition research. The search term “African american” turned up only an image of grazing antelope. Typing 'black man,' 'black woman,' or 'black person,' caused Google’s system to return black-and-white images of people, correctly sorted by gender, but not filtered by race. The only search terms with results that appeared to select for people with darker skin tones were 'afro' and 'African,' although results were mixed.

    "A Google spokesperson confirmed that 'gorilla' was censored from searches and image tags after the 2015 incident, and that 'chimp,' 'chimpanzee,' and 'monkey' are also blocked today. 'Image labeling technology is still early and unfortunately it’s nowhere near perfect,' the spokesperson wrote in an email, highlighting a feature of Google Photos that allows users to report mistakes.

    "Google’s caution around images of gorillas illustrates a shortcoming of existing machine-learning technology. With enough data and computing power, software can be trained to categorize images or transcribe speech to a high level of accuracy. But it can’t easily go beyond the experience of that training. And even the very best algorithms lack the ability to use common sense, or abstract concepts, to refine their interpretation of the world as humans do."