Monday, February 26, 2018

Social bots and the problems they create for wine marketing

In the 21st century, anyone involved in advertising or selling products needs to be savvy with regard to social media; and this includes the wine industry. Contact with customers via social media (Facebook, Twitter, Instagram, Pinterest, blogs, etc) has not replaced traditional forms of contact (shops, tasting rooms, print reviews, etc), but it is definitely a major new form of interaction.

Take this 2010 quote from Patrick Goldstein:
Virtually every survey has shown that younger audiences have zero interest in critics. They take their cues for what movies to see from their peers, making decisions based on the buzz they've heard on Facebook, Twitter or some other form of social networking.

There have been many discussions of social media and its commercial use, both inside and outside the wine industry; and much of this discussion is overly enthusiastic and uncritical. We are simply told that the future is already here, with the availability of Big Data. If nothing else, we are told, we may be able to avoid the seemingly endless layers of "middle men" standing between the producer and the customer.

In an attempt to provide a somewhat more temperate discussion, I have already provided one blog post about the limitations of social media in the wine industry (The dangers of over-interpreting Big Data); and I have also noted that community wine-quality scores are no more impartial than are scores from individuals (Are there biases in community wine-quality scores?).

Here, as another sobering thought, I discuss a further issue that seems to me to be of importance, but which has not received much obvious attention, at least in the wine industry. This is the matter of what are known as social robots, or usually just "bots" for short. Like all human developments, bots can be exploited as well as used responsibly; and we need to understand the consequences of their possible misuses, if we are going to use social media effectively in the wine industry.


Bots have existed since the beginning of computing. They are simply computer programs, originally developed to take care of computer house-keeping when the volume (or speed) of activity becomes too great for humans to handle.

Not surprisingly, they have increased dramatically in number since the advent of the internet. For example, the most prevalent of the so-called "good bots" are the web crawlers and scanners — every web search engine (Google, Yahoo, Bing, DuckDuckGo, Yandex, etc) has a mass of bots crawling the web, gathering data for the database indexes that make speedy web searches possible.

Social robots, on the other hand, operate in the social media, and therefore potentially interact directly with human beings. The good news is that they can address some of the potentially overwhelming aspects of dealing with Big Data (ie. thousands of Facebook pages, tens of thousands of Instagram pictures, millions of Twitter tweets, etc). Let's start with a couple of obvious examples of potentially useful social bots, just to set the scene:
  • trading bots are involved in the automatic buying and selling of investment stocks, shares and cryptocurrencies — see Nathan Reiff (December 2017)
  • bots are also involved in the automatic buying of online entertainment tickets — see Donna Fuscaldo (March 2017).
Unfortunately, on the other side of the coin we have the so-called "bad bots", which can seriously disrupt human activities. For example, it has been suggested (along with considerable evidence) that the erratic price of cryptocurrencies in recent times has, at least partly, been manipulated by the activities of certain trading bots (eg. those named Spoofy and Picasso) — see Brian Yahn (January 2018).

Bots have, of course, also become prevalent in the world of blogs, Facebook, Instagram and Twitter, and their ilk; and here we potentially have widespread problems. In particular, these bots can wreak havoc with any attempts to make use of social media data for economic purposes. The Big Data ends up being massively misleading, because the web metrics being measured are inflated by the bots' activities, in unpredictable ways. I have discussed the important issue of such biased data before (Why do people get hung up about sample size?).

Problems with bots

According to the 2016 Imperva Incapsula Bot Traffic Report, c. 48% of web traffic is by humans, 23% is by good bots, and 29% is by bad bots; so we are not talking about a small problem. To look at the activity of some of the bad bots, let's take blogs first.


I first became aware of the scourge of bots with my professional blog, The Genealogical World of Phylogenetic Networks. I used to keep track of the number of visitors to that site, but this has now become a worthless activity, simply because of the number of bots that make visits to the blog's pages. When I see 2,000 visits from Ukraine in a couple of days, I know that I am not seeing visits from large numbers of English-speaking Ukrainian scientists! Instead, I am seeing referral spam, from so-called "spam bots". Even the blog you are currently reading is prone to getting 200 visits from Russia on some days.

These bots are trying to create referral traffic from other web sites, so that Google and similar search engines will record their visits, and thus increase the Page Rank of the referring site. From the point of view of the blogger, these spambot visits completely distort the blog's Analytics Referral Data, which is one measure of the success of the blog as part of the world's social media network.
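One partial remedy is to filter known spam referrers out of the raw data before computing any analytics. Here is a minimal sketch in Python; the blocklist domains and log entries are hypothetical examples, not real spam sites, and a real blocklist would need regular updating.

```python
# Minimal sketch: filtering known referral-spam domains out of a
# blog's referrer log before computing analytics.
# SPAM_DOMAINS and the sample log entries are hypothetical.
from urllib.parse import urlparse

SPAM_DOMAINS = {"best-seo-offer.example", "traffic-booster.example"}

def is_spam_referral(referrer_url):
    """Return True if the referrer's domain is on the spam blocklist."""
    domain = urlparse(referrer_url).netloc.lower()
    # Strip a leading "www." so www.foo.example matches foo.example
    return domain.removeprefix("www.") in SPAM_DOMAINS

referrals = [
    "https://www.google.com/search",
    "http://best-seo-offer.example/ping",
    "https://twitter.com/someuser",
]
genuine = [r for r in referrals if not is_spam_referral(r)]
```

Blocklist filtering like this only catches the spam domains you already know about, which is one reason the distortion is so hard to remove completely.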

In other words, these bots stuff up the Big Data. According to the above-mentioned Bot Traffic Report, bot traffic depends on the "size" of the visited site, in terms of the number of daily visits from humans. For my professional blog, the report estimates that visits are likely to be about 25% good bots and 45% bad bots. I would not disagree with those estimates — that is, only 30% of the visits are from actual human beings, who might be reading the blog posts. Sadly, it seems to be impossible to get rid of bot traffic from Blogger blogs.
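The arithmetic behind those estimates is worth making explicit: if 25% of visits are good bots and 45% are bad bots, then a raw visit count must be deflated to roughly 30% before it says anything about human readers. A toy calculation (the visit count here is made up):

```python
# Deflating a raw visit count by the estimated bot shares quoted above
# (25% good bots, 45% bad bots); raw_visits is a hypothetical figure.
good_bot_share = 0.25
bad_bot_share = 0.45
human_share = 1.0 - good_bot_share - bad_bot_share  # approximately 0.30

raw_visits = 2000  # hypothetical visit count for some period
estimated_human_visits = round(raw_visits * human_share)
print(estimated_human_visits)  # → 600
```

In other words, a visit counter reading 2,000 may represent only about 600 actual readers — and the true bot share for any given blog is itself only an estimate.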

This is a similar (but distinct) issue to that faced by marketers when bots generate "clicks" on online ads, thus artificially increasing traffic to the advertised site. One third of all digital advertising is suspected to be fraudulent, in this sense. For more information, see: A 'crisis' in online ads; and Google issuing refunds to advertisers over fake traffic.


Moving on to Facebook, there are thousands of bots; see Facebook Messenger has 11K bots: should you care? Most of these are so-called "chat bots", which are supposed to function as some sort of personal assistant for users, helping to gather information (eg. aggregate content from various online sources, such as news feeds) and/or conduct e-commerce transactions. They try to keep the users interacting within the Facebook environment, rather than having them leave to use another computer program (eg. in order to access content or conduct transactions).

This is all well and good, but what about the bad bots? These have become very obvious over the past half-dozen years; they try to emulate the behavior of humans, and possibly to alter the humans' behavior. They do this through the use of fake identities within the social media world, a phenomenon that is rapidly becoming big news.

Several studies of social bot infiltration of Facebook (eg. Krombholz et al., Fake identities in social media: a case study on the sustainability of the Facebook business model) have shown that more than 20% of legitimate users will accept "friendship" requests indiscriminately, and that more than 60% will automatically accept requests from accounts with at least one contact in common. This makes it very easy to use fake identities for any purpose whatsoever, including the false appearance of social media popularity and influence.

It has therefore been obvious for some years that people have been Buying followers on social media sites. As noted above, this completely alters the social media analytics, and reduces the usefulness of the Big Data. The extent of this problem was recently discussed in The New York Times (The follower factory), which noted:
The Times reviewed business and court records showing that Devumi [a well-known "follower factory"] has more than 200,000 customers, including reality television stars, professional athletes, comedians, TED speakers, pastors and models. In most cases, the records show, they purchased their own followers. In others, their employees, agents, public relations companies, family members or friends did the buying.
It matters not how many Facebook friends you have for your winery or wines — instead, we must ask: how many of them are real? Facebook "likes" may not be worth much, any more. Sophisticated bots can create personas that appear to be very credible followers, and they thus are very hard for both people and automated filtering algorithms to detect (see Varol et al., Online human-bot interactions: detection, estimation, and characterization).


Moving on to Twitter now, it has been observed that Twitter is an ideal environment for bots. Early social media bots were mainly designed for the automatic posting of content, and Twitter is the most effective place for that; see Twitter may have 45 million bots on its hands. Estimates put the number of bots on Twitter at 10-15% of all accounts.

So, in addition to the fake-identity problem outlined above, Twitter has an extra, very large problem — the rapid spread of misleading information (see Shao et al., The spread of fake news by social bots). As Ferrara et al. (The rise of social bots) have noted:
These bots mislead, exploit, and manipulate social media discourse with rumors, spam, malware, misinformation, slander, or even just noise. This may result in several levels of damage to society.
It is obvious that emotions are contagious in the social media; and Twitter bots seem to be particularly active in the early spread of viral claims, hoaxes, click-bait headlines, and fabricated reports. A recent article from the Media Insight Project discusses How millennials get news: inside the habits of America’s first digital generation, and it is now clear that the social media are of prime importance. So, the contagious spread of emotive false news is a really big issue.

As an aside, it is worth pointing out that this Twitter phenomenon is not actually new, it is simply magnified these days. In the old days, it was the internet newsgroups that were the primary online mechanism for spreading commentary. One classic example of their effect was the furore that arose over the 1994 release of the original flawed Intel Pentium microprocessor (see The Pentium Chip story: an internet learning experience). Intel did not anticipate the speed of the news spread, nor deal with it effectively.

However, our primary concern in this blog post is with the serious alteration of social media analytics that comes from the presence of Twitter bots. What worth is the following of your wine or winery on Twitter? How much of it comes from automated accounts? Once again, the use of Big Data becomes problematic when we cannot rely on its authenticity.


Instagram is apparently the favorite social media of many wine professionals. However, this post is already long enough, so I will skip any detailed discussion here. Instead, you can read: How bots are inflating Instagram egos. The issue is basically the same as the one I have been discussing — biased metrics arising from inflated "likes" created by bad bots.


As far as the adoption of social media is concerned, I feel that we are still being given the hard sell. To take an analogy, it is as if we are being told to buy "a quality used car" — but what sort of quality? Good quality or poor quality? High quality or low quality? Everything has some sort of quality!

We need to think critically about both the pros and the cons of the social media and its associated Big Data. Enthusiasm is all very well, in its place, but it cannot substitute for careful thought about how we use social media in the wine industry. I don't think that the social media gurus have come to terms with bots yet, in terms of analyzing Big Data. What use is Big Data that have been biased by the behavior of bots?

In particular, it seems that the most important practical role for social media is that it can help publicize the existence of companies and their products or services. This makes it an information channel; but this does not necessarily make it a sales channel. We need to keep these two ideas distinct. Bots are not necessarily a problem for the mere advertising of a product, because we do not need to measure web metrics, which they can distort. But selling is a different matter, because we need to assess how effective the reach of social media is, in terms of successful sales (see Social media's disappointing numbers game). Here, bots are potentially a serious problem.


A number of the ideas here, and some of the information, came from discussions with Bob Henry, who also directed me to some of the online literature.
