The short answer appears to be: not very often. This is surprising, given what is reported for other communities. This may indicate something unique about the wine community.
A few weeks ago, I discussed community wine-quality scores, such as those in the Cellar Tracker database (Cellar Tracker wine scores are not impartial). One of the subjects I commented on was the suggestion that the "wisdom of crowds" can mean that members of the crowd allow their judgement to be skewed by their peers. In the case of wine-quality scores, this would mean that scores from large groups of tasters may converge towards the middle ground, as the number of scores increases.
In the formal literature, this topic has been examined by, for example, Omer Gokcekus, Miles Hewstone & Huseyin Cakal (2014. In vino veritas? Social influence on ‘private’ wine evaluations at a wine social networking site. American Association of Wine Economists Working Paper No. 153). They looked at the trend in Cellar Tracker scores for wines through time, from when the first score is added for each wine. They wanted to see whether the variation in scores for a wine decreases as more scores are added for that wine, which would support the thesis about crowd behavior. They concluded that there is some evidence of this.
The important practical point here is that Cellar Tracker displays the average score for each wine when a user tries to add a new score of their own, and it is hard to ignore this information. So, it would be rather easy for a user to be aware of the difference between their own proposed score and the current "wisdom of the crowds". This would presumably have little or no effect when only a few scores have been added for each wine, but it might potentially have an effect as more scores are added, because the crowd opinion then becomes so much clearer.
It has occurred to me that some data that I used in another blog post (Are there biases in community wine-quality scores?) might also be used to examine the possibility that Cellar Tracker scores are biased in this way. In my case, I will look at individual wines, rather than pooling the data across all wines, as was done in the research study described above.
The data at hand are the publicly available scores from Cellar Tracker for eight wines (for my data, only 55-75% of the scores were available as community scores, with the rest not being shared by the users). These eight wines included red wines from several different regions, a sweet white, a still white, a sparkling wine, and a fortified wine. In each case I searched the database for a wine with at least 300 community scores, but I did not succeed for the still white wine (which had only 189 scores).
The results for the eight wines are shown in the graphs at the end of the post. Each point represents one quality score for the wine (some users enter multiple scores through time). For each wine, each score is shown (vertically) as the difference from the mean score for the wine — positive values indicate that the score was above the average, while negative values indicate that it was below. The time is shown (horizontally) as the number of days after the first tasting recorded for that wine.
The expectation is that, if the wine-quality scores do converge towards the middle ground, then the variability of the scores should decrease through time. That is, the points in the graphs will be more spread out vertically during the earliest times, compared to the later times.
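This check can be sketched in a few lines of code. The snippet below uses a small set of made-up (days, score) pairs for a single hypothetical wine, not the actual Cellar Tracker data; it computes each score's deviation from the wine's mean, splits the time axis into thirds, and compares the spread (standard deviation) of the deviations in each third — a decreasing spread from the early to the late third would be the convergence signal described above.

```python
from statistics import mean, stdev

# Hypothetical example data for one wine: (days_since_first_tasting, score).
# Real data would come from the Cellar Tracker records for that wine.
records = [
    (0, 88), (5, 95), (12, 82), (30, 91), (45, 89),
    (60, 90), (75, 93), (90, 88), (120, 89), (150, 90),
    (200, 91), (250, 89), (300, 90), (350, 90), (400, 89),
]

# Deviation of each score from the wine's mean score (the vertical axis).
avg = mean(score for _, score in records)
deviations = [(days, score - avg) for days, score in records]

# Split the time span into early / middle / late thirds.
span = max(days for days, _ in records)
thirds = [[], [], []]
for days, dev in deviations:
    idx = min(int(3 * days / span), 2)  # clamp the final day into the last third
    thirds[idx].append(dev)

# Compare the spread of scores in each period.
for label, devs in zip(("early", "middle", "late"), thirds):
    print(f"{label:6s} n={len(devs):2d} spread={stdev(devs):.2f}")
```

With real data, one would run this per wine and look at whether the "early" spread exceeds the "middle" and "late" spreads; in the graphs discussed below, only the first wine shows that pattern.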
The results seem to be quite consistent, with one exception. That exception is the first wine, where the scores are indeed more variable through the first third of the time period. In all of the other cases, the scores are most variable during the middle period, which is when most of the scores get added to the database, or sometimes also in the late period.
So, for these wines at least, I find little evidence that Cellar Tracker scores do converge towards the middle ground. This seems to disagree with the study of Gokcekus, Hewstone & Cakal (mentioned above), who concluded that community scores are normative (= "to conform with the positive expectations of another") rather than informational ("to accept information obtained from another as evidence about reality").
However, a study by Julian McAuley & Jure Leskovec (2013. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. Proceedings of the 22nd International Conference on the World Wide Web, pp. 897-908), found that user behavior in the Cellar Tracker database was quite different from the other four community databases that they studied (Beer Advocate; Rate Beer; Amazon Fine Foods; Amazon Movies).
So, maybe wine drinkers really are different from beer drinkers and moviegoers, when it comes to community assessment of their products? The wisdom of the wine crowd may be unique! In particular, you will note that wine drinkers are not afraid to give rather low scores for wines — the scores in the graphs go much further below the average than they do above it. Note that the dataset excludes wines that are considered to be flawed, which are usually not given scores at all (although very rarely they receive scores in the 50-60 range, which I excluded as representing faulty wines).
It seems to me that community wine scores are actually informational, rather than normative, expressing the opinion of the drinker rather than that of the crowd. This also fits in with the easily observed fact that the community scores are consistently lower than those of the professional wine critics (see my previous post Cellar Tracker wine scores are not impartial) — the wine community is not easily swayed by expert opinion. However, the tendency of all wine reviewers, professional, semi-professional, and amateur, to favor a score of 90 over a score of 89 certainly represents an unfortunate bias.