AEA RCT Registry

Fields Changed

Registration

Field	Before	After
Trial Title	To the mean or not? How people aggregate customer ratings – An experimental analysis	Accounting for Heuristics in Reputation Systems: An Interdisciplinary Approach on Aggregation Processes
Abstract	When customers face purchase decisions in online markets, customer reviews are the only independent source of information about product quality, since they are provided by neither producers nor vendors. Hence, customers tend to trust them and regularly base their purchase decisions on them. Customers usually do not read every review but rely on aggregated measures provided by the online seller, such as rating distributions or a scalar summary, usually the arithmetic mean of all customer ratings. Considering anecdotal evidence, we doubt that the arithmetic mean is a proper aggregation mechanism. Hence, we conduct a lab experiment to investigate whether customers aggregate ratings by using the arithmetic mean or other decision heuristics. Subjects receive customer rating distributions of three products and are asked to rank the products according to their preferences. Participants see only the distributions and know neither the products’ names nor their detailed specifications. Overall, subjects make twelve ranking decisions. Since we partially use real aggregated customer ratings associated with real products from Amazon Marketplace, we are able to incentivize the decisions. Subjects receive a USB flash drive they ranked first or second as payment and, in addition, have the chance to win another product they chose from one of the other product categories. We employ two treatments to investigate whether the ability to calculate the arithmetic mean is decisive: In the Baseline Treatment, subjects receive only the aggregated customer ratings without any additional information. In the Information Treatment, subjects additionally see the relative frequency of each of the 5-star categories and the value of the arithmetic mean associated with the distribution. Besides standard analysis, we use novel maximum likelihood approaches from machine learning. In particular, we fit a Plackett-Luce model to investigate which strategies are employed.	Aggregation metrics in reputation systems are important for overcoming information overload. When using these metrics, technical aggregation functions such as the arithmetic mean are implemented to measure the valence of product ratings. However, it is unclear whether the implemented aggregation functions match the inherent aggregation patterns of customers. In our experiment, we elicit customers’ aggregation heuristics and contrast these with reference functions. Our findings indicate that, overall, the arithmetic mean performs best in comparison with other aggregation functions. However, our analysis at the individual level reveals heterogeneous aggregation patterns. Major clusters exhibit a binary bias (i.e., an over-weighting of moderate ratings and under-weighting of extreme ratings) in combination with the arithmetic mean. Minor clusters focus on 1-star ratings or negative (i.e., 1-star and 2-star) ratings. Inherent aggregation patterns are affected neither by variation in the provided information nor by individual characteristics such as experience, risk attitudes, or demographics.
Trial End Date	June 30, 2021	December 31, 2021
JEL Code(s)	P46 L86 C50 C91	D12 D81 C91
Last Published	June 16, 2020 04:44 AM	June 17, 2021 05:31 AM
Additional Keyword(s)	Aggregation, Customer ratings, Individual decision-making,	customer reviews, aggregation, heuristics, binary bias, arithmetic mean,
Keyword(s)	Other	Other
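
The abstracts above refer to two computations: the arithmetic mean of a 5-star rating distribution and a Plackett-Luce model of ranking decisions. The following is a minimal illustrative sketch of both, not the registered analysis code. The function names, the example rating counts, and the choice of exp(mean rating) as a product's worth are all hypothetical assumptions made for illustration.

```python
from math import exp


def rating_mean(counts):
    """Arithmetic mean of a star distribution.

    counts[i] is the number of (i + 1)-star ratings, so a 5-star
    scale is a list of five non-negative counts.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty rating distribution")
    return sum(stars * n for stars, n in enumerate(counts, start=1)) / total


def plackett_luce_prob(ranking, worth):
    """Probability of a full ranking (best first) under Plackett-Luce.

    At each position, an item is chosen from the items not yet ranked
    with probability proportional to its positive worth.
    """
    remaining = list(ranking)
    prob = 1.0
    for item in ranking:
        prob *= worth[item] / sum(worth[j] for j in remaining)
        remaining.remove(item)
    return prob


# Hypothetical rating distributions for three products (1-star .. 5-star counts).
distributions = {
    "A": [1, 0, 1, 3, 5],
    "B": [0, 2, 3, 3, 2],
    "C": [4, 0, 0, 0, 6],
}

means = {name: rating_mean(counts) for name, counts in distributions.items()}

# Assumption: worth grows with the mean rating; exp() keeps worths positive.
worth = {name: exp(m) for name, m in means.items()}

# Probability that a subject following this "mean" strategy ranks A > B > C.
print(means)
print(plackett_luce_prob(["A", "B", "C"], worth))
```

In a maximum likelihood fit, the worths (or the parameters of the function that maps a rating distribution to a worth) would be chosen to maximize the product of such probabilities over all observed rankings. Competing heuristics can then be compared by how well their implied worths explain the data.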