Field
Abstract
|
Before
When facing purchase decisions in online markets, customers' only independent source of information about product quality is customer reviews, since these are provided neither by producers nor by vendors. Hence, customers tend to trust them and regularly base their purchase decisions on them. Customers usually do not read every review but rely on aggregated measures such as rating distributions or a scalar summary, usually the arithmetic mean of all customer ratings, provided by the online seller.
Considering anecdotal evidence, we doubt that the arithmetic mean is a proper aggregation mechanism. Hence, we conduct a lab experiment to investigate whether customers aggregate ratings using the arithmetic mean or other decision heuristics. Subjects receive the customer rating distributions of three products and are asked to rank the products according to their preferences. Participants see only the distributions; they know neither the products’ names nor their detailed specifications. Overall, subjects make twelve ranking decisions.
Since we partially use real aggregated customer ratings associated with real products from Amazon Marketplace, we are able to incentivize the decisions. Subjects receive a USB flash drive that they ranked first or second as payment and, in addition, have the chance to win another product they chose from one of the other product categories.
We employ two treatments to investigate whether the ability to calculate the arithmetic mean is decisive. In the Baseline Treatment, subjects receive only the aggregated customer ratings without any additional information. In the Information Treatment, subjects see, in addition to the ratings, the relative frequency of each of the five star-rating categories and the value of the arithmetic mean associated with the distribution.
Besides standard analysis, we use novel maximum likelihood approaches from machine learning. In particular, we fit a Plackett-Luce model to investigate which strategies are employed.
|
After
Aggregation metrics in reputation systems are important for overcoming information overload. These metrics implement technical aggregation functions, such as the arithmetic mean, to measure the valence of product ratings. However, it is unclear whether the implemented aggregation functions match the inherent aggregation patterns of customers. In our experiment, we elicit customers’ aggregation heuristics and contrast these with reference functions. Our findings indicate that, overall, the arithmetic mean performs best in comparison with other aggregation functions. However, our analysis on an individual level reveals heterogeneous aggregation patterns. Major clusters exhibit a binary bias (i.e., an over-weighting of moderate ratings and under-weighting of extreme ratings) in combination with the arithmetic mean. Minor clusters focus on 1-star ratings or negative (i.e., 1-star and 2-star) ratings. Inherent aggregation patterns are affected neither by variation in the information provided nor by individual characteristics such as experience, risk attitudes, or demographics.
|
Field
Additional Keyword(s)
|
Before
Aggregation, Customer ratings, Individual decision-making
|
After
customer reviews, aggregation, heuristics, binary bias, arithmetic mean
|