To the mean or not? How people aggregate customer ratings – An experimental analysis
Last registered on June 16, 2020

Pre-Trial

Trial Information
General Information
Title
To the mean or not? How people aggregate customer ratings – An experimental analysis
RCT ID
AEARCTR-0003656
Initial registration date
December 09, 2018
Last updated
June 16, 2020 4:44 AM EDT
Location(s)

This section is unavailable to the public.
Primary Investigator
Affiliation
Paderborn University
Other Primary Investigator(s)
PI Affiliation
Paderborn University
Additional Trial Information
Status
Ongoing
Start date
2016-12-19
End date
2021-06-30
Secondary IDs
Abstract
In online markets, customer reviews are the only independent source of information about product quality, since they are provided by neither producers nor vendors. Hence, customers tend to trust them and regularly base their purchase decisions on them. Customers usually do not read every review but rely on aggregated measures provided by the online seller, such as rating distributions or a single scalar value, usually the arithmetic mean of all customer ratings.
Based on anecdotal evidence, we doubt that the arithmetic mean is a proper aggregation mechanism. Hence, we conduct a lab experiment to investigate whether customers aggregate ratings by using the arithmetic mean or other decision heuristics. Subjects receive customer rating distributions of three products and are asked to rank the products according to their preferences. Participants see only the distributions; they know neither the products' names nor their detailed specifications. Overall, subjects make twelve ranking decisions.
Since some of the distributions are real aggregated customer ratings of real products from Amazon Marketplace, we are able to incentivize the decisions. As payment, subjects receive a USB flash drive that they ranked first or second and, in addition, have the chance to win another product they chose from one of the other product categories.
We employ two treatments to investigate whether the ability to calculate the arithmetic mean is decisive. In the Baseline Treatment, subjects receive only the aggregated customer ratings without any additional information. In the Information Treatment, subjects see, in addition to the ratings, the relative frequency of each of the five star categories and the arithmetic mean of the distribution.
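The additional information shown in the Information Treatment can be illustrated with a minimal sketch. The rating counts below are hypothetical and the function name is an assumption for illustration only; the registration does not describe how the displayed values were computed or rounded.

```python
from fractions import Fraction

def summarize_ratings(counts):
    """Return relative frequencies and the arithmetic mean of a star-rating
    distribution. counts[i] is the number of ratings with i + 1 stars."""
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one rating")
    # Exact fractions avoid floating-point noise in the frequencies.
    rel_freq = [Fraction(c, total) for c in counts]
    mean = sum(stars * f for stars, f in enumerate(rel_freq, start=1))
    return rel_freq, mean

# Hypothetical distribution: 5 one-star, 5 two-star, ..., 50 five-star ratings.
counts = [5, 5, 10, 30, 50]
rel_freq, mean = summarize_ratings(counts)
print([f"{float(f):.0%}" for f in rel_freq])
print(f"{float(mean):.2f}")
```

Two distributions with the same mean can still differ strongly in shape, which is why the two summaries are treated as separate pieces of information here.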
In addition to standard analyses, we use novel maximum likelihood approaches from machine learning. In particular, we fit a Plackett-Luce model to investigate which strategies subjects employ.
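As a rough illustration of the Plackett-Luce model mentioned above, the sketch below computes the log-probability of a single ranking of three distributions. It assumes, purely for illustration, that each distribution has a utility equal to a hypothetical sensitivity parameter `beta` times its mean rating; the registration does not specify the parameterization or the estimation procedure actually used.

```python
import math
from itertools import permutations

def pl_log_prob(ranking, utility):
    """Log-probability of `ranking` (item indices, best first) under a
    Plackett-Luce model in which item i has worth exp(utility[i])."""
    remaining = list(ranking)
    log_p = 0.0
    for item in ranking[:-1]:  # the last position is determined by the others
        log_denom = math.log(sum(math.exp(utility[j]) for j in remaining))
        log_p += utility[item] - log_denom
        remaining.remove(item)
    return log_p

def log_likelihood(rankings, utilities_per_decision):
    """Sum of log-probabilities over independent ranking decisions."""
    return sum(pl_log_prob(r, u) for r, u in zip(rankings, utilities_per_decision))

# Hypothetical example: utility = beta * mean rating (an assumption).
beta = 2.0
means = [4.15, 3.90, 4.30]
utility = [beta * m for m in means]
observed = (2, 0, 1)  # distribution 2 ranked first, then 0, then 1
print(pl_log_prob(observed, utility))
```

Maximizing `log_likelihood` over the parameters of competing scoring rules (for example, the mean versus the share of five-star ratings) is one way to compare candidate strategies under this model.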
External Link(s)
Registration Citation
Citation
Djawadi, Behnud and Dirk van Straaten. 2020. "To the mean or not? How people aggregate customer ratings – An experimental analysis." AEA RCT Registry. June 16. https://doi.org/10.1257/rct.3656-1.2000000000000002.
Former Citation
Djawadi, Behnud, Dirk van Straaten and Dirk van Straaten. 2020. "To the mean or not? How people aggregate customer ratings – An experimental analysis." AEA RCT Registry. June 16. http://www.socialscienceregistry.org/trials/3656/history/70539.
Sponsors & Partners

There are documents in this trial unavailable to the public.
Experimental Details
Interventions
Intervention(s)
Intervention Start Date
2018-12-11
Intervention End Date
2018-12-12
Primary Outcomes
Primary Outcomes (end points)
Ranking of customer rating distributions
Primary Outcomes (explanation)
Secondary Outcomes
Secondary Outcomes (end points)
Secondary Outcomes (explanation)
Experimental Design
Experimental Design
We conduct a randomized controlled trial with a control group (baseline). In both groups, we run twelve periods, each containing three customer rating distributions. For incentivization, we use real customer rating distributions in three periods, in which participants have the chance to win the associated products. These real distributions have the unintended characteristic that, within each decision set, at least one distribution stochastically dominates at least one other distribution. In the remaining periods, subjects rank artificial customer rating distributions among which there is no (first-order) stochastic dominance.
Since the absence of stochastic dominance is crucial for our machine learning approach and its underlying assumptions, we plan to exclude decisions that contain stochastically dominated customer rating distributions from this part of the analysis.
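The exclusion rule can be sketched as a check for first-order stochastic dominance within a decision set. The sketch assumes the standard definition (one distribution's cumulative distribution lies weakly below the other's at every star level and strictly below at one or more levels); the function names and example distributions are hypothetical.

```python
from itertools import accumulate

def dominates(a, b, tol=1e-12):
    """True if distribution `a` first-order stochastically dominates `b`.
    Both are relative frequencies over 1 to 5 stars."""
    cdf_a, cdf_b = list(accumulate(a)), list(accumulate(b))
    weakly_below = all(x <= y + tol for x, y in zip(cdf_a, cdf_b))
    strictly_below = any(x < y - tol for x, y in zip(cdf_a, cdf_b))
    return weakly_below and strictly_below

def has_dominance(decision_set):
    """True if any distribution in the set dominates another one."""
    return any(
        dominates(a, b)
        for i, a in enumerate(decision_set)
        for j, b in enumerate(decision_set)
        if i != j
    )

# Hypothetical distributions (relative frequencies for 1 to 5 stars).
a = [0.0, 0.0, 0.1, 0.3, 0.6]
b = [0.1, 0.1, 0.2, 0.3, 0.3]   # dominated by a
c = [0.3, 0.0, 0.0, 0.0, 0.7]   # polarized
d = [0.0, 0.5, 0.0, 0.0, 0.5]   # crosses c, so neither dominates
print(has_dominance([a, b, c]), has_dominance([c, d]))
```

Decisions for which `has_dominance` returns True would be the ones dropped before fitting the ranking model.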
Experimental Design Details
Not available
Randomization Method
The computer randomizes the assignment of decision sets to periods. For example, decision set A is ranked by participant x in period k and by participant y in period f. In addition, the order in which the distributions are displayed (i.e., left, middle, right) is randomized by the computer.
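A minimal sketch of such a randomization, assuming that each participant receives an independent shuffle of the twelve decision sets across periods and an independent shuffle of the display positions within each set. The identifiers and the seeding scheme are hypothetical; the registration does not name the software or procedure used.

```python
import random

POSITIONS = ("left", "middle", "right")

def randomize_participant(decision_sets, rng):
    """Return one participant's plan: which decision set appears in which
    period, and where each distribution is displayed on screen."""
    set_ids = list(decision_sets)
    rng.shuffle(set_ids)                      # period order of the sets
    plan = []
    for period, set_id in enumerate(set_ids, start=1):
        shown = list(decision_sets[set_id])   # copy, so the input is untouched
        rng.shuffle(shown)                    # left / middle / right order
        plan.append({
            "period": period,
            "set": set_id,
            "display": dict(zip(POSITIONS, shown)),
        })
    return plan

# Hypothetical labels for twelve decision sets with three distributions each.
decision_sets = {
    f"set_{k}": [f"set_{k}_dist_{d}" for d in "abc"] for k in range(1, 13)
}
plan_x = randomize_participant(decision_sets, random.Random(1))
plan_y = randomize_participant(decision_sets, random.Random(2))
print(plan_x[0])
```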
Randomization Unit
individual
Was the treatment clustered?
No
Experiment Characteristics
Sample size: planned number of clusters
2 treatments: baseline (only customer rating distributions displayed) and information treatment (additionally, relative frequencies and the arithmetic mean displayed)
Sample size: planned number of observations
112 participants
Sample size (or number of clusters) by treatment arms
56 participants per treatment
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB
INSTITUTIONAL REVIEW BOARDS (IRBs)
IRB Name
IRB Approval Date
IRB Approval Number