The Limits of Rating Systems in Healthcare Credence Goods Markets

Last registered on May 23, 2022

Pre-Trial

Trial Information

General Information

Title

The Limits of Rating Systems in Healthcare Credence Goods Markets

RCT ID

AEARCTR-0008572

Initial registration date

November 15, 2021

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published

November 18, 2021, 11:58 AM EST

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Last updated

May 23, 2022, 7:39 AM EDT

Last updated is the most recent time when changes to the trial's registration were published.

Locations

Country

Austria

Region

Primary Investigator

Name

Thomas Rittmannsberger

Affiliation

Technical University of Munich, School of Management

Other Primary Investigator(s)

PI Name

Daniela Glätzle-Rützler

PI Affiliation

University of Innsbruck

PI Name

Silvia Angerer

PI Affiliation

UMIT Tirol

PI Name

Christian Waibel

PI Affiliation

ETH Zurich

PI Name

Wanda Mimra

PI Affiliation

ESCP Business School

Additional Trial Information

Status

In development

Start date

2022-05-24

End date

2022-12-31

Keywords

Behavior, Health, Lab

Additional Keywords

Feedback, Rating Systems, Credence Goods

JEL code(s)

C91

Secondary IDs

Prior work

This trial does not extend or rely on any prior RCTs.

Abstract

A key characteristic of healthcare markets is the information asymmetry between patients and physicians: physicians know more about the disease and the appropriate treatment than patients do. This may result in different forms of physician misbehavior: providing more treatment than necessary (overtreatment), providing less treatment than necessary (undertreatment), or charging for more treatment than was provided (overcharging). Patients have to trust physicians to provide appropriate treatment. This is why health services are often referred to as credence goods (Darby and Karni 1973, Dulleck and Kerschbamer 2006).

Feedback on rating platforms, and the reputation building associated with it, has gained increasing attention over the past two decades in the context of physician-patient interactions. In Germany, for instance, about 70% of physician-rating website users are influenced by the ratings in their choice of physician (Emmert and Meszmer 2018). However, patients often base their ratings on characteristics unrelated to the quality of care (Emmert et al. 2020), thus introducing noise into the quality ratings. We capture these recent developments and use a laboratory experiment to investigate how effectively public rating systems improve the quality of care.

Based on the credence goods framework established by Dulleck and Kerschbamer (2006) and Dulleck et al. (2011), we introduce a toy model that enables us to derive hypotheses and test them in a laboratory experiment. We plan to run at least four conditions of market interactions, each with 48 undergraduate students in the role of either physicians or patients. In the baseline condition, no reputation building is possible between physicians and patients. In the rating conditions, we introduce the possibility to rate physicians on a scale from zero to five stars. The rating is based on the patients' payoff information resulting from the interaction between physician and patient. In the two or more random-rating conditions, we add noise to the average rating that is publicly visible to all market participants: for each rating provided by a patient, additional random ratings between zero and five stars are added.

Our design allows us to investigate the effect of a public rating mechanism on outcomes in healthcare credence goods markets. Furthermore, it enables us to explore the robustness of public rating mechanisms to noise by introducing additional random ratings.

References
Darby, M. R. and E. Karni (1973). "Free Competition and the Optimal Amount of Fraud." Journal of Law & Economics 16(1): 67-88.
Dulleck, U. and R. Kerschbamer (2006). "On Doctors, Mechanics, and Computer Specialists: The Economics of Credence Goods." Journal of Economic Literature 44(1): 5-42. DOI: https://doi.org/10.1257/002205106776162717.
Dulleck, U., R. Kerschbamer and M. Sutter (2011). "The Economics of Credence Goods: An Experiment on the Role of Liability, Verifiability, Reputation, and Competition." American Economic Review 101(2): 526-555. DOI: https://doi.org/10.1257/aer.101.2.526.
Emmert, M., S. Becker, N. Meszmer and U. Sander (2020). "Spiegeln Facebook-Bewertungen Die Versorgungsqualität Und Patientenzufriedenheit Von Krankenhäusern Wider? Eine Querschnittstudie Am Beispiel Der Geburtshilfe in Deutschland." Gesundheitswesen 82(06): 541-547. DOI: https://doi.org/10.1055/a-0774-7874.
Emmert, M. and N. Meszmer (2018). "Eine Dekade Arztbewertungsportale in Deutschland: Eine Zwischenbilanz Zum Aktuellen Entwicklungsstand." Gesundheitswesen 80(10): 851-858. DOI: https://doi.org/10.1055/s-0043-114002.

External Link(s)

Registration Citation

Citation

Angerer, Silvia et al. 2022. "The Limits of Rating Systems in Healthcare Credence Goods Markets." AEA RCT Registry. May 23. https://doi.org/10.1257/rct.8572-2.0

Sponsors & Partners

There is information in this trial unavailable to the public.

Experimental Details

Interventions

Intervention(s)

We experimentally investigate the effect and limits of a public rating system in healthcare credence goods markets. To this end, we plan to run a laboratory experiment framed in a healthcare context, in which experts are called physicians and consumers are called patients, using a student sample from the University of Innsbruck.

We plan to run at least four experimental conditions. In the baseline condition, there is no feedback mechanism in place. Next, we introduce a public rating mechanism into the market, in which patients can rate their interactions with physicians on a five-star rating scale. Provided that the feedback mechanism improves market outcomes, we plan to run at least two follow-up conditions in which we introduce noise into the feedback mechanism. We plan to implement noise as a situation in which physicians receive random ratings (from zero to five stars) on top of each patient rating. The conditions with noise vary in the number of additional ratings. We will start with one random rating for each patient rating and, depending on its effect on market outcomes, will increase (decrease) the amount of noise (i.e. the number of random ratings) in the following condition(s).
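
The noise mechanism described above can be illustrated with a minimal Python sketch. It is not the z-Tree implementation used in the experiment. It assumes that random ratings are whole stars drawn uniformly from 0 to 5 and that the publicly displayed value is the simple mean of all patient and random ratings; the registration specifies neither the distribution nor the aggregation rule. The function name `displayed_average` and its parameters are hypothetical.

```python
import random
from statistics import mean


def displayed_average(patient_ratings, noise_per_rating, rng):
    """Mean of the patient ratings plus `noise_per_rating` random ratings
    for each patient rating.

    Assumption: random ratings are uniform integers in 0..5 and the
    displayed value is a simple mean. Returns None if nothing is rated yet.
    """
    pool = list(patient_ratings)
    for _ in patient_ratings:
        for _ in range(noise_per_rating):
            pool.append(rng.randint(0, 5))  # one additional random rating
    return mean(pool) if pool else None


rng = random.Random(1)            # fixed seed so a run can be repeated
ratings = [5, 4, 5, 5]            # hypothetical patient ratings for one physician
print(displayed_average(ratings, 0, rng))  # rating condition without noise
print(displayed_average(ratings, 1, rng))  # one random rating per patient rating
print(displayed_average(ratings, 3, rng))  # a noisier follow-up condition
```

With `noise_per_rating` set to 0 the function reduces to the plain average of patient ratings; larger values pull the displayed average toward the mean of the random draws.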

Our design allows us to investigate the robustness of public rating mechanisms to noise by introducing additional random ratings.

Intervention Start Date

2022-05-24

Intervention End Date

2022-07-15

Primary Outcomes

Primary Outcomes (end points)

Overtreatment rates

Primary Outcomes (explanation)

Overtreatment occurs when the patient needs the mild treatment (q_l) but receives the major treatment (q_h). The overtreatment rate is the number of actual overtreatment decisions divided by the number of interactions with patients who have a mild health problem.
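
As a minimal illustration of this definition, the following Python sketch computes the overtreatment rate from a list of interactions. The data layout (pairs of needed and provided treatment, labeled "mild" and "major") and the function name `overtreatment_rate` are assumptions made for the example, not part of the registered design.

```python
def overtreatment_rate(interactions):
    """Share of mild-problem interactions in which the major treatment was given.

    `interactions` is an iterable of (needed, provided) pairs, each entry
    being "mild" or "major" (hypothetical labels). Returns None when no
    patient had a mild problem, because the rate is undefined in that case.
    """
    mild_cases = [(needed, provided) for needed, provided in interactions
                  if needed == "mild"]
    if not mild_cases:
        return None
    overtreated = sum(1 for _, provided in mild_cases if provided == "major")
    return overtreated / len(mild_cases)


# Hypothetical data: three patients with a mild problem, two of them overtreated.
sample = [("mild", "major"), ("mild", "mild"), ("major", "major"), ("mild", "major")]
print(overtreatment_rate(sample))
```

Interactions with patients who need the major treatment are excluded from the denominator, matching the definition above.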

Secondary Outcomes

Secondary Outcomes (end points)

Market efficiency, measured as the sum of patient, physician, and insurance payoffs.

Secondary Outcomes (explanation)

Experimental Design

We plan to use a student sample from the University of Innsbruck and run each experimental condition with 48 subjects (as suggested by our power analysis). Accordingly, we plan to run two sessions with 24 subjects each in every experimental condition. All sessions are computerized using z-Tree, and students are recruited via hroot. When they register, participants do not know which experiment they will take part in. They only receive information about the expected duration of the experiment (1 h 45 min).

Our experiment is structured as follows for all our conditions:
Stage 1: The experimenter explains the experiment and participants read the instructions.
Stage 2: Participants answer several control questions to ensure they understood the game.
Stage 3: The computer randomly assigns roles and markets to participants.
Stage 4: Participants play the game for 16 periods.
Stage 5: Participants take part in additional games: an individual risk preference task, a dictator game, a lying task, and a trust game.
Stage 6: Participants fill out a questionnaire.
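
The random assignment in Stage 3 could be sketched in Python as follows. This is only an illustration under stated assumptions: a market size of 8 is assumed to correspond to the planned cluster size, and the even split of four physicians and four patients per market is hypothetical, since the registration does not state the role composition. The function `assign_roles_and_markets` and its parameters are invented for the example.

```python
import random


def assign_roles_and_markets(participant_ids, market_size=8,
                             physicians_per_market=4, seed=None):
    """Randomly place participants into markets and give each one a role.

    Assumptions: markets of `market_size` subjects, and a hypothetical
    split of `physicians_per_market` physicians per market with the
    remaining subjects acting as patients.
    """
    ids = list(participant_ids)
    if len(ids) % market_size != 0:
        raise ValueError("number of participants must be a multiple of market_size")
    rng = random.Random(seed)
    rng.shuffle(ids)  # random order determines both market and role
    assignment = {}
    for start in range(0, len(ids), market_size):
        market = start // market_size + 1
        for position, pid in enumerate(ids[start:start + market_size]):
            role = "physician" if position < physicians_per_market else "patient"
            assignment[pid] = (market, role)
    return assignment


# One session with 24 subjects yields three markets of eight.
session = assign_roles_and_markets(range(1, 25), seed=42)
for pid in sorted(session)[:5]:
    print(pid, session[pid])
```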

Experimental Design Details

Randomization Method

Randomization is carried out in the experiment by a computer.

Randomization Unit

at the session level

Was the treatment clustered?

Yes

Experiment Characteristics

Sample size: planned number of clusters

6 clusters of 8 individuals each per experimental condition.

Sample size: planned number of observations

48 (6 x 8) individuals per experimental condition.

Sample size (or number of clusters) by treatment arms

at least 192 (4 x 48) individuals (students at the University of Innsbruck).
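
The planned sample sizes can be cross-checked with a few lines of Python. All numbers are taken from this registration; only the variable names are introduced for the example.

```python
SUBJECTS_PER_SESSION = 24
SESSIONS_PER_CONDITION = 2
CLUSTER_SIZE = 8
CLUSTERS_PER_CONDITION = 6
MIN_CONDITIONS = 4  # "at least four" experimental conditions

# Subjects per condition, computed from sessions and from clusters.
per_condition_by_sessions = SUBJECTS_PER_SESSION * SESSIONS_PER_CONDITION
per_condition_by_clusters = CLUSTER_SIZE * CLUSTERS_PER_CONDITION
assert per_condition_by_sessions == per_condition_by_clusters == 48

# Lower bound on the total sample across all conditions.
minimum_total = MIN_CONDITIONS * per_condition_by_clusters

print(per_condition_by_sessions)  # 48
print(minimum_total)              # 192
```

Both ways of counting give 48 subjects per condition, so the minimum total of 192 follows from running four conditions.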

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

Based on previous findings, we performed a power calculation, which indicates that we need six clusters of 8 subjects each per experimental condition to achieve a power of 80%.

Supporting Documents and Materials

IRB