Predicting compliance: Leveraging chat data for supervised classification in experimental research

Last registered on January 26, 2024

Pre-Trial

Trial Information

General Information

Title
Predicting compliance: Leveraging chat data for supervised classification in experimental research
RCT ID
AEARCTR-0005049
Initial registration date
November 20, 2019

First published
November 20, 2019, 2:54 PM EST

Last updated
January 26, 2024, 8:24 AM EST

Locations

Region

Primary Investigator

Affiliation
ETH Zurich

Other Primary Investigator(s)

PI Affiliation
Freie Universität Berlin
PI Affiliation
Freie Universität Berlin

Additional Trial Information

Status
Completed
Start date
2019-11-26
End date
2020-05-26
Secondary IDs
Prior work
This trial is based on or builds upon one or more prior RCTs.
Abstract
Behavioral and experimental economics have conventionally employed text data to facilitate the interpretation of decision-making processes. This paper introduces a novel methodology, leveraging text data for predictive analytics rather than mere explanation. We detail a supervised classification framework that interprets patterns in chat text to estimate the likelihood of associated numerical outcomes. Despite the unique advantages of experimental data in correlating textual and numerical information for predictive modeling, challenges such as limited sample sizes and potential data skewness persist. To address these, we propose a comprehensive methodological framework aimed at optimizing predictive modeling configurations, particularly in small experimental behavioral research datasets. We also present behavioral experimental data from a preregistered tax evasion game (n=324), demonstrating that chat behavior is not influenced by experimenter demand effects. This establishes chat text as an unbiased variable, enhancing its validity for prediction. Our findings further indicate that beliefs about others’ dishonesty, lying attitudes, and risk preferences significantly impact compliance decisions.
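The supervised classification framework described in the abstract, which maps chat text to the likelihood of a numerical outcome, can be sketched as follows. This is a minimal illustration assuming scikit-learn with TF-IDF features, class weighting against skewed outcomes, and stratified cross-validation for small samples; the paper's actual pipeline, features, and data may differ, and the chat snippets below are invented.

```python
# Hypothetical sketch of a supervised chat-text classifier; assumes
# scikit-learn. Pipeline choices mirror the abstract's concerns:
# small samples (cross-validation) and skewed outcomes (class weights).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Toy chat snippets with compliance labels (1 = truthful report).
chats = [
    "let's just report the real hours",
    "nobody checks, add a few extra hours",
    "we should be honest about the hours",
    "state more hours, the fine is unlikely",
    "report exactly what we worked",
    "round the hours up, they rarely audit",
]
labels = [1, 0, 1, 0, 1, 0]

# TF-IDF unigrams/bigrams feed a logistic regression whose
# class_weight="balanced" compensates for skewed label distributions.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)

# Stratified folds keep the class proportions stable in each split,
# which matters when the dataset is small.
scores = cross_val_score(clf, chats, labels, cv=StratifiedKFold(n_splits=3))
print(scores.mean())
```

On a real dataset, the cross-validated score would be the quantity to optimize when comparing modeling configurations.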
External Link(s)

Registration Citation

Citation
Hausladen, Carina Ines, Martin Fochmann and Peter Mohr. 2024. "Predicting compliance: Leveraging chat data for supervised classification in experimental research." AEA RCT Registry. January 26. https://doi.org/10.1257/rct.5049-1.5
Sponsors & Partners

There is information in this trial unavailable to the public.

Experimental Details

Interventions

Intervention(s)
To assess the out-of-sample performance of the pre-trained classifier, out-of-context chat data are collected.
In this new online experiment, participants work for a fictitious company in pairs.
In an online chat, they discuss how many surplus hours they want to report.
A group is controlled if the two reports differ and/or if it is among the 30 percent of groups randomly selected for control in each session.
If an inspection shows that an individual's report was not truthful, he or she has to pay a fine.
Intervention Start Date
2020-05-19
Intervention End Date
2020-05-26

Primary Outcomes

Primary Outcomes (end points)
The number of surplus hours reported by each participant, and the group chat between the two members of each group.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Participants work for a fictitious company in groups of two.
Both group members are informed about the surplus hours they worked.
They then have the opportunity to chat about the number of surplus hours they want to report.
The reports are controlled if the group members' reports differ and/or if the group is among the 30 percent of groups randomly chosen for control.
If a group is controlled and the number of reported surplus hours does not match the surplus hours actually worked, each group member has to pay a fine.
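The control rule above, audit when reports differ and/or when the group falls into the 30 percent random draw, can be sketched as follows. Function and variable names are illustrative, not taken from the experiment's software.

```python
# Hypothetical sketch of the audit rule described in the design;
# names are illustrative only.
import random

def is_audited(report_a, report_b, randomly_controlled):
    """A group is controlled if its two reports differ and/or it was
    drawn into the random 30% audit sample for the session."""
    return report_a != report_b or randomly_controlled

def draw_audit_sample(group_ids, share=0.30, seed=None):
    """Randomly select `share` of the session's groups for control."""
    rng = random.Random(seed)
    k = round(share * len(group_ids))
    return set(rng.sample(group_ids, k))

# Example session with 10 groups: 3 are drawn at random.
groups = list(range(10))
audited = draw_audit_sample(groups, seed=1)
print(len(audited))  # 3

# Matching reports outside the random draw escape control:
print(is_audited(8, 8, randomly_controlled=False))  # False
# Mismatched reports trigger a control regardless of the draw:
print(is_audited(8, 10, randomly_controlled=False))  # True
```

The fine would then be applied to each member of an audited group whose stated hours differ from the hours actually worked.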
Experimental Design Details
Randomization Method
The experimental setting does not involve a treatment. To minimize waiting times in the online experiment, participants are grouped by the time at which they log in.
Randomization Unit
In each session, 30 percent of the groups are randomly chosen to be controlled.
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters
100 groups
Sample size: planned number of observations
200 participants
Sample size (or number of clusters) by treatment arms
This experiment does not involve treatments.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
German Association for Experimental Economic Research e.V.
IRB Approval Date
2019-11-19
IRB Approval Number
EUFf7PP5

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
Yes
Intervention Completion Date
May 26, 2020, 12:00 +00:00
Data Collection Complete
Yes
Data Collection Completion Date
May 26, 2020, 12:00 +00:00
Final Sample Size: Number of Clusters (Unit of Randomization)
162 groups
Was attrition correlated with treatment status?
No
Final Sample Size: Total Number of Observations
324 participants
Final Sample Size (or Number of Clusters) by Treatment Arms
162 groups, 324 participants
Data Publication

Data Publication

Is public data available?
Yes

There is information in this trial unavailable to the public.

Program Files

Program Files
Yes
Reports, Papers & Other Materials

Relevant Paper(s)

Abstract
Behavioral and experimental economics have conventionally employed text data to facilitate the interpretation of decision-making processes. This paper introduces a novel methodology, leveraging text data for predictive analytics rather than mere explanation. We detail a supervised classification framework that interprets patterns in chat text to estimate the likelihood of associated numerical outcomes. Despite the unique advantages of experimental data in correlating textual and numerical information for predictive modeling, challenges such as limited sample sizes and potential data skewness persist. To address these, we propose a comprehensive methodological framework aimed at optimizing predictive modeling configurations, particularly in small experimental behavioral research datasets. We also present behavioral experimental data from a preregistered tax evasion game (n=324), demonstrating that chat behavior is not influenced by experimenter demand effects. This establishes chat text as an unbiased variable, enhancing its validity for prediction. Our findings further indicate that beliefs about others' dishonesty, lying attitudes, and risk preferences significantly impact compliance decisions.
Citation
Hausladen, Carina I., Martin Fochmann, and Peter Mohr. "Predicting compliance: Leveraging chat data for supervised classification in experimental research." Journal of Behavioral and Experimental Economics (2024): 102164.

Reports & Other Materials