
Fields Changed

Registration

Field Before After
Trial Title
Before: Predicting (dis-)honesty: Leveraging text classification for behavioral experimental research
After: Predicting compliance: Leveraging chat data for supervised classification in experimental research

Abstract
Before: Many laboratory experiments in behavioral economics require participants to chat with each other. The chat is often incentivized so that it relates directly to a more easily measurable variable, e.g., the amount contributed to a public good or the reported outcome of a die roll. Where this relationship exists, the resulting data constitute gold-standard labeled data. Consequently, training a supervised machine-learning classifier that learns the relationship between text and (numerical) output is a promising approach. This paper describes how we trained a classifier, based on chat texts obtained from a tax evasion experiment, to predict whether a group reported (taxable) income honestly or not. Before this classifier can be leveraged for future studies, its generalisability needs to be assessed. We therefore designed an experiment that alters the initial honesty framework along three major dimensions. First, the context is no longer a tax evasion setting; instead, participants are asked to report surplus hours. Second, the direction of the lie is reversed: it is optimal to overreport in the surplus-hours setting, whereas it was optimal to underreport in the tax evasion setting. Third, the group size is reduced from three to two. If the classifier achieves satisfactory performance metrics on out-of-sample predictions in a slightly different context, the technology can be leveraged in future experimental research.
After: Behavioral and experimental economics have conventionally employed text data to facilitate the interpretation of decision-making processes. This paper introduces a novel methodology, leveraging text data for predictive analytics rather than mere explanation. We detail a supervised classification framework that interprets patterns in chat text to estimate the likelihood of associated numerical outcomes. Despite the unique advantages of experimental data in correlating textual and numerical information for predictive modeling, challenges such as limited sample sizes and potential data skewness persist. To address these, we propose a comprehensive methodological framework aimed at optimizing predictive modeling configurations, particularly in small experimental behavioral research datasets. We also present behavioral experimental data from a preregistered tax evasion game (n=324), demonstrating that chat behavior is not influenced by experimenter demand effects. This establishes chat text as an unbiased variable, enhancing its validity for prediction. Our findings further indicate that beliefs about others' dishonesty, lying attitudes, and risk preferences significantly impact compliance decisions.
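The registration's core idea — mapping incentivized chat text to honesty labels with a supervised classifier — can be illustrated with a minimal sketch. The toy chat lines, labels, and the bag-of-words Naive Bayes model below are invented for illustration only; they are not the authors' pipeline or data (the paper works with real experimental chat logs and a more elaborate classification framework).

```python
from collections import Counter
import math

def train(docs, labels):
    """Count word frequencies per class and class priors."""
    counts = {c: Counter() for c in set(labels)}
    priors = Counter(labels)
    for text, label in zip(docs, labels):
        counts[label].update(text.lower().split())
    return counts, priors, len(docs)

def predict(text, counts, priors, n_docs):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    vocab = {w for c in counts.values() for w in c}
    scores = {}
    for label, words in counts.items():
        total = sum(words.values())
        score = math.log(priors[label] / n_docs)
        for w in text.lower().split():
            score += math.log((words[w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy chat lines standing in for labeled group chats.
docs = ["let us report everything truthfully",
        "just report the full income",
        "we should report less to pay less tax",
        "report a lower number nobody checks"]
labels = ["honest", "honest", "dishonest", "dishonest"]

model = train(docs, labels)
print(predict("we report everything", *model))  # → honest
```

The point of the sketch is the supervision signal: because each chat is tied to an observed reporting decision, labels come for free, and any text classifier can be trained on the text/label pairs.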
JEL Code(s)
Before: C92, D90, H26, K42
After: C55, C92, D83

Last Published
Before: June 24, 2020 04:01 AM
After: January 26, 2024 08:24 AM

Final Sample Size: Number of Clusters (Unit of Randomization)
Before: 175 groups
After: 162 groups

Final Sample Size: Total Number of Observations
Before: 350 participants
After: 324 participants

Final Sample Size (or Number of Clusters) by Treatment Arms
Before: 175 groups, 350 participants
After: 162 groups, 324 participants

Public Data URL
After: https://github.com/carinahausladen/PredictingCompliance

Restricted Data Contact
Before: [email protected]
After: [email protected]

Program Files
Before: No
After: Yes

Program Files URL
After: https://github.com/carinahausladen/PredictingCompliance

Is data available for public use?
Before: No
After: Yes

Additional Keyword(s)
Before: machine learning, natural language processing, compliance, behavioral taxation
After: Chat data, Supervised classification, Experimental research, Tax evasion, Compliance

Keyword(s)
Before: Crime Violence And Conflict, Firms And Productivity, Other
After: Crime Violence And Conflict, Firms And Productivity, Other

Building on Existing Work
After: Yes

Papers

Field Before After
Paper Abstract
After: Behavioral and experimental economics have conventionally employed text data to facilitate the interpretation of decision-making processes. This paper introduces a novel methodology, leveraging text data for predictive analytics rather than mere explanation. We detail a supervised classification framework that interprets patterns in chat text to estimate the likelihood of associated numerical outcomes. Despite the unique advantages of experimental data in correlating textual and numerical information for predictive modeling, challenges such as limited sample sizes and potential data skewness persist. To address these, we propose a comprehensive methodological framework aimed at optimizing predictive modeling configurations, particularly in small experimental behavioral research datasets. We also present behavioral experimental data from a preregistered tax evasion game (n=324), demonstrating that chat behavior is not influenced by experimenter demand effects. This establishes chat text as an unbiased variable, enhancing its validity for prediction. Our findings further indicate that beliefs about others' dishonesty, lying attitudes, and risk preferences significantly impact compliance decisions.

Paper Citation
After: Hausladen, Carina I., Martin Fochmann, and Peter Mohr. "Predicting compliance: Leveraging chat data for supervised classification in experimental research." Journal of Behavioral and Experimental Economics (2024): 102164.

Paper URL
After: https://www.sciencedirect.com/science/article/pii/S2214804324000041