Humans, Artificial Intelligence and Text-based Misinformation

Last registered on August 09, 2024

Pre-Trial

Trial Information

General Information

Title
Humans, Artificial Intelligence and Text-based Misinformation
RCT ID
AEARCTR-0012535
Initial registration date
November 18, 2023

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published
December 01, 2023, 4:52 AM EST

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Last updated
August 09, 2024, 3:22 AM EDT

Last updated is the most recent time when changes to the trial's registration were published.

Locations

Region

Primary Investigator

Affiliation
The University of Utah

Other Primary Investigator(s)

PI Affiliation
University of Utah
PI Affiliation
University of Utah
PI Affiliation
Allen Institute for Artificial Intelligence

Additional Trial Information

Status
In development
Start date
2023-11-25
End date
2024-12-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Text-based misinformation is pervasive, yet evidence is scarce regarding people’s ability to differentiate truth from deceptive content in textual form. We conduct a laboratory experiment utilizing data from a TV game show, where natural conversations surrounding an underlying objective truth between individuals with conflicting objectives lead to intentional deception. Initially, we elicit participants’ guesses about the underlying truth by exposing them to transcribed conversations from random episodes. Borrowing tools from computing, we demonstrate that certain AI algorithms exhibit truth detection performance comparable to humans, despite the algorithms relying solely on language cues while humans have access to language and audio-visual cues. Our model identifies accurate language cues not always detected by humans, suggesting the potential for collaborative efforts between humans and algorithms to enhance truth detection abilities. Our research takes an interdisciplinary approach and aims to ascertain whether human-AI teams can outperform individual humans in spotting the truth amid misinformation appearing in textual form. Subsequently, we pursue several lines of inquiry: Do individuals seek the assistance of an artificial intelligence (AI) tool to aid their discernment of truth from text-based misinformation? Are individuals willing to pay for the service provided by the AI? We also investigate factors that may influence individuals’ reluctance to seek, or excessive dependence on, AI assistance, such as “AI aversion” or its absence, as well as overconfidence in one’s ability to identify the truth. Furthermore, while controlling for the predictive accuracies of both the majority of humans and the AI tool, we examine whether individuals are more or less inclined to submit as their own the guess that a majority of other individuals had submitted for that episode, compared with the AI tool’s guess.
Lastly, we examine potential gender differences concerning these questions.
External Link(s)

Registration Citation

Citation
Bhattacharya, Haimanti et al. 2024. "Humans, Artificial Intelligence and Text-based Misinformation." AEA RCT Registry. August 09. https://doi.org/10.1257/rct.12535-2.0
Sponsors & Partners

There is information in this trial unavailable to the public.
Experimental Details

Interventions

Intervention(s)
Despite the proliferation of text-based misinformation on social media, there is a gap in our understanding of individuals' ability to detect deception in textual form. We use a transcribed version of a novel TV game show in which the conversations mimic a social media platform: a third party seeks to ascertain the actual truth concerning a particular topic (e.g., the accuracy of economic data or historical events) by observing online discussions among individuals who have conflicting motives related to that topic, and where there is an underlying objective truth. We ask subjects to discern and guess the truth from these transcripts. Subsequently, we study the willingness of subjects to switch to a guess made by an artificial intelligence (AI) system. We vary the accuracy of this AI system and study how this affects individuals' switching behavior. Furthermore, we examine whether subjects are willing to pay for this service provided by the AI.
Intervention (Hidden)
Intervention Start Date
2023-11-25
Intervention End Date
2024-12-31

Primary Outcomes

Primary Outcomes (end points)
1. Individuals' capacity to identify truthful information from deceptive textual content in a strategic context.
2. The willingness of humans to depend on AI tools for identifying text-based misinformation in scenarios where people are either informed or uninformed about the AI's proficiency in discerning the truth.
3. Individuals' willingness to pay for the utilization of AI tools in identifying text-based misinformation.
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
We conduct this project in two phases. The first phase focuses on building the AI system to detect false information with the help of ChatGPT. In the second phase, we will use the methods of experimental economics. We invite individual human subjects, who will be presented with the transcripts from one of the randomly chosen sessions of the game show. Each subject will be tasked with identifying the truth and will be paid a fixed monetary reward for doing so correctly, which is tantamount to detecting deception or sorting out misinformation.
Experimental Design Details
The proposed project has two phases. In the first phase, we transcribed episodes of an American television show. Each episode of the show features three challengers, each asserting to be the real John or Jane Doe, while only one of them is the real John/Jane Doe. Four judges engage in back-and-forth questioning to identify the real John/Jane Doe, centering their questions on an affidavit provided by the real John/Jane Doe that is common knowledge to the judges and the challengers. While the real John/Jane Doe must respond truthfully, the imposters can fabricate information. After direct questioning, the judges cast their votes, and each challenger receives a fixed monetary reward for successfully deceiving a judge. Hence, each transcript parallels situations where a third party seeks to uncover an objective truth concealed within conversations involving parties with conflicting interests on a social media platform. A transcript of a session includes the affidavit and the conversations between the judges and the challengers. We fed these transcripts to an AI tool and recorded its guess regarding who the real John/Jane Doe is for each transcript.

In the second phase of our project, we plan to conduct economics experiments. We will create two sets, each consisting of five transcripts randomly selected from a pool that does not contain explicit discussions about audio-visual cues, thus ensuring that these transcripts have only textual cues for individuals to determine the identity of the real John/Jane Doe. We will systematically vary the two sets regarding the AI’s ability to identify the real John/Jane Doe accurately. We plan to deploy the following experimental treatments.

Baseline treatments: We plan to recruit human participants from the online platform Prolific, randomly assigned to one of the two sets of transcripts. Their main task will be identifying the real John/Jane Doe in each transcript. The hypothesis we plan to test using this treatment is that individuals are not better than chance at correctly identifying the real John/Jane Doe. Each participant will complete four tasks in each version of the Baseline treatment.

First task: Each participant will receive a fixed dollar amount for making their guess for each transcript. Additionally, one of the five transcripts will be randomly selected by the computer, and if the participant’s guess for the randomly selected transcript correctly identifies the real John/Jane Doe, they will receive a bonus amount; otherwise, zero. Furthermore, participants will be required to report their confidence level for each of the five guesses by choosing a number between 0 (not confident at all) and 100 (absolutely confident). Participants will also be paid for their confidence report on the randomly selected transcript according to a quadratic scoring rule, wherein the payoff is higher for higher confidence levels if the guess is correct, and higher for lower confidence levels if the guess is incorrect.
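The payoff structure of such a quadratic scoring rule can be sketched as follows. This is an illustrative implementation only: the registration does not specify the exact formula or bonus amount, so the functional form and the `max_bonus` scaling are assumptions.

```python
def quadratic_score(confidence: float, correct: bool, max_bonus: float = 1.0) -> float:
    """Payoff from a standard quadratic (Brier-type) scoring rule.

    confidence: reported confidence in [0, 100] that the guess is correct.
    correct: whether the guess identified the real John/Jane Doe.
    max_bonus: hypothetical maximum payoff (not from the registration).

    The rule pays more for higher confidence when the guess is correct,
    and more for lower confidence when the guess is incorrect, which
    makes truthful reporting of one's belief the optimal strategy.
    """
    p = confidence / 100.0
    if correct:
        return max_bonus * (1.0 - (1.0 - p) ** 2)
    return max_bonus * (1.0 - p ** 2)
```

Under this rule, a participant who believes their guess is correct with probability q maximizes expected payoff by reporting confidence 100·q, which is the incentive-compatibility property quadratic scoring rules are used for.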

Second task: Each participant will need to categorize each transcript into one of the three difficulty levels: Low, Moderate, or High. Since 100 participants will partake in each of the two versions of the Baseline treatment, we will also ask them to guess which of the three difficulty levels they believe most of the 100 participants would assign to each transcript. The participants will earn a bonus if they correctly guess the difficulty level most of the 100 participants chose for the randomly selected transcript; otherwise, zero.

Third task: Next, we will elicit participants’ relative confidence in their ability using the question, ‘Compared with other participants in this experiment, how well do you think you did?’ They will choose a quartile and earn a bonus if correct; otherwise, zero. Upon collecting all 100 participants’ guesses for all five transcripts in a set, we will rank the 100 participants based on the number of correct guesses made.

Fourth task: Each participant will be required to answer a series of demographic questions, how they made guesses for the transcripts, their familiarity with the television show, whether they previously watched any of the five sessions, etc.

Black box treatments: Participants will undertake all four tasks as described above. However, after completing the second task, they will be presented with the AI’s guesses for all five transcripts without being told the AI’s accuracy rate for that set. Participants will then have the choice to submit either their own guess or the AI’s guess for each transcript. This approach is grounded in the rationale that AI systems often appear as inexplicable entities: individuals do not know the underlying algorithms, and uncertainty surrounds the AI’s capabilities. Therefore, people may or may not choose to follow the AI’s recommendations. On the other hand, since identifying the real John/Jane Doe necessitates domain-specific knowledge and is cognitively engaging, we anticipate that participants may seek assistance from the AI when it is made available to them. Since participants will not possess information about the actual accuracy rate of the AI for a given set of transcripts, we anticipate that their reliance on the AI will not vary significantly between the two sets of transcripts. In this treatment, the last task will include additional questions regarding the participant’s conjecture about the AI’s accuracy and their trust in AI. The hypothesis we plan to test in this treatment is that an individual’s reliance on the AI is identical in both sets of transcripts.

Full Information treatments: These treatments will be similar to the black box ones, with the distinguishing feature that we will now disclose the AI’s accuracy rate for each set to the participants before they submit their own or the AI’s guess for each transcript. This approach allows us to examine whether people appropriately seek the AI’s help, i.e., whether their reliance on the AI increases with the AI’s accuracy rate. The hypothesis we plan to test in this treatment is that individuals’ reliance on the AI increases with the AI’s accuracy.

Willingness-to-pay treatments: These treatments will assess individual participants’ willingness to pay a predetermined sum of money for utilizing the AI’s service for each transcript. This approach is rooted in the logic that AI services are frequently perceived as driven by business motives. Hence, it becomes crucial to know whether individuals place enough value on AI’s services to help them detect the truth. The hypothesis we plan to test in this treatment is that individuals’ willingness to pay for the AI’s assistance increases with the AI’s accuracy.
Randomization Method
Randomization is conducted using a random number generator in Python.
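The registration states only that randomization uses a Python random number generator. A minimal sketch of how the 180 Baseline participants might be split evenly across the two transcript sets is shown below; the set names, the seed, and the balanced-split approach are illustrative assumptions, not the study's actual script.

```python
import random

def assign_balanced(participant_ids, sets=("Set A", "Set B"), seed=None):
    """Randomly assign participants to transcript sets in equal numbers.

    Shuffles the participant list with a seeded generator, then splits it
    evenly across the sets, so each set receives the same number of
    subjects (e.g., 90 of 180). Illustrative sketch only.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    ids = list(participant_ids)
    rng.shuffle(ids)
    per_set = len(ids) // len(sets)
    return {pid: sets[i // per_set] for i, pid in enumerate(ids)}
```

A stratified version (shuffling male and female participants separately before splitting) would additionally guarantee the 45/45 gender balance within each set described in the sample-size plan.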
Randomization Unit
Transcript level
Was the treatment clustered?
Yes

Experiment Characteristics

Sample size: planned number of clusters

For the Baseline, we recruit 180 participants. We divide subjects into two urns (or sets of transcripts); each urn has 90 subjects. Of the 90 subjects, we study 45 male and 45 female subjects.

For the Black Box and Full Information treatments, we recruit 270 participants. We divide subjects into three urns; each urn has 90 subjects. Of the 90 subjects, we study 45 male and 45 female subjects.
Sample size: planned number of observations
900 observations in the Baseline (each subject makes five decisions for the five transcripts); 1,350 observations in the Black Box and Full Information treatments.
Sample size (or number of clusters) by treatment arms
Baseline: We recruit 180 subjects per treatment, which is 900 observations per treatment (each subject takes five decisions for the five transcripts). For every treatment, we divide subjects into two urns (or sets of transcripts); each urn has 90 subjects. Of the 90 subjects, we study 45 male subjects and 45 female subjects.

Black Box and Full Information treatments: We recruit 270 subjects per treatment, which is 1350 observations per treatment (each subject takes five decisions for the five transcripts). For every treatment, we divide subjects into three urns (or sets of transcripts); each urn has 90 subjects. Of the 90 subjects, we study 45 male subjects and 45 female subjects.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
Institutional Review Board, University of Utah
IRB Approval Date
2023-08-30
IRB Approval Number
IRB_00167477

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials