
Chatbots as Idea Evaluation Sparring Partners

Last registered on September 29, 2025

Pre-Trial

Trial Information

General Information

Title
Chatbots as Idea Evaluation Sparring Partners
RCT ID
AEARCTR-0016864
Initial registration date
September 24, 2025

Initial registration date is when the trial was registered. It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published
September 29, 2025, 10:55 AM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Locations

Region

Primary Investigator

Affiliation

Other Primary Investigator(s)

Additional Trial Information

Status
In development
Start date
2025-09-25
End date
2025-11-17
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Idea evaluation is inherently subjective, often limited by cognitive biases and incomplete information. This study explores how AI chatbots adopting different conversational stances, such as supportive, critical, or balanced perspectives, can broaden evaluators' judgments. Through controlled interactions, we examine how these AI sparring partners influence assessment confidence, shifts in evaluation, and the overall quality of decisions. By analyzing participants' personality traits, perceptions of the AI, and their reasoning processes, the research seeks to determine whether AI can effectively expand human perspectives in idea evaluation without distorting judgment. The findings aim to inform the design of human-AI collaboration systems for more informed and consistent decision-making.
External Link(s)

Registration Citation

Citation
Just, Julian. 2025. "Chatbots as Idea Evaluation Sparring Partners." AEA RCT Registry. September 29. https://doi.org/10.1257/rct.16864-1.0
Experimental Details

Interventions

Intervention(s)
Intervention (Hidden)
Intervention Start Date
2025-09-25
Intervention End Date
2025-11-17

Primary Outcomes

Primary Outcomes (end points)
The key outcome variables in this experiment are idea assessment, assessment confidence, assessment change before and after AI interaction, assessment quality change measured through metadelta, and the convergence of evaluations across participants. Additionally, perceptions of the AI chatbot and unstructured data from prompts and rationales are collected to understand how AI stances influence reasoning and trust.
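Metadelta is not formally defined in the registration text; below is a minimal sketch, assuming it captures the change in an evaluator's alignment with the group-average rating before vs. after the AI interaction (as suggested by the design section below). All variable and function names are hypothetical.

```python
import numpy as np

def metadelta(pre_ratings: np.ndarray, post_ratings: np.ndarray) -> np.ndarray:
    """Per-evaluator change in alignment with the group-average rating.

    Assumption (not stated in the registration): metadelta is the reduction in
    absolute distance between an evaluator's rating and the group mean, computed
    before vs. after the AI interaction. Positive values mean the post-chat
    rating moved closer to the group average.
    """
    pre_dev = np.abs(pre_ratings - pre_ratings.mean())
    post_dev = np.abs(post_ratings - post_ratings.mean())
    return pre_dev - post_dev

# Illustrative example: five evaluators rating the same idea on a 1-7 scale
pre = np.array([2.0, 6.0, 5.0, 3.0, 7.0])
post = np.array([3.0, 5.0, 5.0, 4.0, 6.0])
print(metadelta(pre, post))
```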
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The experiment has 8 conditions (2 assessment setups × 4 AI roles) testing how AI sparring partners influence idea evaluation; we may focus on a subset of these conditions in a given sample (a minimal assignment sketch follows the list of measures below).
Participants are randomly assigned to one of these setups:

With pre-assessment: Evaluate ideas first, then chat with AI, then re-evaluate. (currently planned as main focus)
Without pre-assessment: Chat with AI first, then evaluate ideas for the first time.

The AI takes one of four roles, controlled by a custom GPT system prompt (an illustrative sketch follows the list):

Basic (plain/default)
Angel’s Advocate (supportive)
Devil’s Advocate (critical)
Dialectic (balanced debate)
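For illustration, a minimal sketch of how the four stances could map to system prompts in an OpenAI-style chat request; the study's actual prompts are not public, so the wording below is hypothetical.

```python
# Illustrative only: role names follow the registration, prompt wording is hypothetical.
SYSTEM_PROMPTS = {
    "basic": "You are a helpful assistant discussing a business idea with an evaluator.",
    "angels_advocate": "Take a supportive stance: highlight the idea's strengths and opportunities.",
    "devils_advocate": "Take a critical stance: probe weaknesses, risks, and unstated assumptions.",
    "dialectic": "Debate both sides: alternate supportive and critical arguments, then weigh them.",
}

def build_messages(role: str, user_turn: str) -> list[dict]:
    """Compose a chat request for the assigned sparring-partner role."""
    return [
        {"role": "system", "content": SYSTEM_PROMPTS[role]},
        {"role": "user", "content": user_turn},
    ]

print(build_messages("devils_advocate", "Here is my assessment of the idea..."))
```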

The AI chat is embedded in Qualtrics; participants move seamlessly between survey questions and the GPT conversation.
All conditions measure:
Holistic and detailed idea assessments
Confidence in assessments
Changes in evaluation scores
Alignment with group averages (metadelta)
Convergence of opinions
Personality traits, AI perceptions, and chat transcripts (prompts/rationales).
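As a point of reference, a minimal sketch of the 2 × 4 individual-level assignment (setup × AI role), assuming simple equal-probability randomization; the actual assignment is implemented in Qualtrics, and all names below are illustrative.

```python
import random

SETUPS = ["with_pre_assessment", "without_pre_assessment"]
AI_ROLES = ["basic", "angels_advocate", "devils_advocate", "dialectic"]

def assign_condition(rng: random.Random) -> tuple[str, str]:
    """Draw one of the 8 conditions (2 setups x 4 AI roles) with equal probability."""
    return rng.choice(SETUPS), rng.choice(AI_ROLES)

rng = random.Random(42)  # fixed seed only for this illustration
for participant_id in range(5):
    setup, role = assign_condition(rng)
    print(participant_id, setup, role)
```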
Experimental Design Details
Randomization Method
Randomization is done in office by a computer, via Qualtrics.
Randomization Unit
The unit of randomization is the individual participant.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
1
Sample size: planned number of observations
800 evaluations by domain experts on Prolific (working in the specific field the idea to be evaluated comes from) and 150 evaluations by business students; each participant evaluates one idea.
Sample size (or number of clusters) by treatment arms
800 Prolific experts and 150 students; numbers may be lower if some conditions are included only in a specific sample.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Supporting Documents and Materials

There is information in this trial unavailable to the public.
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number
Analysis Plan

Analysis Plan Documents

Potential Hypotheses

MD5: cd7a47f594f6e05b3380f08133ef3f49

SHA1: d7a3dcad6c170dc0dce9fd4effd9be5d4a8c2617

Uploaded At: September 24, 2025

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials