AI EVALUATIONS AND SCREENING - A Detailed Study on Human-AI Collaboration in Screening Efficiency and Decision-Making, 2024

Last registered on May 09, 2024

Pre-Trial

General Information

Title

RCT ID

AEARCTR-0013525

Initial registration date

April 29, 2024

First published

May 09, 2024, 1:56 PM EDT

Locations

There is information in this trial unavailable to the public.

Primary Investigator

Name

Charles Ayoubi

Affiliation

HBS

Other Primary Investigator(s)

PI Name

Jacqueline Lane

PI Affiliation

HBS

PI Name

PI Affiliation

PI Name

PI Affiliation

HBS

PI Name

Miaomiao Zhang

PI Affiliation

HBS

Additional Trial Information

Status

In development

Start date

2024-04-30

End date

2025-06-30

Keywords

Behavior, Firms & Productivity, Other

Additional Keywords

JEL code(s)

C93, D81

Secondary IDs

Prior work

This trial does not extend or rely on any prior RCTs.

Abstract

This study investigates the integration of artificial intelligence (AI) into the screening of early-stage innovations, a task traditionally carried out by human evaluators across various professional and competitive settings. Through a randomized controlled trial involving around 400 participants drawn from the MIT Solve expert internal screener team and from community-leveraged startup screeners, this research explores whether AI-assisted human evaluations or AI-only evaluations improve the efficiency and quality of decision-making compared to traditional human-only evaluations. Outcomes measured include the time efficiency of evaluations, the consistency and convergence of decisions, evaluator confidence, and the overall quality of decisions across three conditions: control (no AI assistance), Treatment A (basic AI assistance), and Treatment B (advanced AI assistance providing detailed rationales). The findings aim to delineate the conditions under which human-AI collaboration optimizes the evaluation of early-stage innovations, contributing to the broader discourse on effectively combining human intuition with AI's processing capabilities. This could have significant implications for fields requiring precise and timely assessments, such as academic research, grant funding, and competitive selection processes, and could advance both the theoretical understanding and the practical application of AI in evaluative tasks.

External Link(s)

Registration Citation

Citation

Ayoubi, Charles et al. 2024. "AI EVALUATIONS AND SCREENING - A Detailed Study on Human-AI Collaboration in Screening Efficiency and Decision-Making, 2024." AEA RCT Registry. May 09. https://doi.org/10.1257/rct.13525-1.0

Sponsors & Partners

Experimental Details

Interventions

Intervention(s)

This study employs three distinct intervention strategies to evaluate the impact of AI-assisted decision-making in the evaluation of early-stage innovations:

Control Group: Participants will conduct evaluations without any technological assistance, mirroring traditional human-only evaluation processes.
Treatment A: This group involves the use of a generative AI tool that provides basic pass/fail recommendations to assist participants in their decision-making. This is intended to assess whether simple AI guidance can enhance the efficiency of evaluations compared to the traditional approach.
Treatment B: Participants in this group receive a more sophisticated form of AI assistance, which includes not only pass/fail recommendations but also detailed rationales behind each decision. This intervention aims to explore whether increased explainability and depth in AI guidance can improve decision quality and evaluator confidence more significantly than basic AI assistance.

Intervention Start Date

2024-04-30

Intervention End Date

2024-12-31

Primary Outcomes

Primary Outcomes (end points)

The primary outcomes for this study are designed to measure the direct effects of AI integration on the evaluation process:

Efficiency: Time taken to reach a decision for each submission, recorded in seconds.
Decision Consistency: The degree of uniformity in decisions across different evaluators, measured using inter-rater reliability statistics.
Evaluator Confidence: A quantitative assessment of how confident evaluators feel about their decisions, measured on a Likert scale from 1 (not confident) to 7 (very confident).
Alignment with AI Recommendations: The degree to which participants in Treatments A and B align with the AI's recommendations.
Decision Quality: Evaluated by how closely the screeners' decisions align with (a) the selection decisions that advance submissions to the next stage of the evaluation process and (b) expert judges' quality evaluations of the submissions.
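
As an illustration of how the Decision Consistency and Alignment with AI Recommendations outcomes could be computed, the sketch below uses Fleiss' kappa as one possible inter-rater reliability statistic and a simple match rate for alignment. The registration does not name a specific statistic, so the choice of Fleiss' kappa, the function names, and the "pass"/"fail" labels are assumptions made for illustration only.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical decisions (assumed statistic, not
    specified in the registration).

    ratings: list of lists; each inner list holds the decisions that the
    evaluators gave to one submission. Every submission must have the same
    number of ratings (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each submission needs the same number (>= 2) of ratings")

    categories = sorted({decision for row in ratings for decision in row})
    counts = [Counter(row) for row in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over submissions.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall share of each category.
    p_cat = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p ** 2 for p in p_cat)

    if p_e == 1.0:
        # Only one category was ever used, so kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


def alignment_rate(human_decisions, ai_recommendations):
    """Share of submissions where the evaluator's decision matches the AI's."""
    if len(human_decisions) != len(ai_recommendations):
        raise ValueError("both lists must cover the same submissions")
    matches = sum(h == a for h, a in zip(human_decisions, ai_recommendations))
    return matches / len(human_decisions)


# Hypothetical data: three evaluators screening four submissions.
example_ratings = [
    ["pass", "pass", "pass"],
    ["fail", "fail", "pass"],
    ["fail", "fail", "fail"],
    ["pass", "fail", "pass"],
]
print(fleiss_kappa(example_ratings))
print(alignment_rate(["pass", "fail", "pass", "pass"], ["pass", "pass", "pass", "fail"]))
```

The same match-rate function could also be applied to the Decision Quality outcome by passing the next-stage selection decisions in place of the AI recommendations, again as an assumption about how alignment might be operationalized.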

Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)

Secondary outcomes of the study will investigate the broader impacts of AI integration on evaluators' perceptions and the decision-making process:

Perception of AI Utility: Evaluators' subjective ratings of how useful the AI was in assisting their decision-making, collected through post-evaluation surveys.
Trust in AI: Changes in evaluators' trust in AI technology, measured before and after the interventions.
Adoption Willingness: Evaluators' willingness to incorporate AI assistance in future evaluations, assessed at the end of the study.

Secondary Outcomes (explanation)

These outcomes will be assessed through a combination of quantitative surveys and qualitative interviews, designed to capture both the measurable changes in perception and the nuanced personal experiences of the evaluators with the AI tools.

Experimental Design

The experimental design for this study is structured as a three-arm randomized controlled trial (RCT) to evaluate the impact of AI-assisted decision-making in the screening of submissions for a global health equity challenge. The design is intended to test the efficiency and effectiveness of AI interventions compared to traditional human-only evaluation processes.

Control Group: Participants in this group perform evaluations manually, without any AI tools, serving as the baseline against which the AI-assisted groups are compared. This mimics the traditional process currently employed in most evaluative settings, allowing for a direct assessment of the impact of AI interventions.
Treatment A: In this arm, participants receive basic AI assistance, which provides binary (pass/fail) recommendations for each submission. This treatment tests the hypothesis that even minimal AI involvement can streamline the evaluation process and reduce the time and cognitive load on human evaluators.
Treatment B: Participants in this group use a more advanced AI tool that not only suggests binary outcomes but also provides detailed rationales for each recommendation. This treatment is designed to explore whether deeper AI integration, which includes providing context and explanations, can improve the quality of decisions and increase evaluator confidence and alignment with expert decisions.

The study's design allows for a controlled comparison across different levels of AI assistance, providing insights into how different AI models might enhance or interfere with human cognitive processes in evaluative tasks.

Experimental Design Details

Not available

Randomization Method

Random assignment of participants to the intervention groups is achieved through a computerized random number generator, ensuring that the allocation is both random and concealed until the point of assignment.
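
A minimal sketch of what computerized random assignment to the three arms might look like is given below. It is not the study's actual procedure: the arm labels, the `assign_arms` function, the evaluator identifiers, the seed value, and the choice to balance arm sizes are all assumptions made for illustration.

```python
import random

# Hypothetical arm labels mirroring the three study conditions.
ARMS = ("control", "treatment_a", "treatment_b")


def assign_arms(evaluator_ids, seed):
    """Randomly assign each evaluator to one arm.

    A seeded generator makes the allocation reproducible for auditing.
    Evaluators are shuffled and then dealt round-robin, so arm sizes differ
    by at most one (balancing is an assumption, not stated in the registry).
    """
    rng = random.Random(seed)
    shuffled = list(evaluator_ids)
    rng.shuffle(shuffled)
    return {eid: ARMS[i % len(ARMS)] for i, eid in enumerate(shuffled)}


# Hypothetical usage with roughly the planned number of participants.
evaluators = [f"evaluator_{i:03d}" for i in range(400)]
assignment = assign_arms(evaluators, seed=13525)
print(assignment["evaluator_000"])
```

Keeping the seed and the resulting table away from the people who enrol participants is one way such an allocation could stay concealed until the point of assignment.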

Randomization Unit

The units of randomization in this study are the individual evaluator and the individual solution; each participant is independently assigned to one or two of the three study conditions.

Was the treatment clustered?