The Predictive Validity of Audio and Text Responses to Open-Ended Survey Questions for Latent Individual Traits

Last registered on October 06, 2025

Pre-Trial

Trial Information

General Information

Title
The Predictive Validity of Audio and Text Responses to Open-Ended Survey Questions for Latent Individual Traits
RCT ID
AEARCTR-0016672
Initial registration date
October 04, 2025


First published
October 06, 2025, 3:19 PM EDT


Locations

Region

Primary Investigator

Affiliation
Bocconi University

Other Primary Investigator(s)

PI Affiliation

Additional Trial Information

Status
In development
Start date
2025-10-09
End date
2025-11-30
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
Understanding individuals’ beliefs, preferences, and motivations is essential in the social sciences. Recent technological advancements, notably large language models (LLMs) for analyzing open-ended responses and the diffusion of voice messaging, have the potential to significantly enhance our ability to elicit these dimensions. Previous work (Galasso et al., 2024) investigated the differences between oral and written responses to open-ended survey questions. Using a series of randomized controlled trials across three surveys (focused on AI, public policy, and international relations), Galasso et al. (2024) showed that respondents who provided audio answers gave longer, though lexically simpler, responses, offering more information and containing more personal experiences than written responses. These findings suggest that oral responses to open-ended questions can capture richer, more personal insights, presenting a valuable method for understanding individual reasoning.
In this study, we propose two extensions:
1. Editing Opportunity: Providing respondents with the transcript of their audio (or text) answer and allowing them to revise the answer may reduce oversharing and personal content.
2. Predictive Validity: Audio-based open-ended answers may better predict latent individual characteristics—here, attitudes and behaviors toward immigrants—than text responses alone.
We also explore how respondents’ Big Five traits relate to the informativeness and subjectivity of open-ended answers across modes, to test whether more extroverted people are more likely to provide more, and more personal, information when recording an audio response.
External Link(s)

Registration Citation

Citation
Galasso, Vincenzo and Tommaso Nannicini. 2025. "The Predictive Validity of Audio and Text Responses to Open-Ended Survey Questions for Latent Individual Traits." AEA RCT Registry. October 06. https://doi.org/10.1257/rct.16672-1.0
Experimental Details

Interventions

Intervention(s)
Participants will be randomly assigned to one of four experimental conditions that vary the mode in which they provide answers to open-ended survey questions:
Audio: Respondents answer all open-ended questions orally (recorded audio).
Text: Respondents answer all open-ended questions in written form.
Audio with Transcript & Prompt to Modify: Respondents first answer all open-ended questions orally. They are then shown a transcript of their spoken response and asked whether they wish to revise or modify it in writing.
Text with Transcript & Prompt to Modify: Respondents first answer all open-ended questions in written form. They are then shown the text of their response again and asked whether they wish to revise or modify it in writing.
This design allows us to isolate the effects of response modality (audio vs. text) and the additional opportunity to review and edit responses (transcript & prompt to modify).
Intervention (Hidden)
Intervention Start Date
2025-10-09
Intervention End Date
2025-10-24

Primary Outcomes

Primary Outcomes (end points)
From the answers to open-ended questions, we calculate: (i) the number of words and of significant words; (ii) Yule’s K and the Type-Token Ratio (TTR) to measure lexical diversity; (iii) an informativeness score and an informativeness dummy; and (iv) the use of the pronoun “I” (in Italian, “Io”), the use of verbs in the first-person singular, the report of the respondent’s personal experience, and subjectivity.
Primary Outcomes (explanation)
These variables are constructed as described in Galasso et al. (2024).
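The two lexical-diversity measures named above are standard corpus statistics. A minimal sketch of how they could be computed, assuming simple lowercased whitespace tokenization (the registration does not specify the tokenizer, and the hypothetical helper `lexical_diversity` is ours, not the authors'):

```python
from collections import Counter

def lexical_diversity(text):
    """Type-Token Ratio and Yule's K for a whitespace-tokenized text.

    TTR = V / N (number of distinct types over total tokens).
    Yule's K = 1e4 * (sum_i i^2 * V_i - N) / N^2, where V_i is the
    number of types occurring exactly i times. Lower K indicates
    higher lexical diversity (richer vocabulary).
    """
    tokens = text.lower().split()
    n = len(tokens)
    freqs = Counter(tokens)             # type -> frequency
    spectrum = Counter(freqs.values())  # frequency i -> number of types V_i
    ttr = len(freqs) / n
    k = 1e4 * (sum(i * i * v for i, v in spectrum.items()) - n) / (n * n)
    return ttr, k

# Example: "the" occurs twice among 6 tokens and 5 types.
ttr, k = lexical_diversity("the cat sat on the mat")  # ttr = 5/6, k ≈ 555.6
```

In practice the choice of tokenizer (and whether stop words count as "significant words") matters, so any replication should follow the procedure in Galasso et al. (2024).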

Secondary Outcomes

Secondary Outcomes (end points)
(i) the average number of items in a list experiment that includes a sensitive item on immigrants; (ii) the allocation of resources to recipients and to NGOs, varying by ethnic cue, in a dictator game; (iii) the marginal willingness to pay in a conjoint experiment of residential choice tasks that includes an immigrant-share attribute; (iv) a closed-ended comfort rating on immigrant share (1-5 Likert scale).
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Participants will be randomly assigned to one of four experimental conditions that vary the mode in which they provide answers to open-ended survey questions:
Audio: Respondents answer all open-ended questions orally (recorded audio).
Text: Respondents answer all open-ended questions in written form.
Audio with Transcript & Prompt to Modify: Respondents first answer all open-ended questions orally. They are then shown a transcript of their spoken response and asked whether they wish to revise or modify it in writing.
Text with Transcript & Prompt to Modify: Respondents first answer all open-ended questions in written form. They are then shown the text of their response again and asked whether they wish to revise or modify it in writing.
Experimental Design Details
Randomization Method
Randomization will be done by the survey company's software, which will allocate respondents across the four arms.
Randomization Unit
Individual level: respondents to the survey
Was the treatment clustered?
No
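Individual-level assignment to four equally sized arms can be illustrated as follows. This is a sketch of complete randomization under our own assumptions (uniform shuffling, equal arm sizes); the survey company's actual allocation procedure is not described in the registration:

```python
import random

# Hypothetical arm labels, mirroring the four conditions described above.
ARMS = ["Audio", "Text", "Audio+Transcript", "Text+Transcript"]

def assign_arms(respondent_ids, seed=0):
    """Complete randomization: shuffle respondents, then deal them
    round-robin into the four arms so arm sizes differ by at most one."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: ARMS[i % len(ARMS)] for i, rid in enumerate(ids)}

# With N = 2,000 respondents this yields exactly 500 per arm.
assignment = assign_arms(range(2000))
```

Round-robin dealing after a shuffle guarantees balanced arm sizes, unlike independent per-respondent draws, which only balance in expectation.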

Experiment Characteristics

Sample size: planned number of clusters
For budgetary reasons, the national probability sample will consist of N ≈ 2,000 adult respondents recruited by a professional survey company using a CAWI (computer-assisted web interviewing) methodology.
Sample size: planned number of observations
For almost all outcomes, the number of observations will coincide with the number of individual respondents. In the conjoint experiment, each individual will rank 5 pairs of scenarios; since the unit of observation is a scenario, the conjoint experiment will have 20,000 observations.
Sample size (or number of clusters) by treatment arms
Approximately 500 individuals per arm.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Our sample size of 2,000 individuals will allow us to detect standardized effect sizes of d = 0.125 with 80% power at α = 0.05, for instance when comparing results from the Audio arm with results from the Audio with Transcript arm.
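As a rough check on the stated minimum detectable effect, the standard normal-approximation formula for a two-sided, two-sample comparison of means can be sketched as below. The helper name `mde` and the equal-variance assumption are ours, not part of the registration:

```python
from statistics import NormalDist

def mde(n1, n2, alpha=0.05, power=0.80):
    """Minimum detectable standardized effect (Cohen's d) for a
    two-sided, two-sample difference in means with equal variances:
    d = (z_{1-alpha/2} + z_{power}) * sqrt(1/n1 + 1/n2)."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(power)) * (1 / n1 + 1 / n2) ** 0.5
```

Under these assumptions, two groups of 1,000 each (e.g., two pooled arms per side) give d ≈ 0.125, while a pairwise comparison of two arms of 500 each gives d ≈ 0.177.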
Supporting Documents and Materials

Documents

Document Name
Galasso et al, 2024, paper
Document Type
other
Document Description
This paper describes how the outcome variables are constructed from the survey data.
File
Galasso et al, 2024, paper

MD5: 624589f50562fc07bbfd028c384a89cf

SHA1: bcc0e0cbb1c0a17bd4c2c0d986b65fdf06da315e

Uploaded At: September 03, 2025

There is information in this trial unavailable to the public.
IRB

Institutional Review Boards (IRBs)

IRB Name
Ethics Committee of Bocconi University
IRB Approval Date
2025-10-03
IRB Approval Number
RA001047
Analysis Plan

There is information in this trial unavailable to the public.

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials