Toxicity and User Engagement: Online Study

Last registered on September 17, 2024

Pre-Trial

Trial Information

General Information

Title
Toxicity and User Engagement: Online Study
RCT ID
AEARCTR-0014362
Initial registration date
September 13, 2024

The initial registration date is when the trial was registered, i.e., when the registration was submitted to the Registry to be reviewed for publication.

First published
September 17, 2024, 1:49 PM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Locations

Region

Primary Investigator

Affiliation
University of Warwick

Other Primary Investigator(s)

PI Affiliation
Columbia University
PI Affiliation
Bocconi University

Additional Trial Information

Status
In development
Start date
2024-09-16
End date
2025-03-31
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
More information about the study will be available after the completion of the trial.
External Link(s)

Registration Citation

Citation
Beknazar-Yuzbashev, George, Rafael Jiménez-Durán, and Mateusz Stalinski. 2024. "Toxicity and User Engagement: Online Study." AEA RCT Registry. September 17. https://doi.org/10.1257/rct.14362-1.0
Sponsors & Partners

There is information in this trial unavailable to the public.
Experimental Details

Interventions

Intervention(s)
Information on the intervention is hidden until the end of the trial.
Intervention Start Date
2024-09-16
Intervention End Date
2025-03-31

Primary Outcomes

Primary Outcomes (end points)
1. User engagement

We measure whether the participant clicks “View 3 comments” at the bottom of the treated post to uncover the comment section. The link is formatted and situated within the post in a way that closely resembles how such links appear on Facebook. On click, an image with comments (following Facebook formatting) is displayed.

2. Willingness to accept for transcribing 100 social media posts

Participants use a slider from $2 to $30 to state their WTA in increments of $0.10. The participant who states the lowest compensation will be invited to transcribe 100 social media posts and will be paid the second-lowest stated compensation. If multiple participants state the lowest compensation, one of them will be chosen at random. The lower bound reflects that the minimum compensation is $8 per hour and that the task will take at least 15 minutes.
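The selection and payment rule above follows a second-price logic. A minimal sketch of that rule (function and participant names are illustrative, not part of the registration):

```python
import random

def select_transcriber(wtas: dict) -> tuple:
    """Return (selected participant, payment) under the stated rule:
    the lowest WTA wins, ties are broken at random, and the winner
    is paid the second-lowest stated compensation."""
    lowest = min(wtas.values())
    tied = [pid for pid, w in wtas.items() if w == lowest]
    winner = random.choice(tied)  # random tie-break among lowest bids
    # Payment: the smallest WTA among the remaining participants
    payment = min(w for pid, w in wtas.items() if pid != winner)
    return winner, payment

winner, pay = select_transcriber({"p1": 4.5, "p2": 2.0, "p3": 3.1})
# winner is "p2" (lowest WTA), paid 3.1 (second-lowest WTA)
```

As in a second-price auction, this rule makes stating one's true reservation compensation the optimal strategy, which is why WTA is elicited this way.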

Heterogeneity:

First, we will look at heterogeneity with respect to: gender and minority status.

We classify participants as members of a minority group if they (1) selected an answer other than “white” in the ethnicity question, (2) identify as Hispanic, Latino, or of Spanish origin, or (3) selected a sexual orientation other than heterosexual.

Moreover, as religiosity shapes norms of behavior, we will look at heterogeneity with respect to being religious (i.e., choosing an answer other than “no religion” when asked about religion or belief).
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
1. Recall of toxicity

A randomly selected subsample of participants (1/3rd) will be asked whether or not any of the posts that they were asked to transcribe were toxic.

A standard definition of toxicity will be provided: “A statement is toxic if it is a rude, disrespectful, or unreasonable comment that is somewhat likely to make you leave a discussion or give up on sharing your perspective.”

2. Recall of hate speech

A randomly selected subsample of participants (1/3rd) will be asked whether or not any of the posts that they were asked to transcribe contain hate speech.

Meta’s definition of hate speech will be provided: “Hate speech is a direct attack against people on the basis of protected characteristics: race, ethnicity, national origin, disability, religious affiliation, caste, sexual orientation, sex, gender identity, and serious disease.”

3. Recall of treatment posts

A randomly selected subsample of participants (1/3rd) will be asked which of the four options provided was one of the posts they were asked to transcribe.

The options will be similar to each other. For example, for Treatment 1 and Control 1, we will use:
- What is so difficult to understand? I hate Mondays.
- What did you just say? You hate Mondays?
- What is so difficult to understand? I hate Blacks.
- What did you just say? You hate Blacks?

4. Entertainment rating of posts

We will ask participants to rate on a scale from 0 to 100 (using a slider) how entertaining the training transcription posts were.

We will look at the same angles of heterogeneity as for the primary outcomes.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Information on the experimental design is hidden until the end of the trial.
Experimental Design Details
Not available
Randomization Method
Qualtrics randomization
Randomization Unit
Individual
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
N/A
Sample size: planned number of observations
We plan to recruit approximately N=1,000 participants per group (survey completions), which gives a total of about 4,000 people for the whole study. To increase power, we will also include pilot observations (approximately 100 observations per group).
Sample size (or number of clusters) by treatment arms
We will randomly assign individuals to four experimental groups (Treatment 1, Control 1, Treatment 2, Control 2) with equal probabilities.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Our sample size gives us enough power to detect effect sizes close to 0.12 s.d. for the primary outcomes (comparing each treatment with its associated control group), which is sufficient to detect the main effects found in the pilot study. Additionally, for user engagement (primary outcome 1), we consider a pooled comparison (Treatment 1 + Treatment 2 vs. Control 1 + Control 2). For this test, we have ex-ante power to detect an effect size of 0.09 s.d. Our MDEs reflect the subtlety of the intervention: we change little about the structure of the treatment posts. Furthermore, they are lower than the minimum MDE recommended for information provision experiments (0.15 s.d.) by Haaland et al. (2023). At the same time, we take into account that certain types of outcomes, including willingness to accept for tasks/work, are more inelastic to information interventions than other outcomes typical for this type of experiment.
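The stated MDEs are consistent with the standard two-sample formula for a standardized effect size. A minimal sketch, assuming conventions not stated in the registration (two-sided alpha = 0.05, 80% power, equal variances, and roughly 1,100 observations per group including the pilot):

```python
from math import sqrt

Z_ALPHA = 1.96  # two-sided 5% significance
Z_POWER = 0.84  # 80% power

def mde(n1: int, n2: int) -> float:
    """Minimum detectable standardized effect (Cohen's d) for a
    two-sample mean comparison with group sizes n1 and n2."""
    return (Z_ALPHA + Z_POWER) * sqrt(1 / n1 + 1 / n2)

n = 1000 + 100  # planned completions plus pilot observations per group
print(round(mde(n, n), 3))          # each treatment vs. its control: ~0.12
print(round(mde(2 * n, 2 * n), 3))  # pooled T1+T2 vs. C1+C2: ~0.09
```

Under these assumptions the formula reproduces the registered MDEs of approximately 0.12 s.d. per arm and 0.09 s.d. for the pooled comparison.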
IRB

Institutional Review Boards (IRBs)

IRB Name
Humanities and Social Sciences Research Ethics Committee, University of Warwick
IRB Approval Date
2024-04-24
IRB Approval Number
HSSREC 101/23-24