Timely Feedback and Toxic Communication

Last registered on October 23, 2025

Pre-Trial

Trial Information

General Information

Title
Timely Feedback and Toxic Communication
RCT ID
AEARCTR-0016553
Initial registration date
October 04, 2025

First published
October 06, 2025, 3:20 PM EDT

Last updated
October 23, 2025, 10:56 AM EDT

Locations

Region

Primary Investigator

Affiliation
Institute for International Economic Studies, Stockholm University

Other Primary Investigator(s)

PI Affiliation
Institute for International Economic Studies, Stockholm University

Additional Trial Information

Status
In development
Start date
2025-09-01
End date
2026-02-12
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
We detect toxic messages on a global online collaboration platform at regular intervals. When a message is classified as toxic, we send some of the senders a reminder about the toxicity of their message.
External Link(s)

Registration Citation

Citation
Au-Yeung, Huen Tat and Jinci Liu. 2025. "Timely Feedback and Toxic Communication." AEA RCT Registry. October 23. https://doi.org/10.1257/rct.16553-3.1
Experimental Details

Interventions

Intervention(s)
Intervention (Hidden)
Every hour, we apply the Detoxify model (Hanu and Unitary, 2020) to detect messages with a predicted toxicity score above 0.5. Flagged messages are then reviewed by GPT-4o to minimize false positives.

Users who send their first toxic message during the intervention period are randomly assigned to one of three conditions:
1. Private Reminder: A reminder posted in a short-lived thread that tags only the sender of the flagged message and is visible exclusively to them; the tag also delivers a notification via the platform’s email system.
2. Public Reminder: A reminder posted directly in the same thread as the flagged message, visible to all participants who read the thread.
3. Control: No reminder is sent.
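The flagging-and-assignment step can be sketched as follows. This is a minimal illustration, not the authors' implementation: `score_toxicity` is a hypothetical stub standing in for the actual Detoxify prediction and GPT-4o review, and the arm labels are illustrative.

```python
import random

TOXICITY_CUTOFF = 0.5
ARMS = ["private_reminder", "public_reminder", "control"]

assigned = {}  # user -> arm, fixed at the user's first flagged message


def score_toxicity(message):
    # Placeholder for Detoxify's predicted toxicity score (0-1);
    # in the trial, flagged messages are additionally reviewed by
    # GPT-4o to reduce false positives.
    return 0.9 if "idiot" in message.lower() else 0.1


def process_message(user, message, rng=random):
    """Flag a message and, if it is the user's first toxic one,
    randomly assign the user to an experimental condition."""
    if score_toxicity(message) <= TOXICITY_CUTOFF:
        return None            # not flagged: no action
    if user not in assigned:   # randomize on the first toxic message only
        assigned[user] = rng.choice(ARMS)
    return assigned[user]      # assignment is sticky thereafter
```

Note that randomization happens exactly once per user, at the first flagged message, matching the design described above.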
Intervention Start Date
2025-10-23
Intervention End Date
2025-12-18

Primary Outcomes

Primary Outcomes (end points)
Take-up and individual outcomes up to six weeks after the intervention, including features of messages and activity measures.
Primary Outcomes (explanation)
Take-up: We measure take-up of the reminder using two proxy indicators:
(1) Whether the reminder was successfully sent, and
(2) Whether any recipient clicked on the survey link or the opt-out link.

Message Features:
Our primary measure is message toxicity, assessed using the Detoxify model’s toxicity score. We also examine message positivity (by SiEBERT) and constructiveness (identified by GPT-4o).

Activities:
We distinguish between communication-related activities (e.g., comments) and productivity-related activities (e.g., lines of code, pull requests, and commits). We primarily focus on (1) the total number of activities and (2) the number of messages sent. We will also examine productivity measures: (1) lines of code submitted, and (2) the share of code accepted (defined as lines of code accepted divided by total lines of code submitted).
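The acceptance-share measure is a simple ratio of the two counts; a sketch (function name illustrative, not the authors' code):

```python
def acceptance_share(lines_accepted, lines_submitted):
    """Share of code accepted = lines accepted / total lines submitted."""
    if lines_submitted == 0:
        return None  # undefined when the user submitted no code
    return lines_accepted / lines_submitted
```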

Effects will be estimated using difference-in-differences (DiD) and event-study (ES) specifications, as described below.

Secondary Outcomes

Secondary Outcomes (end points)
Measures of collaboration and popularity, as well as the language of messages both within the thread and repository, observed up to six weeks after the intervention.

Survey responses, including opt-out behavior and willingness to subscribe (WTS) to a premium version of the platform that includes the reminder feature tested in this intervention.
Secondary Outcomes (explanation)
Collaboration within thread and repository: Measured as the number of messages exchanged between
(1) the sender and users involved in the same thread as the toxic message,
(2) the sender and other users active in the same repository, and
(3) all users within the repository.

Team Performance: Measured at the repository level, including the number of stars, forks, unique contributors, and new contributors who had no activity in the three months preceding the intervention.

Language of Messages within thread and repository: Measured using the same approach as in the primary outcomes, focusing on toxicity (via Detoxify), positivity (via SiEBERT), and constructiveness (identified by GPT-4o).

Willingness to Subscribe (WTS): Measured in survey responses, where participants indicate their willingness to pay for a premium version of the platform if the reminder feature is included. WTS is categorized into intervals of greater than 10%, 5–10%, and 1–5% for both increases and decreases, as well as “no change.” The midpoint of each interval is used for analysis (e.g., 12.5% for changes greater than 10%).
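The interval-to-midpoint recoding described above can be expressed as a lookup table. The category names are illustrative; the midpoint for the open-ended greater-than-10% bins follows the registry's 12.5% example, and the other midpoints are the arithmetic centers of their intervals.

```python
# Signed midpoints (in percent) for each WTS response category.
WTS_MIDPOINTS = {
    "increase_gt_10": 12.5,
    "increase_5_10": 7.5,
    "increase_1_5": 3.0,
    "no_change": 0.0,
    "decrease_1_5": -3.0,
    "decrease_5_10": -7.5,
    "decrease_gt_10": -12.5,
}


def wts_value(category):
    """Map a survey response category to its interval midpoint."""
    return WTS_MIDPOINTS[category]
```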

Experimental Design

Experimental Design
We detect potentially toxic messages using a state-of-the-art classifier and send reminders to a random subset of their senders. The reminders inform users that their messages may appear rude or disrespectful. We then track changes in message features and activity for up to six weeks after the intervention.
Experimental Design Details
Every hour, we apply the Detoxify model (Hanu and Unitary, 2020) to identify potentially toxic messages with a cutoff score of 0.5. Messages flagged by Detoxify are further validated using GPT-4o to reduce false positives. When a user sends their first toxic message during the intervention period, they are randomly assigned (by computer) to one of the experimental conditions. All subsequent messages by that user are tracked for six weeks to measure outcomes.

Treatment Assignment
We analyze effects at four levels:
1. The sender of the toxic message,
2. Users involved in the thread containing the toxic message,
3. Members of the repository containing the toxic message, and
4. All users involved in the repository who are not formal members.
In the baseline specification, we estimate intention-to-treat (ITT) effects using random assignment. Take-up is measured as described in the primary outcomes section:
(1) whether the reminder was successfully sent, and
(2) whether any recipient clicked on the survey link or the opt-out link.
To estimate the local average treatment effect (LATE), we instrument take-up with the random assignment.

Specification
We employ a difference-in-differences (DiD) specification to estimate the causal effect of the intervention. Additionally, we use an event-study (ES) specification to capture short-run effects of the reminder.

We will report both the first stage and the reduced form estimates.
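The registry does not write out the estimating equations. Under one standard parameterization (an assumption, not the authors' stated specification), with user fixed effects \(\alpha_i\), time fixed effects \(\lambda_t\), and \(E_i\) the week of user \(i\)'s first flagged message, the DiD and event-study regressions might take the form:

```latex
% Difference-in-differences: outcome y for user i in week t
y_{it} = \alpha_i + \lambda_t
       + \beta \,(\text{Treated}_i \times \text{Post}_t)
       + X_{it}'\gamma + \varepsilon_{it}

% Event study: leads and lags k relative to the first flagged message,
% with k = -1 as the omitted reference period
y_{it} = \alpha_i + \lambda_t
       + \sum_{k \neq -1} \beta_k \,\mathbf{1}\{t - E_i = k\}\times\text{Treated}_i
       + \varepsilon_{it}
```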

Empirical Details
We use LASSO for control selection, incorporating lagged outcomes (toxicity, activity, and baseline outcome measures up to six weeks before the intervention) and user characteristics (e.g., seniority, predicted gender, race, and location). We exclude lagged outcomes between 2025-10-08 17:00 and 2025-10-14 21:00 (UTC, inclusive) due to a technical failure in the API service.
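LASSO-based control selection of this kind can be sketched with cross-validated LASSO on a candidate control matrix; the synthetic data and variable names below are illustrative, since the registry does not publish its selection procedure.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 500, 20

# Candidate controls: stand-ins for lagged outcomes and user characteristics.
X = rng.normal(size=(n, p))
# Synthetic outcome driven by controls 0 and 3 only.
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=n)

# Cross-validated LASSO shrinks irrelevant coefficients to exactly zero;
# the surviving columns are the selected controls.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
selected = np.flatnonzero(lasso.coef_ != 0)
```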

We will report p-values computed with the rwolf2 command in Stata, which adjusts for multiple hypothesis testing, for the key primary outcomes for senders: toxicity, number of activities, and number of messages sent.

We also conduct heterogeneity analyses based on pre-intervention toxicity levels.
Randomization Method
Randomization is done automatically by the computer at the time a user sends their first toxic message.
Randomization Unit
The randomization is done at the user level.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
No clusters; randomization is at the user level.
Sample size: planned number of observations
The intervention will continue until reaching 500 users per condition or eight weeks of implementation, whichever occurs first.
Sample size (or number of clusters) by treatment arms
Equal distribution among three conditions.
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
Swedish Ethical Review Authority
IRB Approval Date
2025-08-26
IRB Approval Number
2025-04872-01

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials