Field: Trial End Date
Before: December 15, 2025
After: January 29, 2026

Field: Last Published
Before: October 06, 2025 03:20 PM
After: October 09, 2025 05:40 AM

Field: Intervention Start Date
Before: October 06, 2025
After: October 09, 2025

Field: Intervention End Date
Before: November 03, 2025
After: December 04, 2025

Field: Primary Outcomes (End Points)
Before: Individual outcomes up to 6 weeks after the intervention: features of messages, activities.
After: Take-up and individual outcomes up to six weeks after the intervention, including features of messages and activity measures.

Field: Primary Outcomes (Explanation)
Before:
Features of the messages: We focus on three features of messages: toxicity, positivity, and constructiveness. The toxicity classifier follows the same method as in the intervention, while positivity and constructiveness are identified with large language models.
Activities: We split activities into communication-related (comments) and productivity-related (lines of code, pull requests, commits).
After:
Take-up: We measure take-up of the reminder using two proxy indicators:
(1) Whether the reminder was successfully sent, and
(2) Whether any recipient clicked on the survey link or the opt-out link.
Message features: Our primary measure is message toxicity, assessed using the Detoxify model’s toxicity score. We also examine message positivity (classified with SiEBERT) and constructiveness (identified by GPT-4o).
Activities: We distinguish between communication-related activities (e.g., comments) and productivity-related activities (e.g., lines of code, pull requests, and commits).
Effects will be estimated using difference-in-differences (DiD) and event-study (ES) specifications, as described below.
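
As a rough illustration of how these message features could be scored, the sketch below uses the open-source Detoxify package for toxicity and the public SiEBERT sentiment model (via Hugging Face Transformers) for positivity. The function names, the conversion of SiEBERT's output into a positivity score, and the omission of the GPT-4o constructiveness step are assumptions for illustration, not the registered measurement code.

```python
# Illustrative sketch only, not the study's measurement code: score a batch of
# messages for toxicity (Detoxify) and positivity (SiEBERT). The GPT-4o
# constructiveness step is omitted here.
from detoxify import Detoxify
from transformers import pipeline

# Detoxify's "original" checkpoint returns a dict of per-message scores,
# including the "toxicity" score referenced as the primary measure.
toxicity_model = Detoxify("original")

# SiEBERT: a RoBERTa-large English sentiment classifier (POSITIVE / NEGATIVE).
sentiment_model = pipeline(
    "sentiment-analysis",
    model="siebert/sentiment-roberta-large-english",
)

def score_messages(messages):
    """Return toxicity and positivity scores for a list of message strings."""
    tox_scores = toxicity_model.predict(messages)["toxicity"]
    sentiments = sentiment_model(messages, truncation=True)
    rows = []
    for text, tox, sent in zip(messages, tox_scores, sentiments):
        rows.append({
            "text": text,
            "toxicity": float(tox),
            # One possible way to turn SiEBERT's label + confidence into a
            # single positivity measure in [0, 1].
            "positivity": sent["score"] if sent["label"] == "POSITIVE" else 1.0 - sent["score"],
        })
    return rows

if __name__ == "__main__":
    print(score_messages(["Thanks, this patch looks great!", "This code is garbage."]))
```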

Field: Experimental Design (Public)
Before: We detect potentially toxic messages using a state-of-the-art classifier and send reminders to a random subset of their senders. The reminders inform users that their messages may appear rude or disrespectful. We track changes in feedback composition and activity for up to six weeks after intervention.
After: We detect potentially toxic messages using a state-of-the-art classifier and send reminders to a random subset of their senders. The reminders inform users that their messages may appear rude or disrespectful. We then track changes in message features and activity for up to six weeks after the intervention.

Field: Planned Number of Observations
Before: All users who send at least one public message classified as toxic during the intervention period.
After: The intervention will continue until reaching 500 users per condition or eight weeks of implementation, whichever occurs first.

Field: Sample size (or number of clusters) by treatment arms
Before: Equal distribution among three treatment arms.
After: Equal distribution among three conditions.

Field: Keyword(s)
Before: Behavior, Crime Violence And Conflict, Firms And Productivity, Labor
After: Firms And Productivity, Labor

Field: Intervention (Hidden)
Before:
Every hour, we apply the Detoxify model (Hanu and Unitary, 2020) to identify messages with predicted toxicity above 0.5. Flagged messages are validated by GPT-4o to reduce false positives. Users who send their first toxic message during the four-week intervention period are randomly assigned to one of three arms:
(1) Private reminder: sent by email via the platform's notification settings (the reminder is posted in a short-lived thread that tags the sender of the flagged message), visible only to the sender;
(2) Public reminder: posted in the same thread as the flagged message;
(3) Control: no reminder.
After:
Every hour, we apply the Detoxify model (Hanu and Unitary, 2020) to detect messages with a predicted toxicity score above 0.5. Flagged messages are then reviewed by GPT-4o to minimize false positives.
Users who send their first toxic message during the intervention period are randomly assigned to one of three conditions:
1. Private Reminder: A notification sent via the platform’s email system. The reminder is posted in a short-lived thread tagging only the sender of the flagged message and is visible exclusively to them.
2. Public Reminder: A reminder posted directly in the same thread as the flagged message, visible to all participants who read the thread.
3. Control: No reminder is sent.
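
A minimal sketch of the flagging-and-assignment step described above, assuming messages arrive as simple dictionaries and that a placeholder callable stands in for the GPT-4o false-positive check; the data schema, function names, and randomization procedure are illustrative assumptions rather than the study's actual pipeline.

```python
# Illustrative sketch, not the study's pipeline: an hourly job that flags
# messages with Detoxify toxicity > 0.5, applies a second-stage LLM check, and
# randomly assigns first-time senders to one of three conditions.
import random
from detoxify import Detoxify

ARMS = ("private_reminder", "public_reminder", "control")
TOXICITY_THRESHOLD = 0.5

toxicity_model = Detoxify("original")
rng = random.Random(42)  # fixed seed so the sketch is reproducible

def hourly_run(messages, already_assigned, validate_with_llm):
    """messages: iterable of dicts with 'sender' and 'text' keys (assumed schema);
    already_assigned: set of user ids assigned in earlier runs;
    validate_with_llm: placeholder callable standing in for the GPT-4o check."""
    texts = [m["text"] for m in messages]
    scores = toxicity_model.predict(texts)["toxicity"]
    assignments = {}
    for message, score in zip(messages, scores):
        if score <= TOXICITY_THRESHOLD:
            continue  # below the 0.5 toxicity cutoff
        if not validate_with_llm(message["text"]):
            continue  # second-stage check filters likely false positives
        sender = message["sender"]
        if sender in already_assigned or sender in assignments:
            continue  # only a user's first flagged message triggers assignment
        assignments[sender] = rng.choice(ARMS)  # equal probability across arms
    return assignments
```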

Field: Secondary Outcomes (End Points)
Before:
Measures of collaboration and popularity, language of messages within the repository, up to 6 weeks after the intervention.
Willingness to subscribe (WTS) in the survey for a premium version if toxicity detection is added.
After:
Measures of collaboration and popularity, as well as the language of messages both within the thread and repository, observed up to six weeks after the intervention.
Survey responses, including opt-out behavior and willingness to subscribe (WTS) to a premium version of the platform that includes the reminder feature tested in this intervention.

Field: Secondary Outcomes (Explanation)
Before:
Team performance: measured at the repository level, including the number of stars, forks, and unique contributors.
WTS is given in intervals of >10%, 5-10%, and 1-5% for both increase and decrease, and no change. The midpoint is used for the intervals. For changes >10%, we use 12.5%.
After:
Collaboration within thread and repository: Measured as the number of messages exchanged between
(1) the sender and users involved in the same thread as the toxic message,
(2) the sender and other users active in the same repository, and
(3) all users within the repository.
Team Performance: Measured at the repository level, including the number of stars, forks, unique contributors, and new contributors who had no activity in the three months preceding the intervention.
Language of Messages within thread and repository: Measured using the same approach as in the primary outcomes, focusing on toxicity (via Detoxify), positivity (via SiEBERT), and constructiveness (identified by GPT-4o).
Willingness to Subscribe (WTS): Measured in survey responses, where participants indicate their willingness to pay for a premium version of the platform if the reminder feature is included. WTS is categorized into intervals of greater than 10%, 5–10%, and 1–5% for both increases and decreases, as well as “no change.” The midpoint of each interval is used for analysis (e.g., 12.5% for changes greater than 10%).
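
For concreteness, the small helper below maps WTS response categories to the interval midpoints described above (12.5 for changes greater than 10%, 7.5 for 5–10%, 3 for 1–5%, 0 for no change, and the corresponding negative values for decreases); the category labels themselves are hypothetical, since the survey's actual response codes are not given here.

```python
# Illustrative sketch: convert WTS survey categories into the interval midpoints
# used for analysis. Only the midpoint values follow from the text; the
# category labels are assumed placeholders.
WTS_MIDPOINTS = {
    "increase_gt_10": 12.5,
    "increase_5_10": 7.5,
    "increase_1_5": 3.0,
    "no_change": 0.0,
    "decrease_1_5": -3.0,
    "decrease_5_10": -7.5,
    "decrease_gt_10": -12.5,
}

def wts_midpoint(response: str) -> float:
    """Return the midpoint (in percent) for a WTS survey response category."""
    return WTS_MIDPOINTS[response]

if __name__ == "__main__":
    assert wts_midpoint("increase_5_10") == 7.5
    assert wts_midpoint("decrease_gt_10") == -12.5
```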