
Fields Changed

Registration

Field Before After
Trial End Date
  Before: December 15, 2025
  After: January 29, 2026
Last Published
  Before: October 06, 2025 03:20 PM
  After: October 09, 2025 05:40 AM
Intervention Start Date
  Before: October 06, 2025
  After: October 09, 2025
Intervention End Date
  Before: November 03, 2025
  After: December 04, 2025
Primary Outcomes (End Points)
  Before: Individual outcomes up to 6 weeks after the intervention: features of messages, activities.
  After: Take-up and individual outcomes up to six weeks after the intervention, including features of messages and activity measures.
Primary Outcomes (Explanation)
  Before:
    Features of the messages: We focus on three features of messages: toxicity, positivity, and constructiveness. The toxicity classifier follows the same method as in the intervention, while positivity and constructiveness are identified with large language models.
    Activities: We split activities for communication-related (comments) and productivity-related (lines of code, pull requests, commits).
  After:
    Take-up: We measure take-up of the reminder using two proxy indicators: (1) Whether the reminder was successfully sent, and (2) Whether any recipient clicked on the survey link or the opt-out link.
    Message Features: Our primary measure is message toxicity, assessed using the Detoxify model’s toxicity score. We also examine message positivity (by SiEBERT) and constructiveness (identified by GPT-4o).
    Activities: We distinguish between communication-related activities (e.g., comments) and productivity-related activities (e.g., lines of code, pull requests, and commits).
    Effects will be estimated using difference-in-differences (DiD) and event-study (ES) specifications, as described below.
Experimental Design (Public)
  Before: We detect potentially toxic messages using a state-of-the-art classifier and send reminders to a random subset of their senders. The reminders inform users that their messages may appear rude or disrespectful. We track changes in feedback composition and activity for up to six weeks after intervention.
  After: We detect potentially toxic messages using a state-of-the-art classifier and send reminders to a random subset of their senders. The reminders inform users that their messages may appear rude or disrespectful. We then track changes in message features and activity for up to six weeks after the intervention.
Planned Number of Observations
  Before: All users who send at least one public message classified as toxic during the intervention period.
  After: The intervention will continue until reaching 500 users per condition or eight weeks of implementation, whichever occurs first.
Sample size (or number of clusters) by treatment arms
  Before: Equal distribution among three treatment arms.
  After: Equal distribution among three conditions.
Keyword(s)
  Before: Behavior, Crime Violence And Conflict, Firms And Productivity, Labor
  After: Firms And Productivity, Labor
Intervention (Hidden)
  Before: Every hour, we apply the Detoxify model (Hanu and Unitary, 2020) to identify messages with predicted toxicity above 0.5. Flagged messages are validated by GPT-4o to reduce false positives. Users who send their first toxic message during the four-week intervention period are randomly assigned to one of three arms: (1) Private reminder: sent by email via platform notification setting (reminder is posted in a short-lived thread that tags the sender of the flagged message), visible only to the sender; (2) Public reminder: posted in the same thread as the flagged message; (3) Control: no reminder.
  After: Every hour, we apply the Detoxify model (Hanu and Unitary, 2020) to detect messages with a predicted toxicity score above 0.5. Flagged messages are then reviewed by GPT-4o to minimize false positives. Users who send their first toxic message during the intervention period are randomly assigned to one of three conditions:
    1. Private Reminder: A notification sent via the platform’s email system. The reminder is posted in a short-lived thread tagging only the sender of the flagged message and is visible exclusively to them.
    2. Public Reminder: A reminder posted directly in the same thread as the flagged message, visible to all participants who read the thread.
    3. Control: No reminder is sent.
Secondary Outcomes (End Points)
  Before: Measures of collaboration and popularity, language of messages within the repository, up to 6 weeks after the intervention. Willingness to subscribe (WTS) in the survey for the premium if toxicity detection is added.
  After: Measures of collaboration and popularity, as well as the language of messages both within the thread and repository, observed up to six weeks after the intervention. Survey responses, including opt-out behavior and willingness to subscribe (WTS) to a premium version of the platform that includes the reminder feature tested in this intervention.
Secondary Outcomes (Explanation)
  Before:
    Team performance: measured at the repository level, including the number of stars, forks, and unique contributors.
    WTS is given in intervals of >10%, 5-10%, and 1-5% for both increase and decrease, and no change. The midpoint is used for the intervals. For changes >10%, we use 12.5%
  After:
    Collaboration within thread and repository: Measured as the number of messages exchanged between (1) the sender and users involved in the same thread as the toxic message, (2) the sender and other users active in the same repository, and (3) all users within the repository.
    Team Performance: Measured at the repository level, including the number of stars, forks, unique contributors, and new contributors who had no activity in the three months preceding the intervention.
    Language of Messages within thread and repository: Measured using the same approach as in the primary outcomes, focusing on toxicity (via Detoxify), positivity (via SiEBERT), and constructiveness (identified by GPT-4o).
    Willingness to Subscribe (WTS): Measured in survey responses, where participants indicate their willingness to pay for a premium version of the platform if the reminder feature is included. WTS is categorized into intervals of greater than 10%, 5–10%, and 1–5% for both increases and decreases, as well as “no change.” The midpoint of each interval is used for analysis (e.g., 12.5% for changes greater than 10%).
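
The assignment step described under Intervention (Hidden) could be sketched roughly as follows. This is an illustrative sketch only, not the registered implementation: the tuple layout of `messages`, the condition labels, the `seed` parameter, and the use of simple (unstratified) randomization are all assumptions. The toxicity score and the second-stage review result are taken as given inputs rather than computed here.

```python
import random

# Hypothetical labels for the three registered conditions.
CONDITIONS = ("private_reminder", "public_reminder", "control")
TOXICITY_THRESHOLD = 0.5  # the registration flags scores above 0.5


def assign_conditions(messages, seed=0):
    """Assign each user to a condition at their first confirmed toxic message.

    `messages` is an iterable of (user_id, toxicity_score, confirmed) tuples
    in chronological order, where `confirmed` stands in for the second-stage
    review of flagged messages (assumed input format).
    """
    rng = random.Random(seed)
    assignment = {}
    for user_id, score, confirmed in messages:
        # Skip messages that are not flagged or not confirmed on review.
        if score <= TOXICITY_THRESHOLD or not confirmed:
            continue
        # Only the first qualifying message triggers assignment.
        if user_id not in assignment:
            assignment[user_id] = rng.choice(CONDITIONS)
    return assignment
```

Simple randomization like this yields equal shares only in expectation; a design that needs exactly equal condition sizes would use block randomization instead, which the registration does not specify.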
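
The midpoint coding of willingness to subscribe (WTS) described under Secondary Outcomes (Explanation) might look like the sketch below. The category labels are hypothetical, and coding decreases as negative values is an assumption; the registration states only the intervals, the use of midpoints, and the 12.5% value for changes greater than 10%.

```python
# Hypothetical survey labels mapped to interval midpoints (percent change).
# Negative signs for decreases are an assumption made for illustration.
WTS_MIDPOINTS = {
    "increase >10%": 12.5,   # value stated in the registration
    "increase 5-10%": 7.5,   # midpoint of 5 and 10
    "increase 1-5%": 3.0,    # midpoint of 1 and 5
    "no change": 0.0,
    "decrease 1-5%": -3.0,
    "decrease 5-10%": -7.5,
    "decrease >10%": -12.5,
}


def wts_to_numeric(response: str) -> float:
    """Convert a categorical WTS response to its numeric midpoint."""
    return WTS_MIDPOINTS[response.strip().lower()]
```

An unrecognized label raises `KeyError`, which surfaces miscoded survey responses instead of silently dropping them.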