LLMs and e-mail marketing Follow-up Study

Last registered on October 18, 2024

Pre-Trial

Trial Information

General Information

Title
LLMs and e-mail marketing Follow-up Study
RCT ID
AEARCTR-0014483
Initial registration date
October 11, 2024

First published
October 18, 2024, 4:59 PM EDT

Locations

Region

Primary Investigator

Affiliation
University of Chicago Booth School of Business

Other Primary Investigator(s)

PI Affiliation
University of Chicago Booth School of Business

Additional Trial Information

Status
In development
Start date
2024-09-27
End date
2024-10-31
Secondary IDs
L00, C9
Prior work
This trial is based on or builds upon one or more prior RCTs.
Abstract
We aim to test the effectiveness of large language models (LLMs) in producing “newsletter creative” (NC) compared to human writers. We collaborate with Wine Access (WA), an online wine retailer, to run a randomized controlled experiment (RCT) over a 20-day period in which respondents receive NC generated by one of three sources:
1. Test 1 (size = 9,000): The respondent receives daily WA newsletters created by the human writer team.
2. Test 2 (size = 9,000): The respondent receives daily WA newsletters created by the LLM.
3. Test 3 (size = 9,000): The respondent receives daily WA newsletters created by the human marketing team, who have access to and can tune the LLM.
At the start of the experiment, each subject will be randomly assigned to one of the experimental cells above and will remain in that cell throughout the 20-day duration of the experiment. Each cell will have a human fact-checker who is independent of the human writer team, so that the newsletter contains accurate information. The effectiveness of NC can be assessed by comparing multiple outcome variables across cells, such as site visits, revenues, and profits.
External Link(s)

Registration Citation

Citation
Dube, Jean-Pierre and Ningyin Xu. 2024. "LLMs and e-mail marketing Follow-up Study." AEA RCT Registry. October 18. https://doi.org/10.1257/rct.14483-1.0
Experimental Details

Interventions

Intervention(s)
The randomized controlled experiment (RCT) will run for a 20-day period with the following cells in parallel:
1. Test 1 (size = 9,000): The respondent receives daily WA newsletters created by the human writer team.
2. Test 2 (size = 9,000): The respondent receives daily WA newsletters created by the LLM.
3. Test 3 (size = 9,000): The respondent receives daily WA newsletters created by the human marketing team, who have access to and can tune the LLM.
Intervention (Hidden)
Intervention Start Date
2024-09-30
Intervention End Date
2024-10-19

Primary Outcomes

Primary Outcomes (end points)
Purchases and profits
Primary Outcomes (explanation)
We are interested in whether an LLM can produce e-mail creatives that are at least as effective (gross and net of costs) as the creatives produced by a salaried team of writers. We are also interested in whether the human-tuned LLM can perform as well or better.

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
We will randomly assign approximately 27,000 customers to 3 experimental cells:
1. base intervention (creative designed by human team)
2. LLM intervention (creative designed by LLM)
3. hybrid intervention (creative designed by human marketing team after receiving creative from the LLM)
Experimental Design Details
Randomization Method
random number generator
Randomization Unit
individual customer identification number
Was the treatment clustered?
No
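The individual-level assignment described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function name `assign_cells`, the cell labels, the seed, and the use of integer customer IDs are all assumptions made for the example. It shuffles customer IDs with a seeded random number generator and splits them into three equal cells.

```python
import random
from collections import Counter


def assign_cells(customer_ids, cells=("human", "llm", "hybrid"), seed=0):
    """Randomly assign each customer ID to exactly one cell, with equal cell sizes.

    All names and defaults here are hypothetical; the registry record only
    states that a random number generator is used at the customer-ID level.
    """
    ids = sorted(customer_ids)  # fixed starting order so the seed reproduces the draw
    if len(ids) % len(cells) != 0:
        raise ValueError("number of customers must divide evenly across cells")
    rng = random.Random(seed)
    rng.shuffle(ids)
    per_cell = len(ids) // len(cells)
    # The first per_cell shuffled IDs go to cells[0], the next to cells[1], and so on.
    return {cid: cells[i // per_cell] for i, cid in enumerate(ids)}


# Example with 27,000 placeholder IDs, matching the planned sample size.
assignment = assign_cells(range(27_000))
cell_sizes = Counter(assignment.values())
```

Because the assignment is made once and stored per customer ID, each customer stays in the same cell for the full 20 days.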

Experiment Characteristics

Sample size: planned number of clusters
N/A
Sample size: planned number of observations
approximately 27,000 customers
Sample size (or number of clusters) by treatment arms
9,000 in each intervention arm
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
Using historical data, for WA to select the human policy over the LLM policy, the bi-weekly treatment effect of the human writing team's NCs would need to be $0.5328 higher on a per-customer basis than that of the LLM. We assume the standard deviation in profits is the same in the human and LLM cells. Under the symmetric asymptotic minimax regret (AMMR) approach in Joo and Chiong (2024), we find that the sample size at which the maximum possible regret does not exceed the realized regret, with a 5% probability of a type II error, is 2,993 customers per cell.
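For comparison only, a conventional two-sample power calculation can be sketched as below. This is not the AMMR procedure cited above and is not expected to reproduce the 2,993 figure. It uses the textbook normal-approximation formula for a difference in means with equal standard deviations. The function name `n_per_cell`, the two-sided 5% significance level, and any value passed for `sigma` are assumptions; the record does not report the standard deviation of profits.

```python
from math import ceil
from statistics import NormalDist


def n_per_cell(delta, sigma, alpha=0.05, power=0.95):
    """Classical per-cell sample size for detecting a mean difference `delta`.

    Assumes equal standard deviation `sigma` in both cells and a two-sided
    test at level `alpha`; power = 1 - P(type II error).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# delta is the $0.5328 per-customer threshold from the record;
# sigma = 10.0 is a purely hypothetical placeholder.
example_n = n_per_cell(delta=0.5328, sigma=10.0)
```

The required sample grows with the square of `sigma / delta`, so the result is very sensitive to the assumed profit variance.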
Supporting Documents and Materials

There is information in this trial unavailable to the public.
IRB

Institutional Review Boards (IRBs)

IRB Name
University of Chicago Social & Behavioral Sciences IRB
IRB Approval Date
2024-10-11
IRB Approval Number
IRB24-1675

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials