
Fields Changed

Registration

Trial Start Date
Before: February 10, 2019
After: April 30, 2019

Last Published
Before: January 25, 2019 03:31 AM
After: April 30, 2019 03:29 PM

Intervention Start Date
Before: February 10, 2019
After: April 30, 2019

Primary Outcomes (End Points)
Before: Elicitation of the willingness to accept a payment in order to complete a task across different treatments. Thus the question is whether the framing as doing extra work 'before' rather than 'after' - while holding the actual consequences constant - leads to a change in willingness to work, which it cannot under any broadly framed theory.
After: Elicitation of the willingness to accept a payment in order to complete a task across different treatments. Thus the question is whether the framing as doing extra work 'before' rather than 'after' - while holding the actual consequences constant - leads to a change in willingness to work, which it cannot under any broadly framed theory. We will compare this to choices where we enforce broad bracketing, by making the actual change salient.
Primary Outcomes (Explanation)
Before: We will ask subjects at what price they will be willing to complete a task, based on a piece-rate payment. We will elicit their choices in two different ways.
After: We will ask subjects at what price they will be willing to complete a task. We will elicit their choices in two different ways. There are 5 treatments (WTW stands for 'Willingness to Work'):
- BEFORE ONLY: Subjects are asked for their WTW for additional tasks when there are no required tasks.
- NARROW UNSPECIFIED: Subjects are asked for their WTW for additional tasks when they know there are required tasks. They are not told whether these tasks are done before or after the main tasks.
- NARROW BEFORE: Subjects are asked for their WTW for additional tasks before doing some required tasks.
- NARROW AFTER: Subjects are asked for their WTW for additional tasks after doing some required tasks.
- BROAD: Subjects are asked for their WTW for additional tasks when it is made clear that they are in addition to the required tasks.
The primary outcomes are the willingness to work for the different treatment groups. We have 3 main comparisons, plus 2 additional robustness checks. Our main hypotheses are the following: BEFORE ONLY <= NARROW BEFORE <= NARROW AFTER <= BROAD. The additional hypotheses are: BEFORE ONLY <= NARROW UNSPECIFIED <= BROAD.
Randomization Method
Before: Randomization done through the recruitment platform Orsee on a students' subject pool from the University of Luxembourg.
After: Randomization done through MTurk for the online experiment and through the recruitment platform Orsee for the laboratory part.
Planned Number of Clusters
Before: 380 student subjects recruited through Orsee (Greiner, 2015), 280 for the two main between-subjects treatments and 100 for the within-subjects treatment. We will initially conduct the two main treatments. The within-subjects treatment will be built as a follow-up.
After:
- Initial run: 60 MTurkers, 30 per treatment (for preliminary results for a conference; deadline May 1st, 2019).
- Pilot: 120 MTurkers, 40 per treatment (W10, W40, W70).
- Main: 700 MTurkers, 140 per main treatment (5 main treatments).
Planned Number of Observations
Before: 380 student subjects recruited through Orsee (Greiner, 2015)
After: 880 MTurkers
Sample size (or number of clusters) by treatment arms
Before: 140 subjects for each of the two between-subjects treatments (80 with the slider elicitation method and 60 with the price list) and 100 subjects in the within-subjects treatment.
After: 60 subjects total (MTurk): 30 subjects for each of NARROW BEFORE and NARROW AFTER.
- Brief explanation (April 30th, 2019, 22:00 CET): Ideally we would not run this yet, but due to a conference deadline we feel we need some preliminary results.
120 subjects total (MTurk): 40 subjects per group in W10, W40, and W70, with a low-effort (W10), medium-effort (W40), and high-effort (W70) group. Each of those groups is asked for their WTW after having done their required work, given by 10, 40, or 70 tasks.
- Explanation: This is needed to test whether tasks done early are less tedious than when done later, which is an identifying assumption of ours. Since we won't ask subjects in these groups to choose future work, nor compare any narrow choice with a broad choice, we cannot use this to bias our results. We also use this to find out whether the slider task or the price list is more precise; we will use whichever has the lower variance (higher precision) across the three effort-level groups. If they are too similar (that is, neither of them is particularly different), then we will go with a 60% slider, 40% price list split for the main groups.
140 subjects for each of the 5 main treatments and 100 subjects in the within-subjects treatment.
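The planned MTurk arm sizes above can be cross-checked against the 880 observations reported under Planned Number of Observations. A minimal sketch in Python, with the group sizes copied from the registry text:

```python
# Cross-check of the planned MTurk sample sizes (figures taken from the registry text).
initial_run = 2 * 30   # NARROW BEFORE and NARROW AFTER, 30 each (preliminary run)
pilot = 3 * 40         # W10, W40, W70 effort-level groups, 40 each
main = 5 * 140         # five main treatments, 140 each

total = initial_run + pilot + main
print(total)  # 880, matching the planned number of observations
```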
Power calculation: Minimum Detectable Effect Size for Main Outcomes
Before: Based on an expected effect size of d = 0.4, we assign 140 observations to each of the two treatments. This gives us 90% power to detect the effect size at the 5% level of significance.
After: Based on an expected effect size of d = 0.4, we assign 140 observations to each of the two treatments for our main comparison between NARROW BEFORE and NARROW AFTER. This gives us 90% power to detect the effect size at the 5% level of significance. Similar considerations apply to each of the other treatment comparisons.
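The stated power claim can be reproduced with a standard normal-approximation formula for a two-sample comparison of means. A minimal sketch, assuming a two-sided test at alpha = 0.05 and equal group sizes (the registry states only the 5% level, so the two-sided normal approximation is an assumption here):

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def power_two_sample(d, n_per_group, z_crit=1.959964):
    # Normal-approximation power for a two-sided test at alpha = 0.05,
    # comparing two independent means with standardized effect size d.
    return norm_cdf(d * sqrt(n_per_group / 2.0) - z_crit)

print(round(power_two_sample(0.4, 140), 2))  # ~0.92, consistent with the stated 90% power
```

With d = 0.4 and 140 subjects per arm the approximation gives slightly above 90% power, so the registry's claim is conservative.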
Secondary Outcomes (End Points)
Before: We want to measure whether there is a correlation between subjects' level of narrow bracketing in deterministic work choices and narrow bracketing in risky choices; and whether there is more narrow bracketing when the metric for the extra work is different from the metric for the main work (that is, it is expressed as a piece rate, $0.40 per task, rather than $4 for doing 10 tasks), compared to when the metric is the same. A further analysis will be done on an extra within-subjects treatment, where both before/after choices will be proposed. We will test whether people make the mistake when they see both choices, controlling for an order effect.
After: We want to measure whether there is a correlation between subjects' level of narrow bracketing in deterministic work choices and narrow bracketing in risky choices; and whether there is more narrow bracketing when the metric for the extra work is different from the metric for the main work (that is, it is expressed as a piece rate, $0.40 per task, rather than $4 for doing 10 tasks), compared to when the metric is the same. A further analysis will be done on an extra within-subjects treatment, where we use questions from two treatments. We will test whether people make the same mistake when they see both choices, controlling for an order effect.
Secondary Outcomes (Explanation)
It may be that people bracket narrowly, but not if they see the broadly bracketed version first. Thus a person who is asked for their WTW for 40 tasks rather than 30, and then asked for their willingness to do 10 tasks before doing the 30, may realize that these questions are the same, and thus broadly bracket the second question. If asked first for their WTW for 10 tasks before 30, and then for their WTW for 40 rather than 30, their answer to the "10 before 30" question may be different because they did not realize that it is about doing 40 rather than 30 tasks. Thus we want to measure whether the same question leads to different answers depending on when people are asked the question.

One concern is that people may either use heuristics to make decisions faster ("This is 10 extra tasks, so I'll give the same answer as before") or want to be consistent with their past choices once they realize they are the same ("Oh, 40 vs 30 tasks is the same as my previous answer, I should give the same answer") rather than admit they might have gotten it wrong (Augenblick and Rabin (2018) do find that this effect is quite strong in their experiment, when subjects are reminded of their past choice). Therefore this will not cleanly establish which choice people think is a mistake, but together with the between-subjects design it should shed light on it.

Ignoring these other concerns (heuristics, desire for consistency), we will use these answers to create a measure of narrow bracketing at the individual level: the degree to which the BROAD answer is different from the NARROW answer, and we'll do so accounting for order effects. The reason for testing the correlation between individual-level narrow bracketing in our context and in risky choices (based on our within-subjects treatment) is straightforward: we want to see if there are people who are more likely to narrow bracket across different types of settings.