
Fields Changed

Registration

Trial Start Date
  Before: April 30, 2019
  After:  October 09, 2019

Trial End Date
  Before: December 31, 2019
  After:  February 29, 2020

Last Published
  Before: April 30, 2019 03:29 PM
  After:  October 11, 2019 12:57 AM
Intervention (Public)
  Before: We test the concept of narrow bracketing in deterministic choices over work, which are relevant to the labor market.
  After:  We test the concept of narrow and broad bracketing in deterministic choices over work, which are relevant to the labor market.
Intervention Start Date
  Before: April 30, 2019
  After:  October 09, 2019

Intervention End Date
  Before: December 31, 2019
  After:  February 29, 2020
Primary Outcomes (End Points)
  Before: Elicitation of the willingness to accept a payment in order to complete a task across different treatments. Thus the question is whether the framing as doing extra work 'before' rather than 'after' - while holding the actual consequences constant - leads to a change in willingness to work, which it cannot under any broadly framed theory. We will compare this to choices where we enforce broad bracketing, by making the actual change salient.
  After:  See october-2019-design.pdf for the change in design. The following previous design description is included for completeness, but is *NOT* what we are currently planning on running. Elicitation of the willingness to accept a payment in order to complete a task across different treatments. Thus there are two questions, linked to the framing of doing extra work: the first concerns doing extra work after a (changing) fixed amount of mandatory work; the second is whether the framing as doing extra work 'before' rather than 'after' - while holding the actual consequences constant - leads to a change in willingness to work, which it cannot under any broadly framed theory. We will compare this to choices where we enforce broad bracketing, by making the actual change salient.
Primary Outcomes (Explanation)
  Before: We will ask subjects at what price they will be willing to complete a task. We will elicit their choices in two different ways. There are 5 treatments (WTW stands for 'Willingness to Work'):
    - BEFORE ONLY: Subjects are asked for their WTW for additional tasks when there are no required tasks.
    - NARROW UNSPECIFIED: Subjects are asked for their WTW for additional tasks when they know there are required tasks. They are not told whether these tasks are done before or after the main tasks.
    - NARROW BEFORE: Subjects are asked for their WTW for additional tasks before doing some required tasks.
    - NARROW AFTER: Subjects are asked for their WTW for additional tasks after doing some required tasks.
    - BROAD: Subjects are asked for their WTW for additional tasks when it is made clear that they are in addition to the required tasks.
  The primary outcomes are the willingness to work for the different treatment groups. We have 3 main comparisons, plus 2 additional robustness checks. Our main hypotheses are the following: BEFORE ONLY <= NARROW BEFORE <= NARROW AFTER <= BROAD. The additional hypotheses are: BEFORE ONLY <= NARROW UNSPECIFIED <= BROAD.
  After:  See october-2019-design.pdf for the change in design. The following previous design description is included for completeness, but is *NOT* what we are currently planning on running. In each treatment we will ask subjects to complete a required task and then elicit their willingness to complete extra tasks. We will compare in each treatment the total number of tasks performed.
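The hypothesized ordering of willingness to work across arms amounts to a chain of one-sided two-sample comparisons between adjacent treatments. A minimal sketch of that analysis, using synthetic data only: the helper `welch_one_sided`, the treatment means, and the normal approximation are our illustrative assumptions, not the registered analysis plan.

```python
import random
from statistics import NormalDist, mean, variance

def welch_one_sided(a, b):
    """Approximate one-sided Welch test of H1: mean(b) > mean(a).
    Uses a normal approximation to the t distribution, which is
    adequate at the sample sizes planned here (~140 per arm)."""
    na, nb = len(a), len(b)
    se = (variance(a) / na + variance(b) / nb) ** 0.5
    z = (mean(b) - mean(a)) / se
    return 1 - NormalDist().cdf(z)  # approximate p-value

random.seed(1)
# Hypothetical WTW draws per arm, with means ordered as the
# registered hypothesis predicts:
# BEFORE ONLY <= NARROW BEFORE <= NARROW AFTER <= BROAD.
arms = {name: [random.gauss(mu, 2.0) for _ in range(140)]
        for name, mu in [("BEFORE ONLY", 4.0), ("NARROW BEFORE", 4.5),
                         ("NARROW AFTER", 5.0), ("BROAD", 6.0)]}

names = list(arms)
for lo, hi in zip(names, names[1:]):
    p = welch_one_sided(arms[lo], arms[hi])
    print(f"{lo} vs {hi}: p = {p:.4f}")
```

In a real analysis one would also adjust for the multiple adjacent comparisons; the sketch only shows the direction of each test.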
Experimental Design (Public)
  Before: In a laboratory experiment, using a real effort task, we measure whether psychological factors affect the decisions to work extra time.
  After:  In an online experiment, using a real effort task, we measure whether psychological factors affect the decisions to work extra time.
Randomization Method
  Before: Randomization done through MTurk for the online experiment and through the recruitment platform ORSEE for the laboratory part.
  After:  Randomization done through MTurk.
Planned Number of Clusters
  Before:
    Initial run: 60 MTurkers, 30 per treatment (for preliminary results for a conference; deadline May 1st, 2019)
    Pilot: 120 MTurkers, 40 per treatment (W10, W40, W70)
    Main: 700 MTurkers, 140 per main treatment (5 main treatments)
  After:  Experiment on MTurk.
Planned Number of Observations
  Before: 880 MTurkers
  After:  450 for the one-day design. 2-day design needs fleshing out.
Sample size (or number of clusters) by treatment arms
  Before: 60 subjects total (MTurk): 30 subjects for each of NARROW BEFORE and NARROW AFTER. Brief explanation (April 30th, 2019, 22:00 CET): ideally we would not run this yet, but due to a conference deadline we feel we need some preliminary results.
  120 subjects total (MTurk): 40 subjects per group in W10, W40, and W70, with a low-effort (W10), medium-effort (W40), and high-effort (W70) group. Each of those groups is asked for their WTW after having done their required work, given by 10, 40, or 70 tasks. Explanation: this is needed to test whether tasks done early are less tedious than when done later, which is an identifying assumption of ours. Since we won't ask subjects in these groups to choose future work, nor compare any narrow choice with a broad choice, we cannot use this to bias our results. We also use this to find out whether the slider task or the price list is more precise; we will use whichever has the lower variance (higher precision) across the three effort-level groups. If they are too similar (that is, neither of them is particularly different) then we will go with a 60% slider, 40% price list split for the main groups.
  140 subjects for each of the 5 main treatments and 100 subjects in the within-subjects treatment.
  After: 90 for each of the four main treatments, 45 for the minor treatments. See october-2019-design.pdf for details.
Power calculation: Minimum Detectable Effect Size for Main Outcomes
  Before: Based on an expected effect size d = 0.4, we assign 140 observations to each of the two treatments for our main comparison between NARROW BEFORE and NARROW AFTER. This gives us 90% power to detect the effect size at the 5% level of significance. Similar numbers of observations are considered for each of the other treatment comparisons.
  After:  Assuming a similar level of narrowly bracketing individuals across treatments, for a given uniform distribution of the WTA the expected effect size between total tasks in the two NARROW treatments is d = 1.10. As this may seem a little optimistic, we plan to collect a total of 90 subjects per treatment, which would give us 90% power to detect an effect size of 0.5 at the 5% level of significance. The same number of observations will be collected for the BROAD treatments, where we expect no difference in the total number of tasks. Based on an expected effect size d = 0.4, we assign 140 observations to each of the two treatments for our main comparison between NARROW BEFORE and NARROW AFTER, if we end up running this design (see october-2019-design.pdf). This would give us 90% power to detect the effect size at the 5% level of significance.
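The per-arm sample sizes quoted above can be cross-checked with the standard normal-approximation formula for a two-sided, two-sample comparison, n = 2((z_{1-alpha/2} + z_{1-beta}) / d)^2. A minimal sketch (the function name is ours):

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(d, alpha=0.05, power=0.90):
    """Normal-approximation sample size per arm for a two-sided
    two-sample test of standardized effect size d."""
    z = NormalDist()
    za = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    zb = z.inv_cdf(power)          # power requirement
    return ceil(2 * ((za + zb) / d) ** 2)

print(n_per_arm(0.4))  # 132, consistent with assigning 140 per arm
print(n_per_arm(0.5))  # 85, consistent with collecting 90 per arm
```

Both registered figures (140 and 90 per arm) sit slightly above these minima, so the stated 90% power claims check out.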
Intervention (Hidden)
  Before: We test whether decisions for extra work are narrowly bracketed even in the absence of uncertainty: whether people make decisions for extra work by thinking only about the direct disutility incurred from doing the extra work, or whether they also take into account the indirect effects of this extra work on other work they already have to complete. For example, the direct disutility of replying to emails is exactly the cost of writing and sending them; whereas the full cost includes indirect costs such as having to work later that day due to the time spent on replying to emails. Our main hypothesis is that people do more work when asked to do it 'before' rather than 'after' a given, predetermined amount of work. This follows a literature on narrow bracketing, but differs importantly insofar as it is not about a failure to combine different lotteries (something that might arguably be hard in some cases), but simply about forgetting or not paying attention to the time and tiredness incurred from doing the current work. Although the total amount of work requested is the same under the two conditions, a person who narrowly brackets may nonetheless act differently, since the extra tasks may be perceived differently if done before or after the required work. The experiment will be based on a transcription task similar to the one used by Augenblick and Rabin (2015). The particular feature of this task is that learning will be minimized. In a separate experiment we will test subjects' perception of the task's toughness over time. Together with the test in effort choices, we will also link the experimental choices to the standard findings of narrow bracketing in risky choices, to see if the two correlate.
  After:  See october-2019-design.pdf for the change in design. The following previous design description is included for completeness, but is *NOT* what we are currently planning on running. We test whether decisions for extra work are narrowly bracketed even in the absence of uncertainty: whether people make decisions for extra work by thinking only about the direct disutility incurred from doing the extra work, or whether they also take into account the indirect effects of this extra work on other work they already have to complete. For example, the direct disutility of replying to emails is exactly the cost of writing and sending them; whereas the full cost includes indirect costs such as having to work later that day due to the time spent on replying to emails. Our main hypothesis is that people do more work when asked to do it 'before' rather than 'after' a given, predetermined amount of work. This follows a literature on narrow bracketing, but differs importantly insofar as it is not about a failure to combine different lotteries (something that might arguably be hard in some cases), but simply about forgetting or not paying attention to the time and tiredness incurred from doing the current work. Although the total amount of work requested is the same under the two conditions, a person who narrowly brackets may nonetheless act differently, since the extra tasks may be perceived differently if done before or after the required work. The experiment will be based on a transcription task similar to the one used by Augenblick and Rabin (2015). The particular feature of this task is that learning will be minimized. In a separate experiment we will test subjects' perception of the task's toughness over time. Together with the test in effort choices, we will also link the experimental choices to the standard findings of narrow bracketing in risky choices, to see if the two correlate.
Secondary Outcomes (End Points)
  Before: We want to measure whether there is a correlation between subjects' level of narrow bracketing in deterministic work choices and narrow bracketing in risky choices; and whether there is more narrow bracketing when the metric for the extra work is different from the metric for the main work (that is, when it is expressed as a piece rate, $0.40 per task, rather than $4 for doing 10 tasks), compared to when the metric is the same. A further analysis will be done on an extra within-subjects treatment, where we use questions from two treatments. We will test whether people make the same mistake when they see both choices, controlling for an order effect.
  After:  See october-2019-design.pdf for the change in design. The following previous design description is included for completeness, but is *NOT* what we are currently planning on running. We want to measure whether there is a correlation between subjects' level of narrow bracketing in deterministic work choices and narrow bracketing in risky choices. We will test whether people make the same mistake when they see both choices, controlling for an order effect.
Secondary Outcomes (Explanation)
  Before: It may be that people bracket narrowly, but not if they see the broadly bracketed version first. Thus a person who is asked for their WTW for 40 tasks rather than 30, and then asked for their willingness to do 10 tasks before doing the 30, may realize that these questions are the same, and thus broadly bracket the second question. If asked first for their WTW for 10 tasks before 30, and then for their WTW for 40 rather than 30, their answer to the "10 before 30" question may be different because they did not realize that it is about doing 40 rather than 30 tasks. Thus we want to measure whether the same question leads to different answers depending on when people are asked it.
  One concern is that people may either use heuristics to make decisions faster ("This is 10 extra tasks, so I'll give the same answer as before") or want to be consistent with their past choices once they realize they are the same ("Oh, 40 vs 30 tasks is the same as my previous answer, I should give the same answer") rather than admit they might have gotten it wrong (Augenblick and Rabin (2018) do find that this effect is quite strong in their experiment, when subjects are reminded of their past choice). So this will not cleanly establish which choice people think is a mistake, but together with the between-subjects design it should shed light on it. Ignoring these other concerns (heuristics, desire for consistency), we will use these answers to create a measure of narrow bracketing at the individual level: the degree to which the BROAD answer differs from the NARROW answer, accounting for order effects.
  The reason for testing the correlation between individual-level narrow bracketing in our context and in risky choices (based on our within-subjects treatment) is straightforward: we want to see if there are people who are more likely to narrow bracket in different types of settings.
  After:  See october-2019-design.pdf for the change in design. The following previous design description is included for completeness, but is *NOT* what we are currently planning on running. It may be that people bracket narrowly, but not if they see the broadly bracketed version first. Thus a person who is asked for their WTW for 20 tasks rather than 10, and then asked for their willingness to do 10 tasks before doing the 10, may realize that these questions are the same, and thus broadly bracket the second question. If asked first for their WTW for 10 tasks before 10, and then for their WTW for 20 rather than 10, their answer to the "10 before 10" question may be different because they did not realize that it is about doing 20 rather than 10 tasks. Thus we want to measure whether the same question leads to different answers depending on when people are asked it.
  One concern is that people may either use heuristics to make decisions faster ("This is 10 extra tasks, so I'll give the same answer as before") or want to be consistent with their past choices once they realize they are the same ("Oh, 20 vs 10 tasks is the same as my previous answer, I should give the same answer") rather than admit they might have gotten it wrong (Augenblick and Rabin (2018) do find that this effect is quite strong in their experiment, when subjects are reminded of their past choice). So this will not cleanly establish which choice people think is a mistake, but together with the between-subjects design it should shed light on it. Ignoring these other concerns (heuristics, desire for consistency), we will use these answers to create a measure of narrow bracketing at the individual level: the degree to which the BROAD answer differs from the NARROW answer, accounting for order effects.
  The reason for testing the correlation between individual-level narrow bracketing in our context and in risky choices (based on our within-subjects treatment) is straightforward: we want to see if there are people who are more likely to narrow bracket in different types of settings.