Above and Beyond: Experimental Evidence on Gender Differences in Quality of Output
Last registered on November 12, 2020


Initial registration date
November 11, 2020
Last updated
November 12, 2020 8:17 AM EST
Primary Investigator
Gothenburg University
PI Affiliation
University of Exeter
Start date
This project will use an experimental task to study the quality of output and effective earnings, and whether there are gender differences.
Registration Citation
Ip, Edwin and Joe Vecci. 2020. "Above and Beyond: Experimental Evidence on Gender Differences in Quality of Output." AEA RCT Registry. November 12. https://doi.org/10.1257/rct.6731-1.0.
Primary Outcomes (end points)
In the initial treatments, we examine the following main outcomes:
• Quality (closeness to specific position)
• Time spent
We will also examine these outcomes by gender.

We will measure treatment differences in these two main outcomes, both overall and by gender.
As this experiment is carried out online and unsupervised, we may not be able to completely prevent activities that affect the interpretation of the results. We therefore establish the following exclusion criteria:

- Subjects who take longer than 24 seconds per slider on average without being almost perfect
- Subjects who reported technical issues during the task
- Subjects who failed attention checks during the study
- Subjects who fail Qualtrics' response quality criteria
- Subjects who take less time to complete the ability – specific task than the ability – range task (see Secondary Outcomes (explanation))
- Subjects who complete the experiment on small screens (e.g. smartphones and tablets)
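The exclusion criteria above can be sketched as a single filter. This is only an illustration under stated assumptions: the field names are hypothetical, and the registration does not define "almost perfect", so the `NEAR_PERFECT_MAX_ERROR` threshold is a placeholder rather than a registered value.

```python
from dataclasses import dataclass

SLOW_SECONDS_PER_SLIDER = 24.0  # threshold stated in the registration
NEAR_PERFECT_MAX_ERROR = 1.0    # hypothetical: "almost perfect" is not defined in the registration


@dataclass
class Subject:
    # All field names are hypothetical, chosen for this sketch.
    mean_seconds_per_slider: float
    mean_abs_error: float            # average distance from the target position
    reported_technical_issue: bool
    failed_attention_check: bool
    failed_qualtrics_quality: bool
    seconds_ability_specific: float  # time on the "ability – specific" task
    seconds_ability_range: float     # time on the "ability – range" task
    small_screen: bool


def is_excluded(s: Subject) -> bool:
    """Return True if the subject meets any of the listed exclusion criteria."""
    # Slow on average without being almost perfect.
    too_slow = (
        s.mean_seconds_per_slider > SLOW_SECONDS_PER_SLIDER
        and s.mean_abs_error > NEAR_PERFECT_MAX_ERROR
    )
    # Hitting a specific position should not be faster than hitting a range.
    implausible_ability = s.seconds_ability_specific < s.seconds_ability_range
    return (
        too_slow
        or s.reported_technical_issue
        or s.failed_attention_check
        or s.failed_qualtrics_quality
        or implausible_ability
        or s.small_screen
    )
```

Under this reading, a subject who is slow but almost perfect is retained, since long completion times are then consistent with careful work rather than inattention.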
Secondary Outcomes (end points)
We will examine the following controls and heterogeneity:
• Gender
• Age
• Multidimensional Perfectionism Scale – see below
• Experimenter Demand – see below
• Unincentivised questions on Social Image, Inequality Aversion (unincentivised dictator question), Risk Preference (see German Socio-Economic Panel, GSOEP), Altruism (see Falk et al., 2017), Trust (see GSOEP), Positive Reciprocity (see GSOEP) and Big 5 (Rammstedt et al., 2007). Where applicable, we follow standard practice (see Falk et al., 2017 or GSOEP) and combine the relevant questions to create an index.
• Enjoyment and Difficulty of slider task.
• Any technical issues experienced.
• Ability – see below.
• Wage if employed.
• Questions on behaviour at work.
Secondary Outcomes (explanation)
• Multidimensional Perfectionism Scale: 8 items will be taken from Frost, 1990.
• Experimenter Demand: to investigate experimenter demand, we will ask subjects what they think the experiment was about.
• Ability: we use two tasks to obtain three measurements. In the first task (“ability – specific”), we ask subjects to move 8 sliders to a specific position as quickly as they can. In the second task (“ability – range”), we ask subjects to move 8 sliders to within a range of numbers as quickly as they can. The faster they complete a task, the more likely they are to earn an extra payment.

The first measurement (from the first task) is how long it takes a subject to move sliders to a specific position (i.e. carrying out the slider task perfectly). The second measurement (from the second task) is how long it takes a subject to move sliders to a range without a specific target. The third measurement is a subject’s baseline tendency to move towards the middle of the range. We obtain this by measuring how close subjects get to the middle of the range in the second task, when there is no intrinsic motivation to do so. Note that the main experimental task and this second task are equivalent from the extrinsic motivation point of view. We will randomize the order in which these two tasks are performed for each subject.
Experimental Design
The task is a slider task similar to Gill and Prowse (2012), in which subjects are asked to move a series of sliders to a target position. The difference is that it is acceptable for subjects to move each slider to within 5 of the target position.
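As an illustration of how the quality outcome could be scored, the sketch below treats quality as the absolute distance between the final slider position and the target, and treats a slider as acceptable when that distance is at most 5. The function names and the use of mean absolute distance as a summary are assumptions for this example, not the registered scoring rule.

```python
TOLERANCE = 5  # acceptable distance from the target, as stated in the design


def slider_distance(position: float, target: float) -> float:
    """Absolute distance from the target; smaller means higher quality."""
    return abs(position - target)


def is_acceptable(position: float, target: float, tolerance: float = TOLERANCE) -> bool:
    """A slider counts as completed if it lands within the tolerance of the target."""
    return slider_distance(position, target) <= tolerance


def mean_distance(positions: list[float], targets: list[float]) -> float:
    """Average distance across sliders (assumed summary of a subject's quality)."""
    if len(positions) != len(targets) or not positions:
        raise ValueError("positions and targets must be non-empty and equal in length")
    total = sum(slider_distance(p, t) for p, t in zip(positions, targets))
    return total / len(positions)
```

Two subjects can both complete every slider acceptably yet differ in quality: one who always lands exactly on the target has a mean distance of 0, while one who always stops 5 away has a mean distance of 5.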
Experimental Design Details
We invite an equal number of males and females to take part in the 2 treatments in a between-subject design. In addition, all subjects (in both treatments) will take part in 2 tasks that measure their ability.

Comparing Treatments 1 and 2 will allow us to measure the effect (and gendered effect) of other-regarding preferences on the quality of output and time spent [between subjects]. Comparing equivalent metrics in the ability – range task and treatment 1 will allow us to measure the effect of intrinsic motivation on the quality of work [within subjects]. In addition, we compare outcomes across genders to determine whether there are any gender differences. We will also control for perfectionism as a factor determining the results.
Randomization Method
The experiment will be conducted using the Qualtrics software, which will randomize subjects into treatments so that there is an equal number of subjects in each treatment.
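The effect of balanced randomization can be sketched as follows. This is not Qualtrics' implementation; it is a minimal illustration, with hypothetical treatment labels `T1` and `T2`, of assigning subjects at random while keeping group sizes equal.

```python
import random


def balanced_assignment(subject_ids, treatments=("T1", "T2"), seed=None):
    """Randomly assign subjects to treatments with (near-)equal group sizes.

    Subjects are shuffled and then dealt to treatments in turn, so group
    sizes differ by at most one.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    return {sid: treatments[i % len(treatments)] for i, sid in enumerate(ids)}
```

Applying the function separately to the male and female samples would give the same number of each gender in every treatment, which matches the planned sample of 80 males and 80 females per treatment.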
Randomization Unit
Randomization will be at the individual level.
Was the treatment clustered?
We plan to sample an equal number of males and females. We will collect 80 observations for both males and females in each of the two treatments. The sample will consist of subjects based in the United States.
We expect an MDE of 0.314.
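The registration does not state the assumptions behind this figure. One reading that comes close is a standardized effect size for a two-sided test at the 5% level with 80% power, comparing the two treatments with 160 subjects each (80 males plus 80 females). The sketch below uses that reading with a normal approximation; the significance level, power and pooling across genders are all assumptions.

```python
from math import sqrt
from statistics import NormalDist


def mde_two_sample(n_per_arm: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Minimum detectable effect in standard-deviation units for a
    two-sided, two-sample comparison of means with equal group sizes
    (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) * sqrt(2 / n_per_arm)


# Assumed reading: 80 males + 80 females = 160 subjects per treatment.
mde_pooled = mde_two_sample(160)
print(round(mde_pooled, 3))
```

Under these assumptions the normal approximation gives roughly 0.313, close to the registered 0.314; a calculation based on the t distribution would be marginally larger.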
University of Exeter
