Experimental Design Details
Experiments on self-stereotyping usually compare the outcome of interest across settings that vary in their gender-stereotype. For example, Kamas and Preston (2012) compare competitive behavior of women and men in verbal (female-typed) versus math (male-typed) tasks. Coffman (2014) runs a lab experiment in which participants take part in quizzes across various domains that differ in their gender-stereotype. Hence, in these studies the variation in the gender-stereotype is exogenously induced by changing the task itself. While this is clearly effective in changing the task stereotype, it can cause omitted variable bias if the change in the task not only shifts the stereotype but also other task characteristics that are correlated with both the stereotype and the outcome of interest.
To avoid this problem of omitted variables in my experiment, I change the task stereotype while holding the task itself fixed. The main idea for this experiment is based on an unpublished manuscript by Barron et al. (2023), but the concrete experimental design is adjusted compared to their implementation. Consider a task in which participants have to logically complete a pattern. While the task itself remains unchanged, I present the task as a male-typed “Analytical Task” in one treatment condition and as a female-typed “Creative Patterns Task” in another treatment condition. To do so, I not only vary the task name but also the color and font in which the task name is written as well as a small task logo that is displayed next to the task name. I also vary the colors in which the answer options are framed. Both framings seem credible given the nature of the task as a test of logical thinking on the one hand, and the involvement of shapes and patterns on the other. Hence, while the task is exactly the same in both versions, the gender-stereotype assigned to the task changes across framings. I have pre-tested these framings to make sure they successfully shift task perception. They do: Survey participants think that most people evaluate the task as more feminine (masculine), believe that women (men) perform better in the task and believe that women (men) enjoy working on the task more if the task is female-framed (male-framed).
In the experiment, the treatment consists of the random assignment of experimental subjects to either the male or the female task framing. Before the main part of the experiment starts, I elicit consent as well as the gender and age of the participants. My study focuses on cisgender individuals aged 18 to 25. I will now explain the course of the experiment in more detail.
In part 1 of the experiment, I measure the degree to which participants consider their assigned (framed) task as stereotypically male or female. I elicit participants’ second-order beliefs about the task along three dimensions: whether the task is masculine vs. feminine, whether men or women perform better in the task, and whether men or women enjoy working on the task more. Note that each study subject evaluates only the task presented with the framing they are randomly assigned to. To elicit these evaluations, I use the method of Krupka and Weber (2013): I measure respondents’ second-order beliefs (i.e., their beliefs about others’ evaluations), which allows me to incentivize answers. Eliciting task perception is important to establish that the different framings actually shift perceived task gender-stereotypes. I have pre-tested these framings to make sure they do.
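The incentive logic behind eliciting second-order beliefs can be illustrated with a minimal sketch. The text does not spell out the exact payment rule, so the sketch assumes a coordination-style rule in which a respondent earns a bonus if their answer matches the most common answer given by other respondents. The function names, the answer labels, the bonus amount, and the treatment of ties are all hypothetical and serve illustration only.

```python
from collections import Counter


def modal_answers(answers):
    """Return the set of most frequent answers (more than one if tied)."""
    counts = Counter(answers)
    top = max(counts.values())
    return {answer for answer, n in counts.items() if n == top}


def coordination_bonus(own_answer, others_answers, bonus):
    """Pay `bonus` if own_answer matches a modal answer of the others.

    Assumption: ties count as a match for every tied answer; the
    original design may resolve ties differently.
    """
    if not others_answers:
        return 0.0
    return bonus if own_answer in modal_answers(others_answers) else 0.0


# Hypothetical answers on the "masculine vs. feminine" dimension.
others = ["rather masculine", "rather masculine", "neutral", "rather feminine"]
payment = coordination_bonus("rather masculine", others, bonus=0.50)
```

Under such a rule, a respondent maximizes expected earnings by reporting what they believe most others will answer, which is why beliefs about others’ evaluations can be incentivized even though the evaluations themselves have no objectively correct answer.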
In the next part, experimental subjects are asked to work on the assigned task for one round (a round consists of 8 different matrix problems; I refer to the first round as the baseline work round). Task completion is incentivized by performance-based pay. After finishing the baseline work round, subjects are asked how much they enjoyed working on the task, how well they think they have performed and how confident they feel about their answers in the baseline round of the matrix task. Participants then receive perfect feedback on their baseline performance: They learn exactly how many matrix problems they have solved correctly in the baseline work round. I make sure participants pay attention to the feedback provided by asking them to report back the number of matrix problems they have solved correctly at baseline. After the provision of feedback, I measure individuals’ expected task enjoyment if they were to work on the task a second time. I also elicit participants’ beliefs about their future performance in the task. Given the perfect feedback on baseline task performance, subjects can form well-founded beliefs about their ability in the task. I incentivize belief-elicitation by providing a potential bonus payment if beliefs match future task performance.
In the third part of the study, participants choose the task they will work on in a second work round. Subjects can choose between the matrix task they have already performed in the baseline work round (with problems that are different but similar in style and difficulty to the baseline round) and a neutral outside option task. The outside option task consists of clicking on a specific letter out of a set of letters shown on the screen. If chosen, subjects have to perform this task for a fixed amount of time and receive a fixed payment for it. I use a multiple price list (MPL) in which subjects have to decide across decision rows which of the two tasks they want to complete in work round 2 to earn additional payment. Across the rows of the MPL, the piece-rate earned for each correctly solved matrix problem is held constant at the same rate as in the baseline round while the fixed payment earned for performing the outside option task varies. One row of the MPL is randomly implemented to determine the task a subject works on in work round 2 to incentivize truthful answers. I also elicit motives driving the MPL choices via an open text field question.
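The mechanics of the multiple price list can be sketched in a few lines. This is an illustrative sketch only: the piece rate, the fixed payments, the number of rows, and all function and class names are hypothetical placeholders and are not taken from the experiment.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MplRow:
    piece_rate: float     # pay per correct matrix problem (constant across rows)
    fixed_payment: float  # pay for the outside option task (varies across rows)


def build_mpl(piece_rate, fixed_payments):
    """One decision row per fixed payment; the piece rate never changes."""
    return [MplRow(piece_rate, f) for f in fixed_payments]


def switching_row(choices):
    """Index of the first row in which the outside option is chosen, else None."""
    for i, choice in enumerate(choices):
        if choice == "outside":
            return i
    return None


def implement_random_row(choices, rng):
    """Draw one row at random; the choice made in that row is carried out."""
    idx = rng.randrange(len(choices))
    return idx, choices[idx]


def round2_earnings(row, task, n_correct):
    """Earnings in work round 2 given the implemented row and task."""
    if task == "matrix":
        return row.piece_rate * n_correct
    return row.fixed_payment


# Hypothetical list: piece rate 0.25, fixed payments rising from 0.50 to 2.00.
rows = build_mpl(0.25, [0.50, 1.00, 1.50, 2.00])
choices = ["matrix", "matrix", "outside", "outside"]
idx, task = implement_random_row(choices, random.Random(0))
earnings = round2_earnings(rows[idx], task, n_correct=5)
```

Because any row may be the one that counts, a subject has no reason to misreport in any single row. The row at which a subject switches from the matrix task to the outside option then indicates the fixed payment at which the outside option becomes more attractive than the matrix task.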
In the final part of the experiment, I measure character traits (risk aversion, self-confidence, social desirability, importance of being popular, the degree of fitting in, challenge seeking, a continuous measure of gender and the Big Five) and demographic information (education, field of study/occupational industry and the number of siblings) for heterogeneity analyses. I also measure gender norms by eliciting participants’ level of agreement with a statement on equally sharing household and market work between a man and a woman within a household. Lastly, I elicit subjects’ perceptions of how challenging the matrix task is and ask them what they believe the task is measuring.