Time Pressure on Tests and Gender Gaps in STEM

Last registered on September 15, 2023

Pre-Trial

Trial Information

General Information

Title
Time Pressure on Tests and Gender Gaps in STEM
RCT ID
AEARCTR-0012104
Initial registration date
September 12, 2023

First published
September 15, 2023, 9:00 AM EDT

Locations

There is information in this trial unavailable to the public.

Primary Investigator

Affiliation
Bentley University

Other Primary Investigator(s)

PI Affiliation
UC San Diego
PI Affiliation
UC San Diego
PI Affiliation
University of Chicago

Additional Trial Information

Status
Ongoing
Start date
2023-09-04
End date
2024-05-31
Secondary IDs
Prior work
This trial is based on or builds upon one or more prior RCTs.
Abstract
This study will conduct an experiment investigating whether time pressure on important standardized tests used in admissions decisions (the SAT math test and the GRE Quantitative Reasoning test) has differential impacts by gender. If so, tight time constraints may serve as an artificial barrier that prevents qualified women from advancing in STEM careers.

Student subjects take exams under one of two time conditions: the normal time allowed on the exam, or 50 percent extended time. We then estimate whether easing time pressure has a bigger impact on the test scores of women than on those of men, and examine how easing time pressure affects how men and women rank on the test, particularly in the upper part of the distribution.
External Link(s)

Registration Citation

Citation
Gneezy, Uri et al. 2023. "Time Pressure on Tests and Gender Gaps in STEM." AEA RCT Registry. September 15. https://doi.org/10.1257/rct.12104-1.0
Experimental Details

Interventions

Intervention(s)
All subjects take a full section of a GRE Quantitative Reasoning (GRE-Q) test. The test consists of 25 questions. There are two treatment conditions:

1. The test is taken with the normal time allowed for a similar section on the actual GRE-Q, which is 40 minutes.
2. The test is taken with the extended time granted to students needing accommodations for a similar section on the actual GRE-Q, which is 60 minutes.
Intervention Start Date
2023-09-12
Intervention End Date
2024-05-31

Primary Outcomes

Primary Outcomes (end points)
raw and standardized (within sample) scores on the test
score ranking on the test (within sample)
Primary Outcomes (explanation)
The standardized score on the test is constructed by subtracting the within-sample mean score and dividing by the within-sample standard deviation.

The ranking is each student's within-sample rank on the test. This is calculated separately for observations generated under each time condition.
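The two outcome constructions above can be illustrated with a minimal sketch. It is not the investigators' code: the record fields (`id`, `condition`, `score`), the use of the sample standard deviation, standardizing over the pooled sample, and the tie rule (tied students share the best rank) are all assumptions made for illustration.

```python
from statistics import fmean, stdev


def standardize(scores):
    """Subtract the within-sample mean and divide by the within-sample SD.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    mean = fmean(scores)
    sd = stdev(scores)
    return [(s - mean) / sd for s in scores]


def rank_within_condition(records):
    """Rank students separately within each time condition.

    `records` is a list of dicts with hypothetical keys 'id', 'condition',
    and 'score'. Rank 1 is the highest score; tied students share the best
    rank (an assumed tie rule, not one stated in the registration).
    """
    by_condition = {}
    for r in records:
        by_condition.setdefault(r["condition"], []).append(r)

    ranks = {}
    for group in by_condition.values():
        scores = [g["score"] for g in group]
        for g in group:
            # One plus the number of strictly higher scores in the same condition.
            ranks[g["id"]] = 1 + sum(s > g["score"] for s in scores)
    return ranks
```

For example, calling `rank_within_condition` on records from both conditions returns one rank per student, where each rank is computed only against students who took the test under the same time limit.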

Secondary Outcomes

Secondary Outcomes (end points)
Students are given a survey after taking the test that asks several questions about their motivation to do well on the test, the pressure they felt when taking the test, and whether they felt they had enough time to complete the test. Each question is answered on a 5-point Likert scale. We will explore how treatment (easing time pressure) affects each of these variables, and will control for them in some specifications of regressions with the primary outcomes listed above as the dependent variables.
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
The design is between-subject. Students are randomized into one of the two treatment conditions described above by first blocking on gender, then randomizing into the treatment groups, and checking the following:
1. how the score on an 8-question GRE-Q pretest is balanced across the two treatments
2. how the score on this pretest is balanced across the two treatments for women in the sample
3. how the score on this pretest is balanced across the two treatments for men in the sample

We then rerandomize until the p-value of each of these differences is greater than 0.9.

Students are recruited by inviting STEM major class sections to participate. The experiment is conducted during a class session in order to maximize attendance.

The exam is administered in Qualtrics. Students are emailed the link to the version of the test that has the time allotted for the treatment condition to which they have been assigned. They take it on their own laptops in their normal classroom. Each student is provided with a calculator and scratch paper for use in the exam. The Qualtrics instrument automatically ends the test when the allotted time expires. A timer showing the amount of time remaining is visible in the upper left corner of the student's screen throughout the test.

Students are paid electronically when possible via services such as Venmo and Zelle. Students who have no electronic payment method available and cannot create an account are paid in cash. Before the test, students enter their account information. They are paid immediately after the conclusion of their session.

After the test, students are given a short survey that asks questions about the level of motivation and pressure they felt while taking the test.
Experimental Design Details
Not available
Randomization Method
Students are randomized into one of the two treatment conditions described above by first blocking on gender, then randomizing into the treatment groups, and checking the following:
1. how the score on an 8-question GRE-Q pretest is balanced across the two treatments
2. how the score on this pretest is balanced across the two treatments for women in the sample
3. how the score on this pretest is balanced across the two treatments for men in the sample

We then rerandomize until the p-value of each of these differences is greater than 0.9.
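The blocked rerandomization procedure described above could be sketched as follows. This is only an illustration under stated assumptions: the registration does not name the balance test, so a two-sample difference-in-means test with a normal approximation is assumed here; the field names (`id`, `gender`, `pretest`), the gender codes `"F"` and `"M"`, the even split within each block, and the `max_draws` safeguard are likewise hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, variance


def balance_p_value(a, b):
    """Two-sided p-value for a difference in means (assumed test:
    unequal-variance z-test using a normal approximation)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0  # no variation at all, so the groups are trivially balanced
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def assign_once(students, rng):
    """Block on gender, then split each block at random between conditions."""
    assignment = {}
    for gender in sorted({s["gender"] for s in students}):
        block = [s["id"] for s in students if s["gender"] == gender]
        rng.shuffle(block)
        half = len(block) // 2
        for i, sid in enumerate(block):
            assignment[sid] = "extended" if i < half else "normal"
    return assignment


def balance_checks(students, assignment):
    """P-values for pretest balance: full sample, women only, men only."""
    def p_for(subset):
        ext = [s["pretest"] for s in subset if assignment[s["id"]] == "extended"]
        nor = [s["pretest"] for s in subset if assignment[s["id"]] == "normal"]
        return balance_p_value(ext, nor)

    women = [s for s in students if s["gender"] == "F"]
    men = [s for s in students if s["gender"] == "M"]
    return [p_for(students), p_for(women), p_for(men)]


def rerandomize(students, threshold=0.9, max_draws=100_000, seed=0):
    """Redraw the blocked assignment until all three p-values exceed threshold."""
    rng = random.Random(seed)
    for _ in range(max_draws):
        assignment = assign_once(students, rng)
        if min(balance_checks(students, assignment)) > threshold:
            return assignment
    raise RuntimeError("no assignment met the balance threshold")
```

Because all three p-values must exceed 0.9 at once, most draws are rejected, which is why the sketch caps the number of attempts rather than looping indefinitely.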
Randomization Unit
Students are randomized into treatment conditions at the individual level.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
n/a
Sample size: planned number of observations
128 in the initial wave for experiment 3 described in the attached preanalysis plan. We hope to generate 500 total observations.
Sample size (or number of clusters) by treatment arms
250 per condition
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
This sample size allows us to detect a gender difference in the effect of easing time pressure on raw score (out of 25) of approximately 2 points with roughly 80 percent power. This is calculated by simulation for the interaction effect regression discussed in the preanalysis plan, assuming a sample that is 50 percent men and 50 percent women, a mean baseline raw score of 13, and a standard deviation of 4.
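A simulation of this kind might look like the sketch below. It is not the investigators' simulation, and the preanalysis plan's regression is not public, so several things are assumed: a saturated gender-by-treatment model (in which the OLS interaction coefficient equals the difference-in-differences of the four cell means), four equal cells, normally distributed scores that are not rounded or bounded to the 0 to 25 range, no main effect of extended time for men, homoskedastic standard errors, and a two-sided 5 percent test using the normal critical value 1.96.

```python
import random
import statistics
from math import sqrt


def interaction_t_stat(cells):
    """t-statistic for the gender-by-treatment interaction in a saturated 2x2 OLS.

    `cells` maps (female, extended) -> list of scores. In a saturated model the
    interaction coefficient is the difference-in-differences of cell means.
    """
    means = {k: statistics.fmean(v) for k, v in cells.items()}
    did = (means[(1, 1)] - means[(1, 0)]) - (means[(0, 1)] - means[(0, 0)])
    # Pooled residual variance (homoskedastic OLS) with n - 4 degrees of freedom.
    n = sum(len(v) for v in cells.values())
    ssr = sum((x - means[k]) ** 2 for k, v in cells.items() for x in v)
    s2 = ssr / (n - 4)
    se = sqrt(s2 * sum(1 / len(v) for v in cells.values()))
    return did / se


def simulate_power(n_total=500, base_mean=13.0, sd=4.0, interaction=2.0,
                   main_effect=0.0, n_sims=1000, seed=0):
    """Share of simulated experiments in which the interaction is significant."""
    rng = random.Random(seed)
    per_cell = n_total // 4  # assumed: equal gender shares and equal arms
    rejections = 0
    for _ in range(n_sims):
        cells = {}
        for female in (0, 1):
            for extended in (0, 1):
                mu = base_mean + extended * (main_effect + female * interaction)
                cells[(female, extended)] = [rng.gauss(mu, sd) for _ in range(per_cell)]
        if abs(interaction_t_stat(cells)) > 1.96:
            rejections += 1
    return rejections / n_sims
```

Calling `simulate_power()` with the defaults uses the figures stated above (500 observations, baseline mean 13, standard deviation 4, interaction of 2 points), and its result can be compared against the roughly 80 percent power reported in the registration. Setting `interaction=0.0` gives the false-positive rate of the same test.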
IRB

Institutional Review Boards (IRBs)

IRB Name
Bentley University
IRB Approval Date
2022-05-06
IRB Approval Number
5062262
Analysis Plan

There is information in this trial unavailable to the public.