Sample size: planned number of observations
We have email addresses for over 10,000 GitHub users who enrolled in the GitHub Sponsorship Program, and for another 34,000 people who were active GitHub users but did not enroll in the program. These email addresses were extracted from publicly available GitHub data. In total, we intend to share our survey with 44,238 individuals.
However, not all of the 44,238 individuals are expected to respond. To estimate our expected number of responses, we randomly sampled 1,000 email addresses from our dataset and emailed our survey to these individuals. After a week, we had 17 responses in total. We then sent a final reminder, which increased our total number of responses to 36.
Since all our questions were optional, we based our estimated response sizes on the respondents who answered the primary outcome questions. Between 16 and 32 people answered our primary outcome questions. Based on this range, we created two estimates of our expected number of observations, each scaling the pilot response rate to the 43,238 individuals who were not part of the pilot sample: [a] an optimistic estimate, and [b] a pessimistic estimate.
[a] Optimistic Estimate = (32/1000)*(44238-1000) ≈ 1384 individuals
[b] Pessimistic Estimate = (16/1000)*(44238-1000) ≈ 692 individuals
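The arithmetic behind the two estimates can be reproduced with a short script. This is only an illustrative sketch: the function and variable names are hypothetical, and rounding to the nearest whole individual is assumed because the exact products (1383.6 and 691.8) are not integers.

```python
# Illustrative sketch of the expected-response estimates.
# All names here are hypothetical; only the numbers come from the plan.

POPULATION = 44_238   # all individuals with an available email address
PILOT_SIZE = 1_000    # individuals already contacted in the pilot
REMAINING = POPULATION - PILOT_SIZE  # individuals still to be contacted


def expected_responses(pilot_responses: int,
                       pilot_size: int = PILOT_SIZE,
                       pool: int = REMAINING) -> int:
    """Scale the pilot response rate to the remaining pool.

    Assumption: the result is rounded to the nearest whole individual.
    """
    return round(pilot_responses / pilot_size * pool)


optimistic = expected_responses(32)   # upper end of the pilot range
pessimistic = expected_responses(16)  # lower end of the pilot range

print(f"Remaining individuals: {REMAINING}")
print(f"Optimistic estimate:   {optimistic}")
print(f"Pessimistic estimate:  {pessimistic}")
```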
We plan to continue the approach of emailing individuals at 9 AM local time on a weekday and, one week after the initial email, sending a final reminder email to those who have not yet responded. Because the set is large, we plan to randomly split the 43,238 users we will contact (the 44,238 individuals minus the 1,000 already contacted in the pilot) into 4 batches. We will then email these users on the following staggered schedule:
[Thursday; First week] - Initial email sent out to first batch of 10,809 people
[Thursday; Second week] - Initial email sent out to second batch of 10,810 people; Reminder email sent to the first batch
[Thursday; Third week] - Initial email sent out to third batch of 10,809 people; Reminder email sent to the second batch
[Thursday; Fourth week] - Initial email sent out to fourth batch of 10,810 people; Reminder email sent to the third batch
[Thursday; Fifth week] - Reminder email sent to the fourth batch
[Thursday; Sixth week] - We will use the responses collected as of this date for our analysis
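The random split and the staggered schedule above could be implemented along the following lines. This is a minimal sketch under stated assumptions, not the procedure actually used: users are represented by placeholder integer IDs, the random seed is arbitrary, and the function names are hypothetical. The batch sizes are taken directly from the schedule.

```python
import random
from itertools import accumulate

# Batch sizes as listed in the schedule; they sum to 43,238.
BATCH_SIZES = [10_809, 10_810, 10_809, 10_810]


def split_into_batches(user_ids, sizes=BATCH_SIZES, seed=0):
    """Randomly assign every user to exactly one batch.

    Assumption: `seed` is an arbitrary value fixed only for reproducibility.
    """
    shuffled = list(user_ids)
    if sum(sizes) != len(shuffled):
        raise ValueError("batch sizes must sum to the number of users")
    random.Random(seed).shuffle(shuffled)
    bounds = [0, *accumulate(sizes)]  # cumulative slice boundaries
    return [shuffled[lo:hi] for lo, hi in zip(bounds, bounds[1:])]


def weekly_schedule(n_batches=4):
    """Map each week number to the actions planned for that Thursday."""
    schedule = {}
    for week in range(1, n_batches + 2):
        actions = []
        if week <= n_batches:
            actions.append(f"initial email to batch {week}")
        if week >= 2:
            actions.append(f"reminder email to batch {week - 1}")
        schedule[week] = actions
    # One week after the last reminder, responses are taken for analysis.
    schedule[n_batches + 2] = ["use responses collected so far for analysis"]
    return schedule


# Placeholder IDs stand in for the real email addresses.
batches = split_into_batches(range(43_238))
schedule = weekly_schedule()

for week, actions in schedule.items():
    print(f"Week {week}: " + "; ".join(actions))
```

Fixing the seed makes the batch assignment reproducible, which may be useful if the split needs to be audited or rerun later.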