The experiment consisted of five parts: recruitment and pre-screen, baseline, app-treatment survey, midline, and endline. It was administered through four surveys on the Qualtrics platform over a period of six weeks, from January 20th to March 3rd, 2019. The study period was intentionally timed to avoid any major holidays that might affect smartphone and social media usage.
I recruited participants through Facebook ads over the course of the first day of the experiment. The ad was targeted at users in the United States aged 18-34, the age range of the majority of Facebook users (eMarketer, 2018). To achieve a gender-balanced initial sample, I additionally targeted the ads by demographic cells, allocating triple the recruitment budget to males because females are roughly three to four times more likely to click through on ads. During the campaign, 223,104 people were shown the ad and 3,617 people clicked on it, a click-through rate of approximately 1.6%. The ad consisted of a generic picture of Stanford University and was captioned: ``Participate in a Stanford online research study and earn $20! Contribute to economic research at Stanford by participating in an online study on social media use." It made no mention of ``Screen Time" or self-control issues, in order to minimise priming and sample selection bias.
Upon clicking on the ad, participants entered a background demographic pre-screen hosted on stanforduniversity.qualtrics.com. I screened participants for: (i) active usage of the Facebook platform on their mobile devices (more than 50% of Facebook usage spent on mobile, as well as more than 50% of that time spent within the Facebook app as opposed to web-based browsers); (ii) usage of iOS devices, as iOS includes the ``Screen Time" feature, which I rely on to collect phone usage data; and (iii) age (above 18). A consent form was also shown as part of the pre-screen. In total, 629 participants consented to the study, passed the pre-screen, and completed the entire baseline survey.
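The recruitment funnel implied by the figures above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the second ratio assumes that every baseline participant arrived through an ad click, which the text does not state explicitly.

```python
# Figures reported in the text.
ad_impressions = 223_104     # people shown the ad
ad_clicks = 3_617            # people who clicked on it
baseline_completes = 629     # consented, passed the pre-screen, completed baseline

# Share of people shown the ad who clicked on it.
click_through_rate = ad_clicks / ad_impressions

# Assumption: every baseline participant arrived through an ad click.
completion_rate = baseline_completes / ad_clicks

print(f"Click-through rate: {click_through_rate:.1%}")
print(f"Clicks ending in a completed baseline: {completion_rate:.1%}")
```

The first ratio reproduces the click-through rate of approximately 1.6% reported for the campaign.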
After the pre-screen, qualified participants were immediately directed to the baseline survey. Contact information, additional demographics, and a range of outcome variables were recorded. During the baseline survey, I asked respondents to estimate the time they and their peers spent on their phones (Facebook/Instagram), before asking them to open the ``Screen Time" feature to access their time usage data. As in every survey, participants were then asked to upload a screenshot of their phone, Facebook, and Instagram time usage for the last 7 days, if available. Of the 629 participants, 81% already had ``Screen Time" enabled on their phones, and the rest had yet to enable it. Since ``Screen Time" is only available on iOS 12.0 or later, I asked participants with earlier versions of the system to update their phones so that they could provide ``Screen Time" data the following week. I also prompted participants to enable ``Screen Time" on their phones so as to facilitate data collection in subsequent surveys.
All baseline participants then answered questions about their ideal usage and predicted usage for the following week. Finally, respondents were asked to answer a series of questions traditionally used to measure ``self-control": the 13 questions of Tangney's Brief Self-Control Scale were displayed, as well as three questions aimed at eliciting participants' time-discounting parameters. The latter questions are hypothetical (e.g. ``If we paid you in one month, what's the lowest amount that you would be willing to accept, instead of receiving $\$20$ today?"). I considered implementing a second-price auction or another technique to elicit preferences in an incentivised manner; however, asking questions hypothetically is sufficient for my purposes, as it tends to be less confusing than other techniques and avoids the added cost of having to pay participants.
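To illustrate how an answer to the example question above could be turned into a time-discounting parameter, the following sketch computes an implied one-month discount factor. It assumes linear utility and a single-period discount factor, so that indifference between 20 dollars today and a stated minimum amount in one month gives a factor of 20 divided by that amount. The function name and the example response are hypothetical, and this is not necessarily the estimation approach used in the study.

```python
def implied_monthly_discount_factor(amount_today: float,
                                    min_amount_later: float) -> float:
    """Return delta such that amount_today = delta * min_amount_later.

    Assumes linear utility and a single one-month discount factor.
    """
    if amount_today <= 0 or min_amount_later <= 0:
        raise ValueError("amounts must be positive")
    return amount_today / min_amount_later


# Hypothetical response: the participant needs at least 25 dollars in one
# month to give up 20 dollars today.
delta = implied_monthly_discount_factor(20.0, 25.0)
print(f"Implied one-month discount factor: {delta:.2f}")
```

A factor below one indicates that the respondent discounts the delayed payment; a respondent who states exactly 20 dollars would have a factor of one under these assumptions.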
629 participants eventually completed the baseline. Of those, 52 responses on time usage were either manually overridden or omitted from the experimental data on ``Screen Time" usage because of invalid data (for example, a series of `0's for time usage due to recent activation of Screen Time or a system glitch) or inaccurate responses (for example, discrepancy between ``Screen Time" screenshots and reported survey data). Where possible, I overrode inaccurate survey responses with accurate ``Screen Time" data harvested from participants' screenshots.
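The cleaning rule described above can be sketched as a small function. This is an illustration under stated assumptions rather than the procedure actually used: the function name, the status labels, and the tolerance parameter are hypothetical, a screenshot is represented as a list of daily minutes, and a self-report with no screenshot is kept unverified.

```python
from typing import Optional, Sequence, Tuple


def clean_weekly_usage(
    reported_total: float,
    screenshot_daily: Optional[Sequence[float]],
    tolerance: float = 1.0,  # hypothetical allowed gap, in minutes
) -> Tuple[Optional[float], str]:
    """Return (value to keep, status) for one participant's weekly usage."""
    # No screenshot uploaded: the self-report cannot be checked.
    if screenshot_daily is None:
        return reported_total, "kept_unverified"
    # A series of zeros signals invalid data (e.g. Screen Time just enabled).
    if all(minutes == 0 for minutes in screenshot_daily):
        return None, "omitted_invalid"
    screenshot_total = sum(screenshot_daily)
    # Discrepancy between survey answer and screenshot: trust the screenshot.
    if abs(reported_total - screenshot_total) > tolerance:
        return screenshot_total, "overridden"
    return reported_total, "kept"


print(clean_weekly_usage(300.0, [0] * 7))
print(clean_weekly_usage(300.0, [60] * 7))
print(clean_weekly_usage(420.0, [60] * 7))
```

Returning a status alongside the value makes it straightforward to count how many responses were overridden and how many were omitted.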
I then assigned participants to either the control or the app-intervention group for the second survey, which was administered via email on January 27th, exactly one week after the baseline. Again, the survey was hosted on the Qualtrics platform and each individual received a unique link to the survey. Assignment was stratified: I sorted participants into control and treatment groups within 16 strata defined by age, gender, education, and active Instagram usage. To minimise differential attrition between the treatment and control arms, control participants were still asked about their time usage data and other related questions to maintain similar survey length.
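A stratified assignment of this kind can be sketched as follows. The sketch rests on assumptions the text does not spell out: each of the four variables is treated as a binary indicator (which yields 16 strata), assignment within each stratum is random and as balanced as possible, and the field names, cut-offs, and seed are purely illustrative.

```python
import random
from collections import defaultdict
from itertools import product


def stratified_assignment(participants, stratum_keys, seed=0):
    """Randomly split participants into two arms within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in participants:
        stratum = tuple(person[key] for key in stratum_keys)
        strata[stratum].append(person["id"])

    assignment = {}
    for stratum in sorted(strata):
        ids = strata[stratum]
        rng.shuffle(ids)
        # Randomise which arm gets the extra person in odd-sized strata.
        offset = rng.randrange(2)
        for position, person_id in enumerate(ids):
            arm = "treatment" if (position + offset) % 2 == 0 else "control"
            assignment[person_id] = arm
    return assignment


# Hypothetical binary indicators standing in for the four stratifiers.
keys = ["older", "female", "college", "instagram_active"]
participants = [
    {"id": f"{n}-{rep}", **dict(zip(keys, combo))}
    for n, combo in enumerate(product([False, True], repeat=4))
    for rep in range(4)
]
arms = stratified_assignment(participants, keys)
print(sum(arm == "treatment" for arm in arms.values()), "of", len(arms))
```

Alternating arms over a shuffled list keeps the two groups within one participant of each other in every stratum, which is the usual motivation for stratifying on covariates expected to predict the outcome.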
As with the previous survey, screenshotted time usage data was collected for the previous 7 days, and all respondents again answered questions about their ideal usage and predicted usage for the following week. For clarity, I call actual data collected from screenshots in the second survey ``Week 1 Actual" data. Note that questions about ideal and predicted usage answered in the second survey correspond to anticipations for the following week; thus, for expositional purposes, I refer to them loosely as ``Week 2 Ideal" and ``Week 2 Predicted" data.
I then nudged participants in the app-treatment group who did not already have app limits on their phones to adopt time limits equivalent to (or lower than) the ideal times they had previously specified for their phone, Facebook, and Instagram. I informed them that iOS 12's ``Screen Time" had a feature called ``App Limits" that allowed them to set time limits for apps.
Seventeen participants reported that they had already adopted app limits on their phones, whilst 83 participants declined the suggestion to adopt app limits. Setting a time limit on the iOS system is not binding: the option exists to ``Ignore Limit" for the next 15 minutes, or for the rest of the day. Respondents were made aware of this fact when they were asked to adopt time limits, and it was made clear that they did not have to agree to the adoption of limits in order to continue with the study. Additionally, they were asked to provide screenshots of the app limits they had set, in order to hold them accountable to the limits. I double-checked a subsample of these screenshots and found that the majority of individuals set their app limits to be equivalent to the ideal times they had previously specified for their phone, Facebook, and Instagram.
In the third survey, administered the following week on February 3rd, participants were further stratified by demographic characteristics into the privacy intervention groups. Basic screenshots of ``Screen Time" data for the previous 7 days (``Week 2 Actual") were collected as in previous surveys, and participants who had been allocated to the app treatment were asked follow-up questions about whether they had ignored or switched off any of their app limits in the past week. Participants allocated to the privacy treatment were first asked about their current perception of the privacy settings of their own Facebook account, through questions that can be easily mapped onto specific Facebook privacy settings. Secondly, I elicited their self-reported preferred settings by posing ``simulated actual" questions whose phrasing resembles the actual privacy options on the Facebook platform. I phrased the questions in this manner in order to come as close to an ``active choice" as possible: as participants inevitably already had an incumbent privacy setting, it was not possible to elicit a real ``active choice" independent of status quo bias. As such, asking for self-reported preferences in a simulated manner before participants entered their real Facebook interface was the most practical way to elicit privacy preferences while limiting the distortions of status quo bias. Thirdly, respondents were asked to log into their Facebook account and report their actual privacy and ad settings. Again, they were asked to screenshot their settings to ensure truthful reporting. Lastly, to identify any preference reversal owing to attention bias or other factors, participants were asked for their self-reported preferred settings again after having reported their actual settings, in exactly the same manner as before. They were asked whether their preferences over their privacy settings had changed and, if so, to describe qualitatively how they had changed.
In order to measure the longer-term impact of the app treatment on participants, the endline was administered four weeks after the third survey, on March 3rd. As in the third survey, basic ``Screen Time" screenshots and time usage data were collected, and participants who had been allocated to the app treatment were asked follow-up questions about whether they had ignored or switched off the app limits in the past week. Participants who had not previously been allocated to the privacy treatment were then asked questions about their privacy settings identical to those the privacy treatment group had received in the previous survey. Respondents in the privacy treatment group were asked whether they had changed their privacy settings since the last survey. Additionally, all participants were asked if they (or their friends) had experienced any privacy violations on social media in the past.