Primary Outcomes (explanation)
Our primary outcomes of interest include three groups of metrics:
1. Preference for personalized advertising: binary preference by platform, combined WTA for personalized advertising across all platforms, and an out-of-range WTA indicator (equal to 1 if combined WTA > 25). In our design, both the binary preference by platform and the combined WTA across all platforms are incentive-compatible, whereas the platform-specific WTAs are stated-preference measures. We include the out-of-range WTA indicator because of the concern that WTAs outside the BDM price range are not incentive-compatible. For the WTA, we will report both the raw level and a version winsorized at 25 for robustness.
2. Subjective welfare measures: binary ad-relevance and ad-distraction ratings from the pop-up questions, divided by the number of ads seen.
3. Objective welfare measures:
(a) Ad exposure frequency: (i) total number of ads seen per day; (ii) ratio of ads to visit duration; (iii) ad composition measures:
- Percentage of brand ads, product ads, or price ads.
- Average price level in price ads.
- Number of unique ads and repeat exposures.
(b) Ad-attributed site visits: unique domain counts that originated from ads. We parse URL parameters (e.g., gclid, fbclid, utm_*) and HTTP referrers to identify clicks that originated from ads.
(c) Overall browsing duration by site category (overall, publisher websites, shopping websites, search engines, social media).
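The ad-attribution rule in item (b) can be sketched as follows. The exact parameter list and the helper names are illustrative; the production classifier would also inspect HTTP referrers, as described above:

```python
from urllib.parse import urlparse, parse_qs

# Query parameters that typically identify an ad-originated click
# (illustrative list; the actual classifier may use more signals).
AD_PARAMS = ("gclid", "fbclid", "msclkid")
AD_PARAM_PREFIXES = ("utm_",)

def is_ad_attributed(url: str) -> bool:
    """Return True if the URL carries an ad-click identifier."""
    query = parse_qs(urlparse(url).query)
    for key in query:
        if key in AD_PARAMS or key.startswith(AD_PARAM_PREFIXES):
            return True
    return False

def unique_ad_domains(urls):
    """Count unique domains among ad-attributed visits."""
    return len({urlparse(u).netloc for u in urls if is_ad_attributed(u)})
```

The unique-domain count then feeds directly into the ad-attributed site-visit outcome.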
Since most browsing metrics have long right tails, we will normalize them relative to each participant’s Week 1 baseline as follows. For each metric M (e.g., minutes, unique domains, ads/day), let M_{i0} be participant i’s Week 1 average per active day and let M_{i1} be the corresponding average in Weeks 2–3. Our normalized outcome is the log change:
\widetilde M_i \equiv \log(1+M_{i1}) - \log(1+M_{i0}),
which is well-defined with zeros and can be interpreted approximately as a percent change for small changes.
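A minimal sketch of this normalization (variable names and values are hypothetical):

```python
import numpy as np

def log_change(baseline, followup):
    """Normalized outcome: log(1 + M_i1) - log(1 + M_i0).
    Well-defined at zero and approximately a percent change
    when the change is small."""
    return np.log1p(followup) - np.log1p(baseline)

# Hypothetical per-participant averages per active day:
# Week 1 baseline vs. Weeks 2-3.
minutes_wk1 = np.array([120.0, 0.0, 45.0])
minutes_wk23 = np.array([100.0, 10.0, 45.0])
m_tilde = log_change(minutes_wk1, minutes_wk23)
```

Note that a participant with zero baseline activity still gets a finite normalized outcome, which is the point of the log(1+M) transform.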
Our first primary analysis aims to understand the causal effect of switching participants away from their status quo setting (i.e., “gaining experience”) on their preferences for personalized advertising. Define preference for personalized ads as Y1 and the ad-setting switch as E. Since the switch is not random, we adopt an IV specification to estimate the treatment-on-treated effect, using treatment assignment (T) as the instrument:
E_i = \pi_0 + \pi_1 T_i + X_i'\rho + u_i,\tag{1}
\Delta Y1_{i} = \beta_1 + \beta_2 \hat E_i + X_i'\beta_3 + \varepsilon_i,\tag{2}
|\Delta Y1_{i}| = \beta_4 + \beta_5 \hat E_i + X_i'\beta_6 + \varepsilon_i,\tag{3}
where \Delta Y1_i \equiv Y1_{i1} - Y1_{i0} is the change in ad-setting preferences from baseline to endline, and X_i is a vector of consumer characteristics, including baseline browsing intensity, demographics, and whether the participant previously held incorrect beliefs about their current ad settings. In Equation (2), we examine whether gaining experience in non-preferred settings shifts average preferences for personalized ads. In Equation (3), we examine whether gaining experience leads to more frequent preference updates.
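The two-stage procedure in Equations (1)–(2) can be sketched with a manual 2SLS on simulated data. The data-generating process, coefficient values, and sample size below are purely illustrative:

```python
import numpy as np

def two_sls(y, endog, instrument, exog):
    """Manual 2SLS. First stage: endog ~ [1, instrument, exog]
    (Eq. 1). Second stage: y ~ [1, fitted endog, exog] (Eq. 2).
    Returns the coefficient on the fitted endogenous regressor
    (beta_2 in Eq. 2)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), instrument, exog])   # first-stage design
    pi, *_ = np.linalg.lstsq(Z, endog, rcond=None)
    endog_hat = Z @ pi                                    # fitted E_i
    X2 = np.column_stack([np.ones(n), endog_hat, exog])   # second-stage design
    beta, *_ = np.linalg.lstsq(X2, y, rcond=None)
    return beta[1]

# Simulated check: random assignment T shifts the switch E, which is
# otherwise endogenous (driven by u); the true effect of E on dY1 is 0.5.
rng = np.random.default_rng(0)
n = 20_000
T = rng.integers(0, 2, n)
X = rng.normal(size=n)
u = rng.normal(size=n)
E = (0.2 + 0.6 * T + 0.1 * X + u > 0.5).astype(float)
dY1 = 0.5 * E + 0.2 * X + 0.3 * u + rng.normal(size=n)
beta2_hat = two_sls(dY1, E, T, X)
```

In practice one would use a standard IV routine that also delivers correct standard errors; the manual version is only meant to make the two stages explicit.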
Our second primary analysis aims to understand the causal effect of personalized advertising on subjective and objective welfare measures. Define the welfare measures as Y2 and having the personalized ad setting on as D. Since having the personalized ad setting on is not random, we adopt an IV specification to estimate the treatment-on-treated effect.
Instrument definition: Our BDM design nudges participants toward their less-preferred setting, which means the treatment assignment has opposite effects on D depending on participants' baseline preferences. Specifically:
- For participants who prefer personalization OFF (i.e., their less-preferred setting is "personalized ads on"), the high-incentive nudge makes D=1 more likely.
- For participants who prefer personalization ON (i.e., their less-preferred setting is "personalized ads off"), the high-incentive nudge makes D=0 more likely.
This creates a potential violation of the standard IV monotonicity assumption, since the instrument does not push all individuals' D in the same direction. To address this, we adopt one of the following approaches:
Approach A (Separate regressions by baseline preference): We estimate separate IV regressions for each baseline preference group. Let B_i = 1 if participant i prefers personalized ads ON at baseline, and B_i = 0 otherwise. We estimate:
For participants with B_i = 0 (prefer personalization OFF at baseline):
D_i = \pi_2^{(0)} + \pi_3^{(0)} T_i + X_i'\rho^{(0)} + u_i,\tag{4a}
\Delta Y2_{i} = \gamma_1^{(0)} + \gamma_2^{(0)} \hat D_i + X_i'\gamma_3^{(0)} + \varepsilon_i,\tag{5a}
For participants with B_i = 1 (prefer personalization ON at baseline):
D_i = \pi_2^{(1)} + \pi_3^{(1)} T_i + X_i'\rho^{(1)} + u_i,\tag{4b}
\Delta Y2_{i} = \gamma_1^{(1)} + \gamma_2^{(1)} \hat D_i + X_i'\gamma_3^{(1)} + \varepsilon_i,\tag{5b}
Within each subgroup, the instrument satisfies monotonicity: T unambiguously increases D for the B_i=0 group and decreases D for the B_i=1 group. The coefficient \gamma_2^{(0)} captures the LATE of turning personalization ON among compliers who prefer it OFF, while \gamma_2^{(1)} captures the LATE of turning personalization ON among compliers who prefer it ON.
Approach B (Pooled regression with interaction terms): Alternatively, we estimate a pooled specification with separate first-stage instruments by baseline preference:
D_i = \pi_2 + \pi_3 (T_i \times B_i) + \pi_4 (T_i \times (1-B_i)) + \pi_5 B_i + X_i'\rho + u_i,\tag{4c}
\Delta Y2_{i} = \gamma_1 + \gamma_2 \hat D_i + X_i'\gamma_3 + \varepsilon_i,\tag{5c}
This approach uses two instruments (T_i \times B_i and T_i \times (1-B_i)) that push D in opposite directions, allowing the first stage to correctly capture how treatment affects personalization status for each group.
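A small simulation can illustrate how the two interaction instruments in Equation (4c) pick up opposite-signed first-stage effects. All compliance probabilities below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
B = rng.integers(0, 2, n)   # 1 if participant prefers personalization ON at baseline
T = rng.integers(0, 2, n)   # high-incentive nudge toward the less-preferred setting

# The nudge raises D for the B=0 group and lowers D for the B=1 group
# (probabilities are illustrative, not calibrated to the experiment).
p = 0.3 + 0.4 * T * (1 - B) + 0.4 * B - 0.4 * T * B
D = (rng.random(n) < p).astype(float)

# First stage of Eq. (4c): columns are [1, T*B, T*(1-B), B].
Z = np.column_stack([np.ones(n), T * B, T * (1 - B), B])
pi, *_ = np.linalg.lstsq(Z, D, rcond=None)
# pi[1] (on T×B) should be negative; pi[2] (on T×(1-B)) positive.
```

Because each instrument moves D in one direction within its own group, the pooled first stage remains well-behaved even though the nudge is not monotone in the full sample.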
We prefer Approach A (separate regressions) because it yields a cleaner interpretation of causal effects and avoids the functional-form assumptions required for pooling. However, if Approach A lacks sufficient power (for example, if one preference group is much smaller), we will fall back to Approach B as the primary specification.
For inference across multiple outcomes, we check robustness using Anderson q-values within each family of outcomes. When analyzing platforms separately, we treat each platform as its own family (i.e., adjustments are platform-specific).
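For intuition, a plain Benjamini–Hochberg q-value routine is sketched below. Anderson's sharpened q-values refine this with a two-stage estimate of the null proportion; BH appears here only as a simplified stand-in:

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg step-up q-values for one family of outcomes.
    (Anderson's sharpened q-values tighten these by estimating the
    share of true nulls; the BH version is shown for simplicity.)"""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity: each q-value is the min over larger p-values.
    q = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(q, 0, 1)
    return out
```

When platforms are analyzed separately, this routine would be applied once per platform-specific family of p-values.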
Our third primary analysis examines the role of information frictions/inertia in adhering to ad settings that differ from participants’ preferences. We define \Delta D_{i0} as an indicator that a participant switched to their baseline preferred setting during the baseline tracking period (Week 1), and \Delta D_{i1} as an indicator that they switched to their baseline preferred setting during the intervention period (Weeks 2–3), after they are aware of how to switch ad settings on each platform; \Delta D_{i1} takes the value 1 even if a participant switched at some point but switched back later. We then regress \Delta D_{i1} - \Delta D_{i0} on the information-treatment indicator, using the information treatment and control groups as the sample. The regression coefficient is the causal effect of the information treatment on switching to participants' preferred settings.
In addition to the reduced-form models above, we plan to estimate a model of consumer ad-setting decisions to formally characterize the roles of experience and behavioral frictions. The model would allow us to simulate outcomes from counterfactual policies when consumers are provided with additional information/experience and when the behavioral friction is reduced.