Experimental Design Details
Our next objective is to analyze information silos quantitatively. We will follow procedures from \cite{allcott2022digital} to deploy the study and \cite{zhang2020frontiers} to analyze behavioral data.
In the initial phase, we will establish screening criteria to ensure that participants are regular users of TikTok, and we will recruit interested individuals through Prolific. Each participant will then complete a comprehensive demographic survey covering age, gender, residence (urban or rural), education level, income, and political ideology. To understand participants' exposure to news and information, the survey will also ask about travel frequency and main source of information, since consumers may encounter a broader range of information if they travel more often or if they rely on the Internet rather than on newspapers or television.
Following recruitment, we will enter Stage 1, which focuses on screen recording. Participants will receive a detailed guide outlining how to record their TikTok browsing behavior, including instructions for screen recording and uploading videos to a secure cloud storage system. For a duration of 7 consecutive days, participants will record their TikTok usage for 3 minutes each day. Concurrently, they will complete surveys measuring mediators such as their awareness of information silos, intrapersonal variety, and interpersonal differences.
Afterward, we will analyze the recorded videos. We will use both human coders and algorithms to label content topics and assess the presence of information silos. The human coders will be hired from the university's lab. Given the scale of the experiment, we will also consider deploying a large language model, in particular GPT-4 \citep[see][for a technical review]{achiam2023gpt}, with DSPy fine-tuning \citep{khattab2023dspy} and prompt engineering \citep[see, e.g.,][for example cases]{wang2023review}, to label video recordings in parallel. From these recordings, we will compute the metrics discussed in Section \ref{subsec:index}: intrapersonal variety, which reflects the diversity of content consumed by a user; interpersonal differences, which represent the variation in content consumption across users; and the information silo index, a measure that incorporates the two preceding metrics. We will then correlate these indices with demographics to identify potential patterns and trends.
Stage 2 will commence seven days after the initial data collection, introducing treatment interventions and further exploration of mediators related to information silos.
As an alternative to the three information silo measures, and as a robustness check, we will collect survey responses on intrapersonal variety and interpersonal differences. For intrapersonal variety, participants will rate the following statements:
\begin{enumerate}
\item I tend to explore a wide range of topics when browsing social media.
\item I enjoy encountering new perspectives and ideas on social media.
\item When encountering content on social media that challenges my beliefs or opinions, I often seek more information from other sources.
\item When browsing social media, I often explore topics that are outside of my usual interests.
\end{enumerate}
For interpersonal differences, participants will rate the following statements:
\begin{enumerate}
\item I frequently come across content or opinions on social media that are different from mine.
\item I tend to agree with others when encountering different viewpoints or opinions on social media that significantly differ from mine.
\item I enjoy reading different viewpoints or opinions on social media that significantly differ from mine.
\item My friends/family members and I often encounter different topics or opinions on social media.
\end{enumerate}
At the end of this stage, we will test several mediators to understand the psychological impacts of information silos by asking the participants to rate the following statements:
Mediator 1: Information overload
\begin{enumerate}
\item It is easier for me to process information that is consistent with my existing beliefs.
\item Information that is not consistent with my existing beliefs makes me feel overloaded.
\end{enumerate}
Mediator 2: Cognitive cost
\begin{enumerate}
\item Incurs cognitive conflict and psychological discomfort.
\item Discomfort with information that conflicts with existing beliefs.
\end{enumerate}
Mediator 3: Self-identity
\begin{enumerate}
\item Strengthens my group identity.
\item Preference for information consistent with my social identity.
\end{enumerate}
Subsequently, participants will be randomly assigned to one of three treatment groups or to a control group. We will implement an information treatment \citep[\`a la][]{han2023interest}, providing each participant with accurate information about their own browsing behavior in order to disrupt their information silos. Participants in the treatment groups will receive daily messages tailored to their treatment status, while the control group will receive only a reminder to upload their video, which serves as a placebo. The three treatment messages are as follows:
\begin{enumerate}
\item Baseline information treatment: ``Out of 14 video topics, you have covered \# topics.''
\item Information treatment plus scientific evidence: ``Out of 14 video topics, you have covered \# topics. Scientific research has found that it's beneficial to be exposed to broader sources of information.''
\item Information treatment plus peer comparison: ``Out of 14 video topics, you have covered \# topics. \#\% of participants viewed a larger variety of topics than you.''
\end{enumerate}
The statistic in the peer comparison treatment is calculated from the participant's percentile in the full sample's $\IS$ index distribution.
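As an illustration only, writing $\IS_i$ for participant $i$'s index and $N$ for the number of participants (notation introduced here, not taken from Section \ref{subsec:index}), the share reported in the message could be computed as
\[
p_i = \frac{100}{N-1} \sum_{j \neq i} \mathbf{1}\{\IS_j < \IS_i\},
\]
under the assumption that a lower value of $\IS$ corresponds to a larger variety of topics viewed. The inequality would be reversed under the opposite convention.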
As in Stage 1, participants will again receive a guide to record their TikTok browsing behavior for 3 minutes per day over an additional 7 days, with daily reminders to encourage compliance. The videos recorded during this stage will also be labeled for content analysis. We will compute the same indices as in Stage 1: intrapersonal variety, interpersonal differences, and the information silo index. Additionally, we will calculate the Conditional Average Treatment Effect (CATE) to assess the impact of the different treatments on participants' awareness and behavioral responses.
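For concreteness, one standard way to write this estimand, in notation introduced here for illustration, is
\[
\tau_k(x) = \mathrm{E}\left[\, Y_i(k) - Y_i(0) \mid X_i = x \,\right], \qquad k \in \{1, 2, 3\},
\]
where $Y_i(k)$ denotes participant $i$'s potential outcome under treatment arm $k$ (for example, the Stage 2 information silo index), $Y_i(0)$ denotes the potential outcome under control, and $X_i$ collects baseline covariates such as those from the demographic survey. The choice of outcome and conditioning variables shown here is an assumption made for exposition.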
Upon completing the entire study, each participant will receive compensation of US\$10. This incentive is designed to encourage full participation and ensure a high completion rate across all stages of the research.
Based on a power calculation using the pilot study, our target sample size is 400 participants per treatment arm. The calculation assumes a mean information silo index of 0.35 with a standard deviation of 0.1 and a minimum detectable effect of 0.02. Anticipating a survey yield rate of 30\%, as observed in our pilot study, we aim to distribute invitations to 2,500 individuals on Prolific.
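As a rough check on the per-arm figure, the standard formula for comparing two means gives a similar number if one assumes a two-sided test at the 5\% level and 80\% power. These conventional values are assumed here for illustration and are not taken from the pilot. With standard deviation $\sigma = 0.1$ and minimum detectable effect $\delta = 0.02$,
\[
n = \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2\,\sigma^2}{\delta^2}
\approx \frac{2\,(1.96 + 0.84)^2\,(0.1)^2}{(0.02)^2} = 392
\]
participants per arm, which is consistent with a rounded target of 400.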