Primary Outcomes (explanation)
Social media activity: At the sender level, we measure the extent to which SMIs made posts containing the treatment information, as well as the impressions and engagement their posts received. We similarly measure the impressions and engagement received by the Twitter ad posts containing the treatment information.
At the follower level, we measure the extent to which SMI followers posted, shared, replied to, and quoted others' posts. We refer to these forms of engagement collectively as ``posts'', since we are unable to distinguish between them for the subset of data scraped through traditional techniques following the Twitter API's price changes.
Verifiable content: We trained a machine learning model to label followers' posts as verifiable or non-verifiable. Using a sample of posts labeled with Africa Check's help, we trained a binary classification model based on the Bidirectional Encoder Representations from Transformers (BERT) architecture; it labeled posts as either verifiable or non-verifiable with a validation accuracy of 85%. Beyond being of interest in itself, identifying verifiable posts allowed us to subsequently label them as fake or true, as we explain next.
Fake/true content: We developed an additional model to label followers' verifiable posts as approximating either fake or true content. This binary classification model was trained on posts labeled as ``fake'' by Africa Check in their fact checks or as ``true'' if they originated from reputable news sources, such as established newspapers. As with the verifiability model, we used the BERT architecture; this model had a validation accuracy of 84%.
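The two classifiers form a cascade: every post is first screened for verifiability, and only the verifiable subset is passed to the fake/true model. The sketch below illustrates that logic with toy stand-in predicates in place of the fine-tuned BERT models; the function names, example posts, and decision rules are all hypothetical.

```python
from typing import Callable, Dict, List

def label_posts(
    posts: List[str],
    is_verifiable: Callable[[str], bool],
    is_fake: Callable[[str], bool],
) -> Dict[str, str]:
    """Two-stage labeling: screen for verifiability first, then
    apply the fake/true classifier to the verifiable subset only."""
    labels = {}
    for post in posts:
        if not is_verifiable(post):
            labels[post] = "non-verifiable"
        else:
            labels[post] = "fake" if is_fake(post) else "true"
    return labels

# Toy stand-ins for the two fine-tuned BERT classifiers.
posts = [
    "vaccines contain microchips",
    "nice weather today",
    "the election was held in august",
]
verifiable = lambda p: "weather" not in p   # hypothetical rule
fake = lambda p: "microchips" in p          # hypothetical rule
print(label_posts(posts, verifiable, fake))
```

In practice each predicate would be replaced by a call to the corresponding fine-tuned model, but the cascade structure is the same.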
URL source: Since information and misinformation are often not written explicitly in posts but shared through links, we analyze the URLs shared by followers. To do this, we first extracted all URLs from the posts, grouped them by domain name, and manually classified the domains as follows. First, we distinguish between links to information sources (e.g., newspapers, fact checks, blogs) and other websites that do not provide information (e.g., gambling websites). Second, among information sources, we distinguish between reliable and non-reliable news websites, fact checks, and other information sources.
To categorize African news websites, we first leveraged a dataset from Africa Check that classified reliable and non-reliable news websites. Additionally, Africa Check reviewed and classified an extra batch of African news websites. For global news websites, we used several sources, including NewsGuard and MediaBias.
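A minimal sketch of the extraction step, assuming a hand-built lookup from domain to category (the domains and categories below are invented for illustration; the real mapping came from Africa Check, NewsGuard, and MediaBias):

```python
import re
from urllib.parse import urlparse

# Hypothetical domain-to-category map standing in for the
# manually classified dataset described in the text.
DOMAIN_CATEGORIES = {
    "africacheck.org": "fact check",
    "example-news.com": "reliable news",
    "dailyrumour.example": "non-reliable news",
    "lucky-bets.example": "non-information",
}

URL_PATTERN = re.compile(r"https?://\S+")

def classify_urls(post: str) -> list:
    """Extract URLs from a post, reduce each to its domain name,
    and look up the domain's manually assigned category."""
    categories = []
    for url in URL_PATTERN.findall(post):
        domain = urlparse(url).netloc.lower().removeprefix("www.")
        categories.append((domain, DOMAIN_CATEGORIES.get(domain, "unclassified")))
    return categories

print(classify_urls(
    "Debunked: https://africacheck.org/fact-checks/x "
    "via https://www.dailyrumour.example/story"
))
```

Grouping by domain rather than full URL keeps the manual classification task tractable, since one label covers every link to the same site.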
Topic sentiments: We analyze the effect of our intervention on the sentiment followers express towards topics that are commonly the subject of misinformation, such as COVID-19, COVID-19 vaccines, and politics. We first identified and labeled tweets related to a given topic. For example, in the case of COVID-19 and COVID-19 vaccines, we manually defined a set of keywords, such as ``coronavirus'', ``COVID-19'', ``mRNA'', and ``Pfizer'', and filtered all posts containing these terms. From this subset of posts, we extracted the most frequent words (excluding the initial set of keywords) to ensure that no relevant terms related to the topic were excluded. We then identified the set of posts containing any of these terms.
We employ two different sentiment analysis models to predict the sentiment of each post as positive, negative, or neutral. Our primary analysis uses a BERT model, trained on a comprehensive dataset sourced from https://www.kaggle.com/datasets/datatattle/covid-19-nlp-text-classification, to predict the sentiment of each tweet. Additionally, for robustness, we consider VADER (https://vadersentiment.readthedocs.io/en/latest/), a lexicon- and rule-based sentiment analysis tool specifically designed to detect sentiments expressed on social media.
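VADER returns a continuous compound score in [-1, 1], which must be discretized into the three labels. A minimal sketch of that mapping, using the threshold of 0.05 conventionally suggested in VADER's documentation (whether the study used these exact cutoffs is an assumption):

```python
def sentiment_label(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a discrete label,
    using the +/-0.05 thresholds suggested in VADER's docs."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

# With the vaderSentiment package installed, the compound score
# would come from:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   compound = SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
print(sentiment_label(0.6), sentiment_label(-0.4), sentiment_label(0.0))
```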
Social interactions with follower content: We use data on social interactions with followers' posts to describe how popular each post was among the followers' own audiences. We focus exclusively on original tweets, not retweets, because for retweets the API provides interaction metrics only for the original tweet.
To conduct this analysis, we computed total interactions (likes, shares, comments, and retweets) with each follower's posts, as well as those specific to different types of posts (such as verifiable and non-verifiable posts) and posts with different sentiments about various topics (such as about COVID-19 and the COVID-19 vaccines).
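The aggregation above amounts to summing interaction counts per follower, optionally restricted to a subset of posts. A minimal sketch, with hypothetical record fields (the field names below are illustrative, not the Twitter API's schema):

```python
# Hypothetical per-post records for two followers; counts are invented.
posts = [
    {"author": "u1", "type": "verifiable",
     "likes": 3, "shares": 1, "comments": 0, "retweets": 2},
    {"author": "u1", "type": "non-verifiable",
     "likes": 1, "shares": 0, "comments": 1, "retweets": 0},
    {"author": "u2", "type": "verifiable",
     "likes": 5, "shares": 2, "comments": 1, "retweets": 0},
]

METRICS = ("likes", "shares", "comments", "retweets")

def total_interactions(posts, post_type=None):
    """Sum all interaction counts per follower, optionally restricted
    to one category of post (e.g., verifiable posts only)."""
    totals = {}
    for p in posts:
        if post_type is not None and p["type"] != post_type:
            continue
        totals[p["author"]] = totals.get(p["author"], 0) + sum(p[m] for m in METRICS)
    return totals

print(total_interactions(posts))                # all posts
print(total_interactions(posts, "verifiable"))  # verifiable posts only
```

The same restriction argument would apply equally to topic-sentiment subsets (e.g., negative COVID-19 posts) rather than verifiability categories.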