Experimental Design
Also see attachment:
I. EXPERIMENTAL DESIGN
In our experiment, subjects are exposed to multiple hypothetical persons, one at a time. Each person makes either 1 statement (Block-type 1) or 10 statements (Block-type 10 or Block-type 9+1). For each person, subjects are asked to estimate the probability that 1 statement is true. In Block-type 1, subjects estimate this probability for the only statement shown to them. In Block-type 10 and Block-type 9+1, subjects estimate it for the 10th of the 10 statements shown to them. In total, subjects face 10 hypothetical persons of Block-type 1, 20 hypothetical persons of Block-type 10, and 10 hypothetical persons of Block-type 9+1. While hypothetical persons are grouped according to their block-type, we randomize the order in which subjects are exposed to the block-types.
Subjects know that each hypothetical person can be of a reliable or unreliable type. Each statement of a reliable person is true with probability 80% and false with probability 20%. Each statement of an unreliable person is false with probability 80% and true with probability 20%. Each hypothetical person is equally likely to be reliable or unreliable, and types are not communicated to subjects. Thus, we induce a 50-50 prior regarding the truthfulness of each statement. To maximize control over this baseline truthfulness, statements are abstract in the sense that they carry no content in our setup.
When subjects are exposed to the hypothetical persons and their statements, they also know that each statement may or may not have been checked by a fact checker with a known probability (as described below). While the fact checker labels each statement that she checked and determined to be false with the label “false”, she does NOT label statements that she checked and determined to be true. Subjects know that the fact checker determines the truthfulness of a statement without error.
This setup has three implications: (i) statements with a label are definitely false. Statements without a label either (ii) may be true or false because they were not checked by the fact checker, or (iii) are true because the fact checker checked them and determined them to be true.
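For illustration, the Bayesian posterior that a non-labeled statement is true follows directly from the design: a true statement is never labeled, while a false statement escapes a label only if it was not checked (probability 1-p). A minimal sketch, assuming the 50-50 prior from the design (the function name is ours, not part of the experimental materials):

```python
def posterior_true_no_label(p, prior_true=0.5):
    """Bayesian probability that a non-labeled statement is true,
    given checking probability p and a 50-50 prior.

    A true statement is never labeled; a false statement avoids a
    label only if it was not checked (probability 1 - p).
    """
    no_label_if_true = 1.0           # true statements are never labeled
    no_label_if_false = 1.0 - p      # false statements escape only if unchecked
    num = prior_true * no_label_if_true
    den = num + (1.0 - prior_true) * no_label_if_false
    return num / den

# With the 50-50 prior this simplifies to 1 / (2 - p):
print(posterior_true_no_label(0.25))  # Treatment LO: ~0.571
print(posterior_true_no_label(0.75))  # Treatment HI: 0.8
```

The gap between the two treatments is the Bayesian benchmark behind the implied truth effect tested below.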
II. TREATMENTS
Our experiment consists of 2 between-subjects treatments.
Treatment LO:
The statement in Block-type 1 and each statement in Block-type 10 is checked by the fact checker with probability p=0.25. Each of the first 9 statements in Block-type 9+1 is checked by the fact checker with probability p=0.25. The 10th statement in Block-type 9+1 is definitely not checked by the fact checker.
Treatment HI:
The statement in Block-type 1 and each statement in Block-type 10 is checked by the fact checker with probability p=0.75. Each of the first 9 statements in Block-type 9+1 is checked by the fact checker with probability p=0.75. The 10th statement in Block-type 9+1 is definitely not checked by the fact checker.
III. OUTCOMES
see PRIMARY OUTCOMES
IV. HYPOTHESES AND TESTS
We designed our experiment to test 4 predictions:
1. Implied truth
In Block-type 1, subjects receive only 1 statement per hypothetical person. As non-labeled statements may have been checked by the fact checker and determined to be true, but cannot have been checked and determined to be false, Bayesian updating predicts that the probability that they are true is greater in Treatment HI than in Treatment LO, i.e., the implied truth effect. We test this prediction by comparing subjects’ average stated truthfulness of non-labeled statements (OV1) in Block-type 1 between Treatment LO and Treatment HI via OLS regressions. As we have potentially multiple observations per subject, we cluster the standard errors at the subject level.
2. Implied reliability
In Block-type 9+1, subjects receive 9 statements that may or may not have been checked by the fact checker (and hence may or may not be labeled) and 1 statement (the 10th statement) that was definitely not checked by the fact checker. Subjects are asked to estimate the truthfulness of the 10th statement, that is, the statement that was not checked. Bayesian updating predicts that implied reliability plays a role: the more of the 9 statements are labeled as false, the greater the probability that the person is of the unreliable type and, consequently, the greater the probability that the non-checked statement is false. Bayesian updating hence predicts, conditional on the number of labels among the 9 other statements, that statements are more likely to be true in Treatment HI than in Treatment LO.
We test this prediction by comparing subjects’ average stated truthfulness of non-labeled statements (OV1) in Block-type 9+1 between Treatment LO and Treatment HI, conditional on the number of labels among the 9 other statements via OLS regressions. As we have potentially multiple observations per subject, we cluster the standard errors at the subject level.
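The subject-level clustering used in these OLS comparisons can be sketched in a self-contained way. The following is a minimal illustration of a slope estimate with cluster-robust (Liang-Zeger) standard errors for a single regressor, e.g. a Treatment-HI dummy; in practice one would use a statistics package with a cluster-robust covariance option, and packages additionally apply finite-sample corrections that this sketch omits. All names and the example data are ours:

```python
def ols_cluster_se(y, x, groups):
    """OLS of y on [1, x]; returns (slope, cluster-robust SE of slope).

    Clusters (e.g. subject ids) are given in `groups`. No finite-sample
    correction is applied, unlike standard software implementations.
    """
    n = len(y)
    # X'X for X = [1, x], and its 2x2 inverse.
    sx, sxx = sum(x), sum(v * v for v in x)
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    # Coefficients from inv * X'y.
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    resid = [y[i] - b0 - b1 * x[i] for i in range(n)]
    # "Meat": sum over clusters g of (X_g'u_g)(X_g'u_g)'.
    scores = {}
    for i in range(n):
        s = scores.setdefault(groups[i], [0.0, 0.0])
        s[0] += resid[i]
        s[1] += resid[i] * x[i]
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for s in scores.values():
        meat[0][0] += s[0] * s[0]; meat[0][1] += s[0] * s[1]
        meat[1][0] += s[1] * s[0]; meat[1][1] += s[1] * s[1]
    # Sandwich inv * meat * inv; only the slope's variance is needed.
    m10 = inv[1][0] * meat[0][0] + inv[1][1] * meat[1][0]
    m11 = inv[1][0] * meat[0][1] + inv[1][1] * meat[1][1]
    var_b1 = m10 * inv[0][1] + m11 * inv[1][1]
    return b1, var_b1 ** 0.5

# Hypothetical data: 4 subjects ("A".."D"), 2 estimates each,
# with the treatment dummy constant within subject.
y = [0.3, 0.4, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9]
x = [0, 0, 0, 0, 1, 1, 1, 1]
g = ["A", "A", "B", "B", "C", "C", "D", "D"]
print(ols_cluster_se(y, x, g))
```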
3. Relative relevance of implied truth versus implied reliability
While implied truth may affect belief formation in Block-type 1, it should not affect belief formation in Block-type 9+1, since the statement in question is known to be non-checked. While implied reliability may affect belief formation in Block-type 9+1, it should not affect belief formation in Block-type 1, since hypothetical persons there make only the statement in question (that is, other, potentially labeled, statements are missing). In Block-type 10, however, both implied reliability and implied truth may affect belief formation. Via comparisons of Block-type 10 with Block-type 1 and Block-type 9+1, we seek to investigate the relative importance of implied truth and implied reliability. To do this, we conduct 3 separate types of OLS regressions. In each of these OLS regressions, our outcome variable is OV2 as described above.
For Block-type 1, we regress OV2 on ITE defined by ITE=ln(1/(1-p)).
For Block-type 9+1, we regress OV2 on IRE defined by IRE=SEE ATTACHMENT.
For Block-type 10, we regress OV2 on ITE and IRE.
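The regressor ITE=ln(1/(1-p)) has a direct Bayesian interpretation under the design's 50-50 prior (log-odds 0): it is the log-odds that a non-labeled statement is true, so the sigmoid of ITE recovers the posterior 1/(2-p). A short sketch computing ITE for both treatments (function names are ours; IRE is defined in the attachment and not reproduced here):

```python
import math

def ite(p):
    """Implied-truth-effect regressor, ITE = ln(1/(1-p))."""
    return math.log(1.0 / (1.0 - p))

def sigmoid(x):
    """Inverse of the log-odds transform."""
    return 1.0 / (1.0 + math.exp(-x))

# With a 50-50 prior (prior log-odds 0), the Bayesian posterior that a
# non-labeled statement is true equals sigmoid(ITE) = 1/(2-p).
for p in (0.25, 0.75):
    print(p, round(ite(p), 4), round(sigmoid(ite(p)), 4))
```

A Bayesian subject's stated log-odds would thus move one-for-one with ITE, which is why the benchmark weight in the regressions below equals 1.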
The estimated coefficients of the explanatory variable ITE estimate how much weight subjects put on the implied truth effect in their belief formation. The Bayesian benchmark is a weight equal to 1, while an estimated weight <1 would indicate underreaction to the implied truth effect and an estimated weight >1 would indicate overreaction to the implied truth effect.
The estimated coefficients of the explanatory variable IRE estimate how much weight subjects put on the implied reliability effect in their belief formation. The Bayesian benchmark is a weight equal to one, while an estimated weight below one would indicate underreaction to the implied reliability effect and an estimated weight above one would indicate overreaction to the implied reliability effect.
By comparing the estimated coefficients for ITE between Block-type 10 and Block-type 1, we investigate whether subjects differentially weigh the implied truth effect in their belief formation in these environments. By comparing the estimated coefficients for IRE between Block-type 10 and Block-type 9+1, we investigate whether subjects differentially weigh the implied reliability effect in their belief formation in these environments. As we have potentially multiple observations per subject, we cluster the standard errors at the subject level in each type of OLS regressions described above.
4. Discernment
How do the implied truth effect and the implied reliability effect affect discernment? Discernment is defined as the difference between the average belief in true statements and the average belief in false statements and is measured by our OV3. Absent fact checking, and given the 50-50 prior probabilities, discernment should equal 0, since subjects have no meaningful way to distinguish between true and false statements. With fact checking, discernment can differ from 0. Under Bayesian updating, discernment is higher in Block-type 10 than in the other Block-types, because the fact checker provides additional information in Block-type 10. We predict that discernment improves when fact checking is more prevalent, that is, we expect greater discernment in Treatment HI than in Treatment LO. We test this prediction by investigating potential treatment differences (Treatment HI vs. Treatment LO) separately for each Block-type via OLS regressions. As we have potentially multiple observations per subject, we cluster the standard errors at the subject level in each OLS regression.
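For Block-type 1, the Bayesian benchmark for discernment has a closed form: a true statement is never labeled (posterior belief 1/(2-p)), while a false statement is labeled with probability p (posterior belief 0) and otherwise non-labeled, giving discernment p/(2-p). A sketch under these design assumptions (the function name is ours):

```python
def bayesian_discernment_block1(p):
    """Bayesian discernment in Block-type 1: average posterior belief in
    true statements minus average posterior belief in false statements."""
    belief_no_label = 1.0 / (2.0 - p)  # posterior for a non-labeled statement
    belief_true = belief_no_label       # true statements are never labeled
    # False statements: labeled (belief 0) with prob. p, else non-labeled.
    belief_false = p * 0.0 + (1.0 - p) * belief_no_label
    return belief_true - belief_false   # simplifies to p / (2 - p)

print(bayesian_discernment_block1(0.25))  # Treatment LO: ~0.143
print(bayesian_discernment_block1(0.75))  # Treatment HI: 0.6
```

Consistent with the prediction above, the benchmark is strictly larger in Treatment HI than in Treatment LO.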
5. Heterogeneity
After subjects complete all hypothetical persons of a certain block-type, we elicit subjects’ certainty about their estimates in that block-type. Thus, we elicit subjects’ certainty about their estimates of all non-labeled statements in Block-type 1, their estimates of non-labeled statements in Block-type 10, and their estimates of all statements in Block-type 9+1. We use these certainty measures to conduct heterogeneity analyses.
V. EXCLUSION CRITERIA
We consider 3 samples. First, we consider all subjects who participated in our experiment. Then, we plan to conduct 2 sets of separate analyses that focus on different subsamples to test whether our findings are robust to focusing on subjects who paid attention to our experimental instructions. In our 2nd sample, we consider only subjects who assign a probability greater than 0 to labeled statements in fewer than 10% of cases. In our 3rd sample, we make use of a final hypothetical person that we show subjects after they have completed all Block-types. The final hypothetical person makes only 1 statement, and subjects are made aware that this statement cannot be checked by the fact checker. The statement is hence true with probability 50%, which corresponds to the prior probability that we induced. In our 3rd sample, we focus only on subjects who correctly state 50% for the final hypothetical person.