Last registered on July 30, 2019

Trial Information

Name

Affiliation

University of Pennsylvania

Status

Completed

Start date

2019-05-06

End date

2019-07-15

Keywords

Additional Keywords

JEL code(s)

Secondary IDs

Abstract

The trial will compare three perspectives on guiding the search for households infested with Triatoma infestans, a vector of Chagas disease.

Arm 1. Raw Data: raw data (historical infestation, participation in insecticide application campaigns) will be presented to searchers.

Arm 2. Risk Map (Vectorpoint): a risk map is presented to searchers, with incentives for spatial coverage and visits to higher-risk households.

Arm 3. Optimization: households are identified, including alternatives, and assigned to searchers.

External Link(s)

Citation

Levy, Michael. 2019. "A 'Poker Trial' of alternative approaches to search a city for Triatomine vectors of Chagas Disease." AEA RCT Registry. July 30. https://doi.org/10.1257/rct.4127-2.0.

Former Citation

Levy, Michael. 2019. "A 'Poker Trial' of alternative approaches to search a city for Triatomine vectors of Chagas Disease." AEA RCT Registry. July 30. http://www.socialscienceregistry.org/trials/4127/history/50880.

Experimental Details

Intervention(s)

Three approaches to spatial search for triatomine bugs are compared:

I. Vectorpoint: an app that shows a risk map based on a spatiotemporal INLA model. Risk levels are presented as quintiles. Spatial coverage of the search zone is measured using Delaunay triangulation, in which each inspected household forms a vertex of the triangles. The size of the largest triangle (measured as the number of un-inspected households within it) is our measure of coverage.

II. Control: an app that shows historical data on households, including prior infestation status and participation in the 'attack' phase of the vector control campaign.

III. Optimization: an app that assigns inspectors to households sequentially. The assigned household is shown as a blue dot. The optimization algorithm behind this application balances spatial coverage and visits to higher-risk households (as defined by the same model behind Vectorpoint).
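The coverage measure described for Vectorpoint can be illustrated with a minimal sketch. It assumes planar (x, y) coordinates for households, a brute-force Delaunay construction suitable only for small examples, and hypothetical function names; it is not the implementation used in the app.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if the points are collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_triangles(points):
    """Brute force: keep every triple whose circumcircle has no other point strictly inside."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
               for m, (px, py) in enumerate(points) if m not in (i, j, k)):
            triangles.append((i, j, k))
    return triangles


def in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle abc."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def largest_triangle_count(inspected, uninspected):
    """Largest number of un-inspected households inside any Delaunay triangle of inspected ones."""
    best = 0
    for i, j, k in delaunay_triangles(inspected):
        count = sum(in_triangle(p, inspected[i], inspected[j], inspected[k])
                    for p in uninspected)
        best = max(best, count)
    return best


# Hypothetical coordinates, for illustration only.
inspected = [(0, 0), (10, 0), (0, 10), (10, 11)]
uninspected = [(1, 1), (2, 2), (3, 1), (8, 8)]
coverage = largest_triangle_count(inspected, uninspected)
```

A smaller value means better coverage: no large pocket of un-inspected households remains between inspected ones.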

Intervention Start Date

2019-05-06

Intervention End Date

2019-07-15

Primary Outcomes (end points)

We are running a 'Poker Trial' in which arms compete directly with the control on a set of paired intervention areas. The 'hands' are valued as follows:

Infestation: An inspector uncovers an infested household. Base points = 500, with an additional 50 points for each additional infested house discovered.

Risk & Coverage: The average risk of searched households is >=4 (on a 5-point scale) and the spatial coverage is less than or equal to 5% of the total number of households. Base points = 300. Additional points: 5 per reduction of 1 house in the largest triangle, and 1, 2, 3, 4, 5 for visits to houses in risk quintiles 1-5 respectively.

Risk alone: Mean risk of households visited is >=4, but spatial coverage is not less than 5% of the total number of houses. Base points = 100, with additional points of 1, 2, 3, 4, 5 for each risk level.

Spatial coverage alone: The largest Delaunay triangle is <5% of the total number of houses. Base points = 100, with 5 additional points for each reduction in size of the largest triangle.
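The hand values above can be sketched as a small scoring function. Several details are our assumptions rather than statements from the registration: only the highest-ranked qualifying hand is scored, the "reduction" in the largest triangle is counted in houses below the 5% threshold (rounded down), the coverage condition uses "less than or equal to" for both coverage hands, and the risk bonus is the sum of the quintiles of the houses visited. The function and hand names are hypothetical.

```python
import math


def score_hand(n_infested, visited_quintiles, largest_triangle, total_houses):
    """Return (hand_name, points) for one search zone under the assumptions above."""
    threshold = 0.05 * total_houses  # coverage threshold: 5% of houses in the zone
    mean_risk = (sum(visited_quintiles) / len(visited_quintiles)
                 if visited_quintiles else 0.0)
    high_risk = mean_risk >= 4
    covered = largest_triangle <= threshold  # assumption: <= for both coverage hands

    # Assumption: a house in quintile q earns q points.
    risk_bonus = sum(visited_quintiles)
    # Assumption: 5 points per house below the (rounded-down) threshold.
    coverage_bonus = 5 * max(0, math.floor(threshold) - largest_triangle)

    if n_infested >= 1:
        return "infestation", 500 + 50 * (n_infested - 1)
    if high_risk and covered:
        return "risk_and_coverage", 300 + coverage_bonus + risk_bonus
    if high_risk:
        return "risk_alone", 100 + risk_bonus
    if covered:
        return "coverage_alone", 100 + coverage_bonus
    return "no_hand", 0


# Hypothetical zone of 250 houses: the threshold is 12.5 houses.
example = score_hand(n_infested=0, visited_quintiles=[5, 4, 4, 5],
                     largest_triangle=10, total_houses=250)
```

Under these assumptions, a hand with mean risk 4.5 and a largest triangle of 10 houses in a 250-house zone scores 300 + 10 + 18 = 328 points.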

Primary Outcomes (explanation)

Secondary Outcomes (end points)

Difference in the average estimated probability of infestation of households before and after inspections.

Secondary Outcomes (explanation)

The spatial-temporal model predicts the probability of infestation for each house, conditional on covariates (prior infestation, prior participation in insecticide campaigns), spatial position (a spatially correlated random effect estimated through INLA) and an intercept.

Difference in Difference Analysis of the estimated mean probability of infestation.

The average probability of infestation for an area is estimated by a spatiotemporal statistical model [Gutfraind & Peterson et al.] (https://journals.plos.org/plosntds/article?id=10.1371/journal.pntd.0006883) using the history of infestation in the area. We obtain the difference in the average probability for each area as follows:

Difference in average probability of infestation = average probability of infestation before inspection - average probability of infestation after inspection

We consider five potential covariates:

1. The number of houses in the search zone

- An integer value

2. The number of houses that were previously infested during the vector control campaign conducted by the Arequipa Regional Ministry of Health, including the insecticide application (attack) phase and the post-treatment surveillance phase

- An integer value

3. The percent of houses that were not previously sprayed with insecticide during the vector-control spray campaign conducted by the Arequipa Regional Ministry of Health

- A decimal value between 0 and 1

4. The area of the convex hull enclosing the search zone

- A real number value, in square meters

- This was estimated using the “chull” command that is part of the grDevices package in R

5. The index of dissimilarity [White 1986] (https://www.jstor.org/stable/pdf/3644339.pdf) in the search zone. This measures the clustering of the higher-risk houses (those within the top two quintiles, which are shown as red and dark red in the Vectorpoint application). The index is potentially useful because in areas in which the index of dissimilarity is low, and high-probability houses are near to each other, it may be easier to decrease the average probability of infestation due to the effect of the spatial autocorrelation term in the model.

- A real number value

- The index of dissimilarity in a given search zone is estimated by first dividing the search zone into 5 areas using k-means clustering (k = 5) with the “kmeans” function that is part of the stats package in R. We will then calculate the index of dissimilarity using the following formula adapted from [White 1986] (https://www.jstor.org/stable/pdf/3644339.pdf): $D = \frac{1}{2} \sum_{i=1}^{n} \left| \frac{h_i}{H_T} - \frac{l_i}{L_T} \right|$, where i indexes the clusters identified by the k-means algorithm (n = 5), h_i is the number of houses within the two highest quintiles of probability of household infestation in cluster i, H_T is the total number of houses within the two highest quintiles in the entire search zone, l_i is the number of houses within the lowest three quintiles in cluster i, and L_T is the total number of houses within the lowest three quintiles (the lowest 60% of probabilities) in the entire search zone.
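As an illustration of the index of dissimilarity in covariate 5, the following minimal sketch assumes that cluster labels (for example from k-means) and a high-risk flag (top two quintiles) are already available for each house. The function name and inputs are hypothetical, and the planned analysis itself is in R.

```python
from collections import Counter


def dissimilarity_index(cluster_labels, is_high_risk):
    """D = 1/2 * sum over clusters of |h_i / H_T - l_i / L_T|."""
    high = Counter()  # h_i: high-risk houses per cluster
    low = Counter()   # l_i: lower-risk houses per cluster
    for cluster, flag in zip(cluster_labels, is_high_risk):
        if flag:
            high[cluster] += 1
        else:
            low[cluster] += 1
    h_total = sum(high.values())  # H_T
    l_total = sum(low.values())   # L_T
    if h_total == 0 or l_total == 0:
        raise ValueError("need at least one high-risk and one lower-risk house")
    clusters = set(high) | set(low)
    return 0.5 * sum(abs(high[c] / h_total - low[c] / l_total) for c in clusters)


# Hypothetical zone with two clusters, for illustration only.
segregated = dissimilarity_index([0, 0, 1, 1], [True, True, False, False])
mixed = dissimilarity_index([0, 0, 1, 1], [True, False, True, False])
```

With all high-risk houses in one cluster the index is 1; with high-risk and lower-risk houses spread identically across clusters it is 0.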

We will conduct univariate analyses testing for associations between each variable and the difference in the mean probability of infestation before and after the search. Each of these tests will be done using the “glm” function of the stats package in R. Those variables found to have an association with a p-value of <.2 will be kept for consideration in the full difference-in-difference model.

Any zone in which infestation is detected during the trial will be excluded from the difference-in-difference analysis (because the mean probability of infestation will increase in that zone due to the new information).

Experimental Design

Inspectors have all previously been trained in entomological search in Arequipa and in the use of the three apps. Each is allowed to search a search zone for one week (6 hours/day, 30 hours/week). In the Vectorpoint arm they are awarded points (as described above), which can be used for paid time off (1000 points = a day off). In the control and optimization arms they receive points only for identifying infested households. Personnel are also instructed to assess infestation with bed bugs and are awarded 50 points for each confirmed identification.

Searchers are interviewed once under each arm at the end of the week to record their search strategies under each approach.

All infestations are confirmed by the field manager and reported immediately to the Ministry of Health for treatment in the case of triatomine infestations.

Experimental Design Details

Randomization Method

Search areas were matched (into triplets) on two covariates: the number of previously infested houses in the zone and the district in which the zone lies. Optimal tripartite matching algorithms were run in R.

Randomization Unit

Search zones consisted of approximately 250 (210-280) households. These were defined, blinded to arms, based on political and geophysical barriers in the landscape.

Was the treatment clustered?

Yes

Sample size: planned number of clusters

9 searchers

Sample size: planned number of observations

54 search zones. Observations are on the level of search zone.

Sample size (or number of clusters) by treatment arms

18

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

For the poker trial we simply compare which arm wins each hand (although the three arms are run together, we compare them pairwise). The outcome for each 'hand' is binary (0/1). If an intervention beats the control in at least 13 of 18 hands, the p-value will be <.05. Accounting for multiple comparisons, an arm would have to beat any other arm at least 14 times (p=0.015).
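The thresholds above can be checked with a short calculation. This sketch assumes a one-sided binomial (sign) test over 18 hands with a win probability of 0.5 under the null hypothesis; that is our reading of the text, not a test the registration specifies. The function name is hypothetical.

```python
from math import comb


def binomial_tail(wins, hands, p=0.5):
    """P(X >= wins) for X ~ Binomial(hands, p)."""
    return sum(comb(hands, k) * p ** k * (1 - p) ** (hands - k)
               for k in range(wins, hands + 1))


p13 = binomial_tail(13, 18)  # about 0.048
p14 = binomial_tail(14, 18)  # about 0.015
```

Under this assumption, 13 wins out of 18 falls just under the 0.05 level, while 14 wins is needed to fall under 0.05/3, an assumed Bonferroni-style adjustment for three pairwise comparisons.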

IRB

INSTITUTIONAL REVIEW BOARDS (IRBs)

IRB Name

University of Pennsylvania

IRB Approval Date

2019-04-24

IRB Approval Number

824603

IRB Name

Universidad Peruana Cayetano Heredia

IRB Approval Date

2018-06-26

IRB Approval Number

66427

Post Trial Information

Is the intervention completed?

No

Is data collection complete?

Data Publication

Is public data available?

No

Program Files