Experimental Design
We randomize all participants into treatment groups that receive varying relative performance feedback about the number of steps they walked in recent weeks. Relative performance feedback is determined as follows:
1. For each day, we sum the step counts of the preceding seven days; this rolling window avoids weekday effects and yields a smoothed value for each day rather than a single-day snapshot. We explain this to participants as: "Every day we create a ranking based on the steps of the last 7 days".
2. Repeating this over 14 days and ranking participants each day yields one ranking value per subject per day. In the plain version (RPF), participants receive the value they held most often during the 14 days ("Your most frequent ranking during the last 14 days was..."); in the embellished version (PRPF), they receive the best of these 14 values ("Your top ranking in the last 14 days was...").
3. Feedback is provided in the form of a table with ten percentile brackets and a marker for one's own actual relative performance. Afterwards, performance (the daily number of steps) continues to be recorded and compared with previous performance.
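The ranking procedure in steps 1-3 can be sketched as follows. This is our own illustration, not the study's implementation: the function names, the data layout (a dict mapping participant id to a list of daily step counts), and the tie-breaking behavior are all assumptions.

```python
# Illustrative sketch of the ranking procedure (names and layout are ours).
from collections import Counter

def rolling_sums(steps, window=7):
    """7-day rolling step sums: one value per day once a full window exists."""
    return [sum(steps[i - window + 1:i + 1]) for i in range(window - 1, len(steps))]

def daily_ranks(all_rolling, day):
    """Rank participants (1 = most steps) by their rolling sum on a given day."""
    ordered = sorted(((v[day], pid) for pid, v in all_rolling.items()), reverse=True)
    return {pid: rank for rank, (_, pid) in enumerate(ordered, start=1)}

def feedback_rank(ranks_over_14_days, variant="RPF"):
    """RPF: most frequent rank over the 14 days; PRPF: best (lowest) rank."""
    if variant == "PRPF":
        return min(ranks_over_14_days)
    return Counter(ranks_over_14_days).most_common(1)[0][0]

def decile_bracket(rank, n_participants):
    """Place a rank into one of ten percentile brackets (1 = top 10%)."""
    return min(10, 1 + (rank - 1) * 10 // n_participants)
```

For example, a participant whose daily ranks over 14 days were mostly 3 but once 1 would be reported rank 3 under RPF and rank 1 under PRPF; the decile bracket then maps that rank into the ten-bracket feedback table.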
The relative performance feedback is varied in three ways:
• We vary between subjects whether the relative performance feedback is based on a participant's most frequent ranking over the last two weeks (RPF) or on their best ranking (PRPF). At the outcome level, PRPF will in many cases give participants more positive feedback than RPF does. At the procedural level, PRPF tells people what they have already achieved at least once, that is, what they can accomplish; RPF, in contrast, reports the description that applies to them most often.
• Furthermore, relative performance feedback is given either in comparison with all other participants (RPF-all/PRPF-all) or in comparison with the 50 participants closest in the age distribution (RPF-age/PRPF-age). The latter arguably creates a more relevant reference group for participants.
• Last, we vary whether subjects are informed how their rank has been determined (RPF/PRPF) or not (RPFno/PRPFno). This allows us to isolate the source of a possible effect: does the ranking itself drive subjects' evaluation and any change in behavior, or does it also matter how the ranking came about?
Participants are randomly assigned to treatment groups (n ≈ 100 participants per group). We designed the following nine groups based on a mixed between-/within-subject design. The intervention phase lasts four weeks, with two interventions taking place at t1 (intervention start) and t2 (t1 + 2 weeks).
[Group 1] t1: no feedback; t2: no feedback
[Group 2] t1: RPF-all; t2: RPF-age
[Group 3] t1: RPF-age; t2: RPF-all
[Group 4] t1: PRPF-all; t2: PRPF-age
[Group 5] t1: PRPF-age; t2: PRPF-all
[Group 6] t1: RPFno-all; t2: RPFno-age
[Group 7] t1: RPFno-age; t2: RPFno-all
[Group 8] t1: PRPFno-all; t2: PRPFno-age
[Group 9] t1: PRPFno-age; t2: PRPFno-all
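The group schedule above can be written as a simple lookup table, with random assignment dealing participants round-robin into the nine groups; this is a sketch under our own assumptions (round-robin balancing and a fixed seed are illustrative choices, not the study's procedure).

```python
# Illustrative sketch: the nine-group t1/t2 schedule and random assignment.
import random

SCHEDULE = {
    1: ("no feedback", "no feedback"),
    2: ("RPF-all", "RPF-age"),
    3: ("RPF-age", "RPF-all"),
    4: ("PRPF-all", "PRPF-age"),
    5: ("PRPF-age", "PRPF-all"),
    6: ("RPFno-all", "RPFno-age"),
    7: ("RPFno-age", "RPFno-all"),
    8: ("PRPFno-all", "PRPFno-age"),
    9: ("PRPFno-age", "PRPFno-all"),
}

def assign_groups(participant_ids, seed=0):
    """Shuffle participants, then deal them round-robin into groups 1-9,
    so group sizes differ by at most one."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: 1 + i % 9 for i, pid in enumerate(ids)}
```

With 900 participants this yields exactly 100 per group; the within-subject component is then read off `SCHEDULE[group]` as the (t1, t2) treatment pair.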
We expect, for example, that absolute performance increases most for participants who receive positive feedback. By systematically shifting relative performance feedback in group 2 in a more positive direction (even for relatively low performers), we expect the highest performance increase in this group. We also expect that comparing participants with others of similar age has a performance-enhancing effect, because it creates a relevant descriptive norm that participants want to live up to. Participants know that the overall group is very heterogeneous in terms of gender, age, prior experience, time restrictions, etc., and we hypothesize that feedback relative to all participants differs in its effect from feedback relative only to those of (approximately) the same age. To disentangle whether positive feedback per se drives the results or only its interaction with knowledge of the allocation rule, we compare the performance of groups 2-5 with that of groups 6-9.
Our setting also allows for an exploratory approach, as we gathered an exceptional volume of additional behavioral data, which is rare in this field. In subgroup analyses, we expect insights into the influence of gender, age, competitiveness, social attributes, etc., while correcting for multiple testing.