The Value of Investor Data: An Experimental Approach using a Trading Simulation Platform

Last registered on May 30, 2024

Pre-Trial

Trial Information

General Information

Title

The Value of Investor Data: An Experimental Approach using a Trading Simulation Platform

RCT ID

AEARCTR-0013691

Initial registration date

May 27, 2024

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published

May 30, 2024, 3:44 AM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Locations

Country

Region

Primary Investigator

Name

Roxana Mihet

Affiliation

University of Lausanne and Swiss Finance Institute

Other Primary Investigator(s)

PI Name

Francois Longin

PI Affiliation

ESSEC Business School, France

PI Name

Ziwei Zhao

PI Affiliation

University of Lausanne and Swiss Finance Institute

Additional Trial Information

Status

On going

Start date

2024-05-01

End date

2024-12-14

Keywords

Behavior, Finance & Microfinance, Firms & Productivity

Additional Keywords

retail investors, data, privacy, payment for order flow, fair compensation, informed investing, data sharing.

JEL code(s)

G0, G4, G5, O0, D0, D8

Secondary IDs

Prior work

This trial does not extend or rely on any prior RCTs.

Abstract

Our project investigates personal data privacy and the monetization of personal data by digital finance platforms, focusing on the value retail investors place on the privacy of their trading activity (i.e., their investor sophistication level, the history of the orders they sent to the market, and the history of the orders that were executed). We collect this data using experiments on a realistic and sophisticated trading simulation platform we developed ourselves. We measure how much individual retail investors care about having their personal trading activity shared with other market players by conducting a controlled experiment in which we observe a privately informed individual’s trading behaviour when they are made aware (or not) that their personal trading history is being shared with other market participants. We assess the implied monetary compensation for an individual’s data privacy by considering any changes in the timing of trades, the size, and the type of the orders (i.e., market or limit orders) between the different information regimes. We also look at the trading profits obtained by individual investors. In surveys that follow the completion of the experiment, we also collect standard demographic data (age, gender, education, etc.), data on attitudes towards risk and experience with stock market trading, as well as data on the subjective monetary value participants would require as compensation for complete privacy on digital platforms. The data we collect allows exploration of data valuations across demographic groups, market conditions, and availability of private information, contributing to the understanding of privacy's true value and the impact of different information treatments.

External Link(s)

Registration Citation

Citation

Longin, Francois, Roxana Mihet and Ziwei Zhao. 2024. "The Value of Investor Data: An Experimental Approach using a Trading Simulation Platform." AEA RCT Registry. May 30. https://doi.org/10.1257/rct.13691-1.0

Sponsors & Partners

Interventions

Intervention(s)

In the digital landscape, the adage "if you’re not paying for a product, then you are the product" rings true. This is particularly evident in daily-use services like Google, YouTube, Facebook, Twitter, and others, which monetize by providing personal data to third parties. However, this practice is not limited to social networks. It extends to financial firms as well. For instance, trading platforms, such as Robinhood, offer zero brokerage fees by selling customer data to other financial institutions via 'payment for order flow' (PFOF). Despite concerns about potential issues such as front-running and back-running, this practice has been legally ongoing in the United States for years.

With the rising prominence of data privacy rights and the enforcement of consumer data regulations around the world (e.g., the General Data Protection Regulation (GDPR), adopted in the European Union in 2016, the California Consumer Privacy Act in the United States in 2020, and the Personal Information Protection Law of the People's Republic of China in 2021), there is a growing call to give individuals more control over how their data is used and potentially to compensate them for it. Understanding how much individuals value their data becomes crucial to fairly compensate investors, especially as zero brokerage fees may underestimate the genuine worth of investor data. Balancing the advantages of information disclosure against privacy concerns requires quantifying both with a unified metric, so that information-sharing processes can be optimized.

The scientific objectives of this research project revolve around investigating the value individuals place on their personal trading data in financial markets, particularly in the context of information sharing among market participants.

Intervention (Hidden)

Intervention Start Date

2024-05-01

Intervention End Date

2024-12-14

Primary Outcomes

Primary Outcomes (end points)

The project aims to achieve the following key objectives:

1. Quantifying Individual Concerns for Data Privacy: The research will measure the extent to which an individual investor cares about the sharing of their personal trading activity with other players in the financial market. This will be accomplished through a controlled experiment where a participant's trading behavior will be observed under scenarios where they are informed (or not) about the disclosure of their trading history to other market participants.

2. Evaluating Monetary Compensation for Data Privacy: The project will assess the implied monetary compensation associated with an individual's data privacy. We will analyze changes in trading behavior, such as alterations in trade timing, order size, and order type, between different information disclosure settings. Thus, we will infer the financial worth an individual investor places on maintaining their data privacy.

3. Analyzing Trading Profits of Individual Investors: The research will examine the trading profits obtained by individual investors within the context of different information-sharing regimes. Understanding how trading behaviors and profitability correlate with information disclosure can provide valuable insights into the potential impacts and implications of data sharing on investors' financial gains.

Primary Outcomes (explanation)

By undertaking these objectives, our project will provide empirical evidence and insights into the perceived value of data privacy among investors in financial markets. This understanding is crucial, especially amid ongoing discussions about fair compensation for user data and balancing the benefits of information disclosure, such as zero-brokerage fees, with the associated privacy risks. Our findings will contribute to optimizing information disclosure processes and provide a basis for estimating the utility gains associated with transactions involving personal data in financial settings.

Secondary Outcomes

Secondary Outcomes (end points)

The COVID-19 pandemic has witnessed a surge in female engagement with the stock market, attributed to job and savings insecurities during the pandemic, and increased usage of no-fee trading platforms facilitated by the time spent at home. For example, Robinhood saw a 369% increase in the number of women using its services during the pandemic and women now make up 30% of its customer base.

Despite this, a substantial body of academic literature underscores the heightened vulnerability of women within the stock market due to a historical gender disparity in financial literacy and participation. Women typically exhibit lower financial education levels and reduced involvement in investment activities (Guiso and Zaccaria, 2021), often remaining underbanked (Demirgüç-Kunt et al., 2017), and exhibiting lower stock market participation levels (Ke, 2021). Furthermore, women frequently express higher privacy concerns while exhibiting limited adoption of protective behaviors in comparison to men (Armantier et al., 2021; Hoy and Milne, 2010; Sheehan and Hoy, 2000).

The prevailing digital landscape, characterized by the monetization of personal data by numerous online platforms, is notably relevant in the financial sector, whereby trading platforms such as Robinhood leverage customer data monetization as a means to offer commission-free services to individual investors. However, this data sharing practice raises concerns regarding investor vulnerability to welfare losses (Acquisti et al., 2015), especially concerning retail investors, including women, who may inadvertently consent to data sharing without a comprehensive understanding of its potential adverse impact on their trading outcomes.

Consequently, this study emphasizes the necessity to investigate whether women modify their risk attitudes and trading approaches in response to the awareness of their trading data being shared. The aim is to provide insights that guide policy-makers in tailoring adaptive measures that protect privacy, considering gender dynamics.

Secondary Outcomes (explanation)

In our research, we extend beyond conventional survey-based approaches evaluating women's privacy preferences within the stock market, a prevailing practice in the existing literature. Traditional methods for assessing privacy value, like self-reported questionnaires, often lack direct relevance to real-world behavioral patterns. Our approach involves an experimental design rooted in revealed preference theory, seeking to comprehend shifts in an investor's trading behavior upon understanding the potential sharing of their past trading history with other market participants. Crucially, we meticulously elucidate the mechanisms through which trading data may be shared and its implications for investors. By doing so, we aim to offer an unbiased assessment of the value attributed to privacy for investors.

Moreover, we utilize this experiment to infer the monetary value of investor data by analyzing realized profits before and after the information treatment across diverse scenarios (totalling 60).
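One simple way to read "realized profits before and after the information treatment" is as a difference-in-differences in mean profit between a treated and a control group. The sketch below illustrates that reading only. The registration does not specify the estimator, and the records, arm labels, and function names are hypothetical, with made-up numbers.

```python
from statistics import mean

# Toy records with made-up numbers: (participant, arm, profit_before, profit_after).
# Neither the values nor the arm labels come from the registration.
records = [
    ("p1", "control", 100.0, 104.0),
    ("p2", "control", 90.0, 92.0),
    ("p3", "treated", 100.0, 95.0),
    ("p4", "treated", 110.0, 103.0),
]


def mean_profit_change(rows, arm):
    """Average within-participant change in profit (after minus before) for one arm."""
    return mean(after - before for _, a, before, after in rows if a == arm)


def diff_in_diff(rows, treated="treated", control="control"):
    """Change in the treated arm net of the change in the control arm."""
    return mean_profit_change(rows, treated) - mean_profit_change(rows, control)


effect = diff_in_diff(records)
print(f"Implied profit effect of the information treatment: {effect:+.1f}")
```

A negative value in this toy setup would mean that treated participants lost profit relative to controls after the disclosure, which is one candidate proxy for the monetary value of the shared data.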

Solely considering zero-brokerage fees or the price improvement resulting from Payment for Order Flow (PFOF) provides an inadequate estimation of the genuine worth of investor data. Through this project, we directly quantify the value of investors' historical trading data by observing their actual trading behavior pre- and post-information treatment, offering a comprehensive understanding of the value of privacy, in particular but not only women’s privacy, in the context of investor data.

Experimental Design

The first step involves developing a trading algorithm on a fully functioning trading simulation platform developed at ESSEC Business School, that provides direct access (“online”) trade execution and clearing services.

The algorithm will take into account three scenarios: (1) one in which retail investors are aware their trading data is shared with other market participants, (2) one in which they are aware the data is not shared, and (3) one in which nothing is mentioned regarding data sharing. Each scenario is run across three settings, in which participants receive (a) good private information, (b) bad private information, or (c) no private information.

Thus, the experiment itself will have multiple versions: the control group will receive good or bad private information about the asset traded but will not be told anything about the functioning of their account. The treated group will receive good or bad private information and then be told that their account trading history will start being shared with other participants.
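To make the design concrete, the sketch below enumerates the combinations implied by crossing the three data-sharing scenarios with the three private-information settings. It assumes a full cross of the two dimensions, which the registration does not state explicitly, and the labels are illustrative names rather than identifiers used on the platform.

```python
from itertools import product

# Illustrative labels; the registration describes these regimes only in prose.
DISCLOSURE_REGIMES = ("told_shared", "told_not_shared", "not_mentioned")
PRIVATE_INFO = ("good", "bad", "none")


def design_cells():
    """Return every (disclosure regime, private information) combination."""
    return list(product(DISCLOSURE_REGIMES, PRIVATE_INFO))


cells = design_cells()
for regime, info in cells:
    print(f"{regime:16s} x {info}")
print(len(cells), "cells in a full cross")
```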

Demographic data collected will include the participants’ profile information (age, gender, experience with the stock market, etc.), and the experimental data will consist of participants’ trading activity during the trading simulations. More precisely, the trading activity consists of the orders (limit and market orders) sent by the participants during the simulation and the evolution of their position (cash, assets, and profits) during the simulation.

The goal is to record how individuals react once they are made aware that their account trading data will be shared with others. This is virtually impossible to do in a non-experimental setting.

Experimental Design Details

As previously explained, our project aims to achieve the following three key objectives:
1. Quantifying Individual Concerns for Data Privacy
2. Evaluating Monetary Compensation for Data Privacy
3. Analyzing Trading Profits of Individual Investors
We explain below how we study the value of individual investors’ trading data, and measure the welfare effect of data value, which has never been addressed in the literature before.
We explore the value of investors’ data by conducting a lab experiment. Our experiment is based on a trading simulation platform that mimics the environment of practitioners in financial markets, which enables us to contextualize participants’ trading decisions. Critically, the experiment is designed to study the change in the functioning of the account of the participant regarding the sharing of his/her trading activity. Furthermore, this approach allows us to precisely map a participant to his or her trading activity, which is key to empirically testing the change in trading behavior and estimating the value of investor’s data.
The main benefit of using a lab experiment is the ability to observe trading decisions at the individual level. Individual-level data allow us to precisely map information about the participants and their trading activity to the change in the functioning of the account (data sharing or not). Doing so is particularly relevant for testing our hypotheses and controlling for various factors.
The choice of a lab experiment also overcomes two main challenges of empirical studies: the lack of real data and, more critically, the difficulty of making causal inferences. Regarding the lack of data, first of all, we could not identify a natural experiment for this research question: we know of no law that exogenously prevented trading platforms from sharing their customers’ past trading history or required them to compensate customers for their data. Second, surveys of investors’ preferences for privacy (see Acquisti et al. 2021) are notoriously controversial because of the “privacy paradox”: while users claim to be very concerned about their privacy, they nevertheless do very little to protect their personal data. In terms of causal inference, archival data would make it difficult to study the sharing of information because of the many factors at play in financial markets. In contrast, with an experimental approach, we can expose participants to a change in data sharing that is the only manipulated variable. In addition, empirical analysis could not separate the relative roles of investors’ biases from those of their endowments and available information in accounting for their trading reaction to the change in sharing financial data. To address this issue, our lab experiment relies on a trading simulation in which each participant is endowed with the same initial portfolio (composed of stocks and cash) and faces the same news flow regarding the company. This setting allows us to hold fixed potential confounding factors other than those related to individual preferences.
What is more, it is extremely difficult in the real world to cleanly identify whether investors are aware that their data are being “sold” to others. Trading platforms do not explicitly explain the practice of PFOF or how investor orders are executed. As a matter of fact, most platforms provide very vague information regarding how they use investor data. For example, they mostly do not disclose how they use investors’ current order data or historical trading records. Investors remain largely uninformed about who their trading counterparties are and how their data can be used. All these factors are very important for investor expectations. Investors might consider the possibility of being front-run or back-run if they understand the mechanism. In our experiment, however, with a well-controlled environment, we can provide a detailed explanation to investors of how their data might be used by others, and then examine how their behavior changes when their data are being “shared”.
Finally, we can look inside the “black box” by observing exactly how investors change their trading behavior when they understand that their data are being shared, e.g., the trading frequency, number of orders, and incentives to hide information. All these findings will contribute substantially to understanding the welfare loss investors incur due to data sharing. Moreover, we know the investors’ profiles (risk aversion, confidence, etc.), so we can control for them more cleanly than in a natural setting.

Randomization Method

Randomization by a computer.

Randomization Unit

Individual subject.
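As an illustration of computer randomization at the individual level, the sketch below shuffles subject identifiers with a seeded generator and deals them evenly into three arms, matching the planned 450 subjects and 150 per arm. The arm labels, the seed, and the round-robin dealing are assumptions made for the example. The registration does not describe the actual procedure, nor whether assignment is stratified by school.

```python
import random
from collections import Counter

# Hypothetical arm labels for the three planned groups.
ARMS = ("control", "told_shared", "told_not_shared")


def assign_arms(subject_ids, arms=ARMS, seed=0):
    """Shuffle subjects, then deal them round-robin so arm sizes differ by at most one."""
    rng = random.Random(seed)  # seeded so the assignment can be reproduced
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: arms[i % len(arms)] for i, sid in enumerate(shuffled)}


assignment = assign_arms(range(1, 451), seed=2024)
counts = Counter(assignment.values())
print(counts)
```

Dealing after a shuffle guarantees equal arm sizes, which independent per-subject draws would not.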

Was the treatment clustered?

Yes

Experiment Characteristics

Sample size: planned number of clusters

2 schools

Sample size: planned number of observations

450 individual subjects

Sample size (or number of clusters) by treatment arms

2 schools; at least 450 individual subjects: 150 in the control group, 150 privately informed that their trading history is shared, and 150 privately informed that their trading history is not shared.

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

Significance levels of 10%, 5%, and 1%, that is, two-sided critical t-statistics of 1.645, 1.96, and 2.575.
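The figures above are the two-sided critical values of the standard normal distribution at the 10%, 5%, and 1% levels (the last is approximately 2.576, given in the entry as 2.575). The sketch below reproduces them and shows how such a critical value would feed into a textbook minimum detectable effect for a comparison of two equal-sized arms. The 80% power, the independence of observations, and the omission of any clustering adjustment are assumptions of this example, not figures from the registration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_critical_value(alpha):
    """Critical value z such that P(|Z| > z) = alpha for a standard normal Z."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def mde_in_sd_units(n_per_arm, alpha=0.05, power=0.80):
    """Minimum detectable difference between two equal-sized arms, in outcome SDs.

    Assumes independent observations and ignores clustering by school; the
    80% power default is an assumption, not a value from the registration.
    """
    z_alpha = two_sided_critical_value(alpha)
    z_power = STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_arm)


critical_values = {a: two_sided_critical_value(a) for a in (0.10, 0.05, 0.01)}
for alpha, z in critical_values.items():
    print(f"alpha={alpha:.2f}  critical value={z:.3f}")
print(f"MDE with 150 per arm: {mde_in_sd_units(150):.2f} SD")
```

Under these assumptions, 150 subjects per arm would detect a difference of roughly 0.32 standard deviations at the 5% level.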

Supporting Documents and Materials

IRB