Experimental Design Details
As previously explained, our project aims to achieve the following three key objectives:
1. Quantifying Individual Concerns for Data Privacy
2. Evaluating Monetary Compensation for Data Privacy
3. Analyzing Trading Profits of Individual Investors
We explain below how we study the value of individual investors’ trading data and measure the welfare effect of data value, a question that, to our knowledge, has not been addressed in the literature.
We explore the value of investors’ data by conducting a lab experiment. Our experiment is based on a trading simulation platform that mimics the environment of practitioners in financial markets, which enables us to contextualize participants’ trading decisions. Critically, the experiment is designed to vary how a participant’s account functions, namely whether or not his or her trading activity is shared. Furthermore, this approach allows us to precisely map each participant to his or her trading activity, which is key to empirically testing the change in trading behavior and estimating the value of investors’ data.
The main benefit of using a lab experiment is the ability to observe trading decisions at the individual level. Individual-level data allow us to precisely link information about the participants and their trading activity to the account condition they face (data sharing or not). This link is particularly relevant for testing our hypotheses and controlling for various factors.
The choice of a lab experiment also overcomes two main challenges of empirical studies: the lack of real data and, more critically, the difficulty of making causal inferences. Regarding the lack of data, first, we could not identify a natural experiment for this research question: we know of no law under which trading platforms were exogenously prevented from sharing their customers’ past trading history or had to compensate their customers for their data. Second, surveys of investors’ preferences for privacy (see Acquisti et al. 2021) are notoriously controversial because of the “privacy paradox”: while users claim to be very concerned about their privacy, they nevertheless do very little to protect their personal data.
In terms of causal inference, archival data would make it difficult to study the sharing of information because many factors move simultaneously in financial markets. In contrast, with an experimental approach, we can expose participants to a change in data sharing, which is the only manipulated variable. In addition, an empirical analysis of archival data could not separate the role of investors’ biases from the roles of their endowments and available information in explaining how their trading reacts to a change in the sharing of financial data. To address this issue, our lab experiment relies on a trading simulation in which each participant is endowed with the same initial portfolio (composed of stocks and cash) and faces the same news flow regarding the company. This setting holds potential confounding factors constant, so that differences in trading behavior can be attributed to individual preferences.
What is more, it is extremely difficult in the real world to cleanly identify whether investors are aware that their data are being “sold” to others. Trading platforms do not explicitly explain the practice of payment for order flow (PFOF) or how investor orders are executed. In fact, most platforms provide only vague information about how they use investor data.
For example, they rarely disclose how they use investors’ current order data or historical trading records. Investors remain largely uninformed about who their trading counterparties are and how their data can be used. All these factors matter for investor expectations: investors who understand the mechanism might anticipate being front-run or back-run. In our experiment, by contrast, the well-controlled environment lets us give investors a detailed explanation of how their data might be used by others, and then examine how their behavior changes when their data are being “shared”.
Finally, we can look inside the “black box” by observing exactly how investors change their trading behavior when they understand that their data are being shared, through measures such as trading frequency, number of orders, and incentives to hide information. These findings will contribute substantially to understanding the welfare loss that investors incur from data sharing. Moreover, we observe investors’ profiles (risk aversion, confidence, etc.) and can control for them more cleanly than in a natural setting.
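As an illustration only, the comparison of one behavioral measure across conditions could be sketched as follows. The order log, the participant identifiers, the condition labels ("private" and "shared"), and the between-participant assignment are all hypothetical assumptions made for this sketch; they are not the experiment's actual data or protocol. The sketch counts orders per participant and takes the difference in mean order counts between the two conditions.

```python
from collections import Counter
from statistics import mean

# Hypothetical order log: one (participant_id, condition) tuple per submitted order.
# Identifiers, labels, and values are invented for illustration.
orders = [
    ("p1", "private"), ("p1", "private"), ("p1", "private"),
    ("p2", "private"), ("p2", "private"), ("p2", "private"),
    ("p2", "private"), ("p2", "private"),
    ("p3", "shared"), ("p3", "shared"),
    ("p4", "shared"), ("p4", "shared"), ("p4", "shared"), ("p4", "shared"),
]


def mean_orders_by_condition(log):
    """Return the mean number of orders per participant in each condition.

    Participants who submit no orders do not appear in the log and are
    therefore not counted here; a real analysis would need the full roster.
    """
    counts = Counter(log)  # keys are (participant_id, condition) pairs
    by_condition = {}
    for (_participant, condition), n_orders in counts.items():
        by_condition.setdefault(condition, []).append(n_orders)
    return {cond: mean(ns) for cond, ns in by_condition.items()}


means = mean_orders_by_condition(orders)
# Difference in mean order counts: shared minus private.
effect = means["shared"] - means["private"]
print(means, effect)
```

Because every participant starts from the same portfolio and sees the same news flow, a simple difference in means of this kind is a natural first summary; the other measures named above (trading frequency, information hiding) could be handled analogously once they are defined operationally.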