Experimental Design Details
Portfolio allocation task:
The task consists of assigning portfolio weights to a pre-defined universe of 13 securities: 1 US bond fund (Vanguard Total Bond Market ETF, BND) and a heterogeneous set of 12 US stocks (Berkshire Hathaway, PepsiCo, Lockheed Martin, Kimberly-Clark, Cincinnati Financial, Eastman Chemical, Air Lease, Alkermes, St Joe, Evertec, S&T Bancorp, Sturm Ruger & Company). The stocks were chosen by stratified sampling from the S&P 500 and S&P 600 indices to ensure that they vary with respect to size and book-to-market ratio (two of our most important financial metrics). The portfolio recommendations are supposed to match the respective investor profile. Investor profiles differ with respect to risk tolerance (high/low), sustainability preference (yes/no), and investment horizon (1 month/6 months/12 months), which yields 2 * 2 * 3 = 12 profiles.
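The profile grid described above can be enumerated as in the following minimal sketch. The variable and field names are illustrative assumptions and are not taken from the study materials; only the attribute levels and the security names come from the text.

```python
from itertools import product

# The 13 securities named in the task description (1 bond fund, 12 stocks).
SECURITIES = [
    "Vanguard Total Bond Market ETF (BND)",
    "Berkshire Hathaway", "PepsiCo", "Lockheed Martin", "Kimberly-Clark",
    "Cincinnati Financial", "Eastman Chemical", "Air Lease", "Alkermes",
    "St Joe", "Evertec", "S&T Bancorp", "Sturm Ruger & Company",
]

# Attribute levels from the text; the field names below are hypothetical.
RISK_TOLERANCE = ("high", "low")
SUSTAINABILITY_PREFERENCE = ("yes", "no")
HORIZON_MONTHS = (1, 6, 12)

# Full factorial crossing of the three attributes.
profiles = [
    {
        "risk_tolerance": risk,
        "sustainability_preference": sustainability,
        "horizon_months": horizon,
    }
    for risk, sustainability, horizon in product(
        RISK_TOLERANCE, SUSTAINABILITY_PREFERENCE, HORIZON_MONTHS
    )
]

print(len(SECURITIES), "securities,", len(profiles), "investor profiles")
```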
LLM data collection:
For each of the 7 LLMs, we elicit 60 (12 investor profiles * 5 experimental conditions) portfolio recommendations (i.e., vectors of portfolio weights for the 13 securities). In the 4 experimental conditions other than the baseline, we inject the additional information using JSON files in which the order of the securities is randomized. We formulate standardized prompts that take into account recent research on the performance impact of various prompting techniques (e.g., role prompting, chain-of-thought prompting). After each request, all previous correspondence with the model is deleted to avoid learning effects.
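The request grid for a single LLM could be assembled roughly as follows. This is a sketch under stated assumptions: the condition labels (`treatment_1` to `treatment_4`), the helper names, the seed, and the dummy per-security content are all hypothetical, since the actual conditions and JSON contents are not specified here. Each request is built as a standalone object, mirroring the rule that no earlier correspondence is carried over.

```python
import json
import random
from itertools import product

SECURITIES = [
    "Vanguard Total Bond Market ETF (BND)",
    "Berkshire Hathaway", "PepsiCo", "Lockheed Martin", "Kimberly-Clark",
    "Cincinnati Financial", "Eastman Chemical", "Air Lease", "Alkermes",
    "St Joe", "Evertec", "S&T Bancorp", "Sturm Ruger & Company",
]

# 12 profiles: risk tolerance x sustainability preference x horizon (months).
PROFILES = list(product(("high", "low"), ("yes", "no"), (1, 6, 12)))

# Placeholder labels: the text only says "baseline" plus 4 other conditions.
CONDITIONS = ["baseline", "treatment_1", "treatment_2", "treatment_3", "treatment_4"]


def shuffled_payload(info_by_security, rng):
    """Serialize per-security information to JSON with the securities in random order."""
    names = list(info_by_security)
    rng.shuffle(names)
    # json.dumps keeps dict insertion order, so the shuffle carries over to the file.
    return json.dumps({name: info_by_security[name] for name in names}, indent=2)


def build_requests(info_by_condition, seed=0):
    """Return one standalone request per (profile, condition) pair."""
    rng = random.Random(seed)  # assumed seed, for reproducible shuffles
    requests = []
    for profile, condition in product(PROFILES, CONDITIONS):
        payload = None  # the baseline receives no additional information
        if condition != "baseline":
            payload = shuffled_payload(info_by_condition[condition], rng)
        requests.append({"profile": profile, "condition": condition, "payload": payload})
    return requests


# Dummy content standing in for the real (unspecified) additional information.
dummy_info = {
    condition: {name: {"placeholder": None} for name in SECURITIES}
    for condition in CONDITIONS[1:]
}
requests = build_requests(dummy_info)
print(len(requests), "requests per LLM")
```

Shuffling separately for every request, rather than once per condition, is a design choice of this sketch; the text does not say at which level the order is randomized.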
Human financial advisor data collection:
As a human benchmark, we will elicit portfolio recommendations for all 12 profiles from 100 human financial advisors (targeted number, subject to response rate) in an incentivized online survey. Respondents are recruited via Prolific, and participation is restricted to financial advisors residing in the US. Respondents are paid a fixed remuneration of 3 GBP (= 3.90 USD) for completing the survey, which takes 15-20 minutes. In addition, respondents are paid a variable remuneration of up to 3 GBP (= 3.90 USD) based on the performance of their portfolio recommendations. In particular, one of the 12 investor profiles will be randomly selected, and all survey respondents will be ranked according to the risk-adjusted performance of their portfolios for this profile. Those with higher risk-adjusted performance will receive higher variable payments. We also include an attention check ("Which risk tolerance did the previous investor profile display?") and questions about the respondents' experience in the financial advisory industry and their financial knowledge. We will use the attention check (as well as response time) as a potential indicator of inattentiveness and restrict our analyses to attentive respondents as a robustness check. We might also explore heterogeneity in the quality of recommendations based on experience or financial knowledge.
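The rank-based variable payment could look roughly like the sketch below. Two elements are assumptions made purely for illustration and are not specified in the design: risk-adjusted performance is computed as a Sharpe-style ratio (mean return divided by the standard deviation of returns), and the payment rises linearly with rank from 0 to 3 GBP. All function names and the seed are hypothetical.

```python
import random
from statistics import mean, stdev

MAX_BONUS_GBP = 3.0  # upper bound of the variable remuneration stated in the text


def risk_adjusted_performance(returns):
    """Assumed measure: mean return over return volatility (needs >= 2 observations)."""
    volatility = stdev(returns)
    return mean(returns) / volatility if volatility > 0 else 0.0


def variable_payments(scores):
    """Map respondent scores to payments that increase with rank.

    Assumed rule: linear in rank, worst gets 0, best gets MAX_BONUS_GBP.
    Ties are broken by input order, which a real scheme would need to handle.
    """
    ranked = sorted(scores, key=scores.get)  # ascending: worst performer first
    n = len(ranked)
    if n == 1:
        return {ranked[0]: MAX_BONUS_GBP}
    return {
        respondent: round(MAX_BONUS_GBP * position / (n - 1), 2)
        for position, respondent in enumerate(ranked)
    }


# One of the 12 investor profiles is drawn at random for the payout.
rng = random.Random(42)  # assumed seed, for reproducibility of the sketch only
selected_profile_index = rng.randrange(12)

# Toy scores for three hypothetical respondents on the selected profile.
example = variable_payments({"a": 0.1, "b": 0.5, "c": 0.3})
print(selected_profile_index, example)
```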