Theory-Based Acquisition Strategies – an experimental analysis of AI impact on decision-making in M&As.

Last registered on October 31, 2025

View Trial History

Pre-Trial

Trial Information

General Information

Title

Theory-Based Acquisition Strategies – an experimental analysis of AI impact on decision-making in M&As.

RCT ID

AEARCTR-0016459

Initial registration date

October 31, 2025

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published

October 31, 2025, 9:28 AM EDT

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Locations

Country

Italy

Region

Milan

Primary Investigator

Name

Arnaldo Camuffo

Affiliation

ICRIOS, Bocconi University

Contact Primary Investigator

Other Primary Investigator(s)

PI Name

Alfonso Gambardella

PI Affiliation

Bocconi University

Contact Investigator

PI Name

Carlos Morales

PI Affiliation

Bocconi University

Contact Investigator

PI Name

Abhinav Pandey

PI Affiliation

Bocconi University

Contact Investigator

PI Name

Saeid Kazemi

PI Affiliation

Bocconi University

Contact Investigator

Additional Trial Information

Status

In development

Start date

2025-11-24

End date

2026-01-11

Keywords

Firms & Productivity

Additional Keywords

artificial intelligence, mergers & acquisitions, strategic decision making

JEL code(s)

O33

Secondary IDs

Prior work

This trial does not extend or rely on any prior RCTs.

Abstract

This randomized controlled trial investigates how artificial intelligence (AI) assistance influences strategic reasoning in mergers and acquisitions (M&A). The study tests whether managers trained in the Theory-Based View (TBV) of strategy produce higher-quality causal theories when aided by general-purpose or “agentic” theory-driven AI systems.

Three experimental arms are implemented with 300 experienced managers from the MedTech, Biotech, and High-Tech industries: (1) Control – TBV training plus Google Search; (2) Intervention 1 – TBV training plus ChatGPT (general-purpose large language model); and (3) Intervention 2 – TBV training plus "Aristotle", an agentic AI developed at Bocconi University that applies TBV reasoning principles. Participants complete a brief online training, solve an M&A challenge, and report their expected probability of success and confidence in their proposed strategy.

Primary outcomes are (a) theory quality, rated by blinded experts and AI evaluators (0–10), and (b) expected probability of success (0–10). Secondary measures include theory causality, confidence, AI aversion, complacency, and interaction quality. Randomization uses minimized allocation in blocks of 20 (stratified by education, experience, and AI aversion).

Key hypotheses (one-sided) test whether: H1) LLM > Control, H2) Aristotle > Control, and H3) Aristotle > LLM. With N = 300 (70 Control, 115 LLM, 115 Aristotle), the design achieves 81% power for Cohen's d = 0.4 (α = 0.05, Holm-adjusted).

The study is approved by Bocconi University’s IRB, conducted anonymously and online, with minimal risk and debriefing for all participants. Results will be made publicly available upon completion.

External Link(s)

Registration Citation

Citation

Camuffo, Arnaldo et al. 2025. "Theory-Based Acquisition Strategies – an experimental analysis of AI impact on decision-making in M&As.." AEA RCT Registry. October 31. https://doi.org/10.1257/rct.16459-1.0

Sponsors & Partners

Experimental Details

Interventions

Intervention(s)

The experiment implements a three-arm randomized controlled trial (RCT) with two parallel AI interventions and one control group. All participants first receive a short (3-minute) online training video introducing the Theory-Based View (TBV) of strategy, emphasizing causal reasoning in strategic decision-making. After viewing the training, participants complete an M&A decision challenge and develop a brief written acquisition strategy.

Arm 1 – Control (TBV + Google Search):
Participants complete the TBV training and address the M&A challenge using only their own reasoning and publicly available information via Google Search. No AI assistance is provided.

Arm 2 – TBV + ChatGPT (General-Purpose LLM):
Participants complete the TBV training and use OpenAI’s ChatGPT (O3-mini reasoning model) as a general-purpose large language model to assist them in researching, formulating, and refining their strategic theory before submitting their final decision.

Arm 3 – TBV + Aristotle (Agentic AI):
Participants complete the TBV training and use "Aristotle", a specialized agentic AI system developed at Bocconi-IMSL that applies TBV reasoning principles. The agent autonomously supports causal reasoning and theory construction, providing targeted feedback and prompting to improve strategic coherence.

All other procedures, materials, and timing are identical across conditions. Total participation time is approximately 45 minutes. Interventions are delivered online via the Qualtrics platform. Randomization is implemented automatically within the survey workflow using minimized allocation to maintain covariate balance across education, experience, and baseline AI aversion.

Intervention (Hidden)

Intervention Start Date

2025-11-24

Intervention End Date

2026-01-11

Primary Outcomes

Primary Outcomes (end points)

1. Theory Quality (0–10 scale):
The main outcome variable measuring the overall quality, soundness, and feasibility of each participant’s strategic theory or acquisition plan. Responses are coded blind to treatment condition by independent expert judges using a standardized rubric (0 = very poor; 10 = excellent). A parallel evaluation using a large language model (LLM-as-judge) provides robustness checks and inter-rater reliability comparisons.

2. Expected Probability of Success (0–10 scale):
The participant’s self-assessed likelihood that their proposed strategy would succeed in practice, where each scale point corresponds to a 10% probability increment (e.g., 1 = 10%, 10 = 100%).

Both outcomes are collected post-intervention within the same Qualtrics session. Theory quality captures the objective reasoning quality of the written strategy, while expected probability of success reflects the subjective confidence in its predicted outcome.

The primary treatment effects are estimated through pairwise contrasts between:

* (a) ChatGPT vs. Control,
* (b) Aristotle vs. Control, and
* (c) Aristotle vs. ChatGPT.

All tests are one-sided (directional hypotheses: LLM > Control, Aristotle > Control, Aristotle > LLM) with family-wise error rate controlled at α = 0.05 using the Holm adjustment.

Primary Outcomes (explanation)

"Theory Quality" measures the participant’s ability to develop a coherent, feasible, and logically structured strategic theory in response to an M&A decision challenge. This metric operationalizes the quality of reasoning rather than the correctness of the answer. Each written submission is rated independently by multiple expert judges who are blind to treatment condition. Judges evaluate causal logic, internal consistency, and theoretical soundness using a 0–10 scale. As a robustness check, the same text responses are also evaluated by a large language model (LLM-as-judge) following identical criteria to assess reliability and potential bias. This dual human-AI evaluation approach follows current best practices in experimental strategy research.

"Expected Probability of Success" captures the participant’s subjective assessment of how likely their proposed strategy would succeed if implemented in the real world. Immediately after completing the decision task, participants report this probability on a 0–10 scale. This measure reflects perceived decision confidence and complements the objective theory quality scores.

Together, these two outcomes assess the main theoretical proposition: that exposure to AI assistance—particularly to a theory-based agentic AI—enhances both the objective quality of strategic reasoning and the subjective confidence participants have in their causal models of success.

Secondary Outcomes

Secondary Outcomes (end points)

1. Theory Causality (0–10 scale):
Captures the degree to which participants identify explicit cause-and-effect mechanisms linking strategic actions to expected outcomes. Rated by blinded human judges and an LLM-as-judge using a standardized rubric. Higher scores indicate stronger causal reasoning and clearer logic chains in the proposed acquisition strategy.

2. Confidence in Theory (7-point Likert scale):
Self-reported measure of how confident participants feel about the soundness and internal consistency of their strategic solution (1 = not confident at all; 7 = extremely confident). This complements the expected probability of success and helps distinguish confidence calibration from objective performance.

3. AI Aversion (7-point Likert scale):
Participants’ self-reported discomfort, distrust, or reluctance to rely on AI systems in decision-making. Measured at baseline and post-intervention to assess changes resulting from AI exposure.

4. AI Complacency / Automation Bias (7-point Likert scale):
Measures participants’ degree of overreliance or excessive trust in AI-generated suggestions. Collected pre- and post-intervention to evaluate shifts in cognitive reliance patterns.

5. Human–AI Interaction Quality:
Behavioral data automatically logged in the AI-assisted arms, including number of AI queries, time spent interacting, and prompting hygiene (clarity, specificity, and iteration depth). These variables serve as moderators of treatment effects on theory quality and causality.

6. Qualitative Post-Experiment Feedback:
Open-ended debrief responses coded for perceived usefulness, trust, and transparency of AI assistance. These serve as exploratory endpoints informing future replication and design refinements.

Secondary Outcomes (explanation)

The secondary outcomes deepen the analysis of how AI assistance influences the mechanisms underlying strategic reasoning and decision confidence.

"Theory Causality" evaluates the extent to which participants articulate explicit causal mechanisms linking strategic choices to anticipated outcomes. This measure captures "depth of reasoning"—how well participants explain why their proposed acquisition strategy would work, not just what should be done. Expert raters and a large language model (LLM-as-judge) independently assign scores (0–10) based on predefined causal-logic criteria.

"Confidence in Theory" measures participants’ subjective confidence in the validity and coherence of their reasoning using a 7-point Likert scale. It serves as a psychological correlate to "Expected Probability of Success" and helps identify overconfidence or underconfidence relative to objective performance.

"AI Aversion" and "AI Complacency (Automation Bias)" quantify attitudinal changes toward AI. "AI aversion" reflects distrust or discomfort with using AI tools, while "AI complacency" reflects overreliance or uncritical acceptance of AI outputs. Both are measured pre- and post-intervention to detect whether AI exposure reduces aversion and/or increases automation bias, as hypothesized.

"Human–AI Interaction Quality" captures behavioral engagement metrics—number and depth of AI prompts, time spent interacting, and prompting hygiene. These process-level variables serve as moderators and potential mediators of treatment effects, identifying how interaction style shapes learning and performance outcomes.

Finally, "Qualitative Feedback" from open-ended post-experiment questions provides exploratory evidence about participants’ perceived usefulness, trust, and transparency of AI assistance, informing future replication and external validity.

Experimental Design

The study is a three-arm randomized controlled trial (RCT) designed to test how different forms of AI assistance influence strategic reasoning quality and confidence among managers making M&A decisions.

All participants first receive a standardized 3-minute training video introducing the *Theory-Based View (TBV)* of strategy, which teaches causal reasoning and theory formulation principles. Immediately after training, participants complete an online M&A case challenge requiring them to develop a brief acquisition strategy and justify their reasoning.

Participants are randomly assigned (via minimized randomization) to one of three experimental arms:

1. Control Group – TBV + Google Search: Participants complete the M&A task using only Google Search and their own reasoning.
2. "Intervention 1 – TBV + LLM:" Participants use a general-purpose large language model to assist with information gathering, idea refinement, and theory formulation.
3. "Intervention 2 – TBV + Agentic AI:" Participants use a specialized agentic AI that provides structured guidance and feedback grounded in causal reasoning principles.

The experiment is conducted fully online using the Qualtrics platform. Total participation time is approximately 45 minutes.

After completing the task, participants answer post-intervention surveys measuring subjective outcomes (confidence, expected probability of success, AI attitudes) and provide qualitative feedback. Written responses are later coded blind to condition by expert judges and by an LLM-as-evaluator for objectivity and robustness checks.

The study design allows direct comparison of (a) general AI vs. no AI, (b) agentic AI vs. no AI, and (c) agentic AI vs. general AI.

Experimental Design Details

Randomization Method

Randomization is conducted in two stages and is implemented via the Qualtrics platform, using stratified randomization on key covariates to ensure balanced groups.

Intervention assignment:
After the baseline survey, participants are allocated to interventions in the order of enrollment. The first ∼120 qualified respondents are assigned to Intervention 1 (TBV training vs. placebo), and the next ∼120 respondents are assigned to Intervention 2(AI assistance). Within the first group (Intervention 1), participants are randomized 1:1 to TBV training vs. placebo. We use stratification on important covariates – gender, field of education, years of experience, and baseline AI aversion – to achieve balance between the TBV and placebo groups and explore potentially meaningful heterogeneous treatment effects. Similarly, within the second group (Intervention 2), participants are randomized 1:1 into the two AI conditions (General AI vs. Agentic AI), again using stratified randomization on the same covariates to ensure balanced characteristics across these AI groups and enable heterogeneous treatment effects analysis. (No new participants are directly assigned to a “No AI” condition in Intervention 2, since the No AI comparison group consists of the TBV-trained participants from Intervention 1.) We will record the randomization procedure with software logs, including random seeds and assignment timestamps, to ensure transparency.

This stratified approach prevents detectable imbalances in meaningful observable characteristics and upholds group equivalence. All participants provide informed consent before randomization. We will verify ex-post that the groups are balanced on baseline covariates (e.g., demographics, experience, etc.), and if any notable imbalance arises by chance, we will control for those covariates in the analysis as a precaution.

Randomization Unit

Individual

Was the treatment clustered?

Experiment Characteristics

Sample size: planned number of clusters

300 senior managers

Sample size: planned number of observations

300 participants

Sample size (or number of clusters) by treatment arms

Our final target sample size is 300 participants in total: approximately 70 in the control group, 115 in Intervention 1 and 115 in Intervention 2 (yielding about 115 observations in each key experimental group, as described).

Minimum detectable effect size for main outcomes (accounting for sample design and clustering)

The design can detect a small to moderate effect size (Cohen’s d = 0.4) with sufficient power. Assuming a one-tailed test with family-wise α = 0.05, desired power (1 − β) = 0.81, and a repeated-measures design (each participant provides a pre- and post-score) with an expected moderate pre-post correlation (around ρ = 0.5–0.7).

Supporting Documents and Materials

IRB