Primary Outcomes (explanation)
(1) Performance on the main task: we will measure performance on a 0-10 scale, evaluated along two dimensions: (a) content, and (b) writing. The main outcome will be an overall score, calculated as a weighted average (2/3 content, 1/3 writing). We will also analyze each dimension separately. All scores will be standardized relative to the control group.
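As an illustration of how the overall score and its standardization could be computed, here is a minimal sketch. The function names are hypothetical. It also assumes that "standardized relative to the control group" means subtracting the control-group mean and dividing by the control-group sample standard deviation; the plan itself does not spell out the formula.

```python
from statistics import mean, stdev

def overall_score(content, writing):
    """Weighted average of the two 0-10 dimension scores (2/3 content, 1/3 writing)."""
    return (2 * content + writing) / 3

def standardize(values, is_control):
    """Rescale all values using the control group's mean and standard deviation.

    Assumption: sample standard deviation (n - 1 denominator) of the control group.
    """
    control = [v for v, c in zip(values, is_control) if c]
    mu, sigma = mean(control), stdev(control)
    return [(v - mu) / sigma for v in values]
```

For example, a respondent with 9 points in content and 6 in writing would have an overall score of 8 before standardization. After standardization, the control group has mean 0 and standard deviation 1 by construction.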
We will use an LLM for grading. To ensure reliability, we will manually grade a random subset of responses (10%) and compare these scores with those produced by the LLM. The grading scheme for both dimensions is as follows:
*Content (0-10 points):
-Diagnosis (6 points): Up to 2 points per question for each of the 3 questions posed by the business manager/owner.
-Solution (4 points): 1 point for addressing the root cause of the problem, up to 2 points for making a concrete and specific proposal, and 1 point for realism. If the solution does not address the root cause, the respondent scores 0 for the solution subsection.
*Writing (0-10 points):
-Spelling and grammar (2 points)
-Clarity and legibility (3 points)
-Organization (4 points)
-Tone and register (1 point)
Responses that are fewer than 200 characters long and receive 0 points in content will also receive 0 points in writing.
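The grading scheme above could be applied to item-level grades roughly as follows. This is a sketch only: the function and argument names are hypothetical, and it assumes the grader (LLM or human) has already assigned the item-level points.

```python
def content_score(diagnosis_points, addresses_root_cause, proposal_points, realistic):
    """Content score on a 0-10 scale.

    diagnosis_points: three values, each 0-2 (one per question posed).
    proposal_points: 0-2 for how concrete and specific the proposal is.
    """
    diagnosis = sum(diagnosis_points)  # up to 6 points
    if addresses_root_cause:
        # 1 point for the root cause, up to 2 for the proposal, 1 for realism
        solution = 1 + proposal_points + (1 if realistic else 0)
    else:
        # The whole solution subsection scores 0 if the root cause is not addressed
        solution = 0
    return diagnosis + solution

def writing_score(spelling, clarity, organization, tone, response_text, content):
    """Writing score on a 0-10 scale, with the short-response rule applied."""
    # Responses under 200 characters with 0 content points get 0 in writing
    if len(response_text) < 200 and content == 0:
        return 0
    return spelling + clarity + organization + tone
```

Note that a response with a detailed but off-target solution earns only its diagnosis points, since the proposal and realism points are conditional on addressing the root cause.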
An additional outcome variable measuring performance on the main task is a binary indicator equal to 1 if the respondent correctly identifies the root cause of the problem, defined as obtaining at least 1 point on the third diagnostic question posed by the business manager.
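A minimal sketch of this indicator, with a hypothetical function name, taking the points awarded on the third diagnostic question as input:

```python
def identified_root_cause(third_question_points):
    """Return 1 if the third diagnostic question earned at least 1 of its 2 points, else 0."""
    return 1 if third_question_points >= 1 else 0
```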
(2) Time to complete the main task: We will record the time taken to complete the task (in minutes), with values top-coded at 20 minutes.
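Top-coding can be illustrated with the short sketch below. The function name is hypothetical, and it assumes the raw completion time is logged in seconds, which the plan does not state.

```python
def completion_minutes(seconds_elapsed, cap=20.0):
    """Convert a raw completion time to minutes and top-code it at `cap` minutes.

    Assumption: the survey platform records elapsed time in seconds.
    """
    return min(seconds_elapsed / 60, cap)
```

A respondent who takes 12.5 minutes keeps that value, while one who takes 30 minutes is recorded as 20.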
(3) Performance on the follow-up questions: We will combine two components: (i) the score on the open-ended follow-up question, graded on a 0-2 scale, and (ii) binary indicators for whether the respondent correctly answered each of the two multiple-choice questions. The main outcome will be an overall score, calculated as a weighted average (1/2 for the open-ended question and 1/4 for each multiple-choice question), standardized relative to the control group. We will also analyze each component separately.
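The follow-up composite could be computed along these lines before standardization. The function name is hypothetical. The sketch also assumes the 0-2 open-ended score is first rescaled to 0-1 so that all three components share a common range; the plan does not say whether such a rescaling is applied.

```python
def follow_up_score(open_ended, mc1_correct, mc2_correct):
    """Weighted average of the follow-up components, on a 0-1 scale.

    open_ended: score on the open-ended question (0-2).
    mc1_correct, mc2_correct: whether each multiple-choice answer is correct.
    """
    open_share = open_ended / 2  # assumption: rescale 0-2 to 0-1
    return 0.5 * open_share + 0.25 * int(mc1_correct) + 0.25 * int(mc2_correct)
```

Under this assumption, a respondent with 1 point on the open-ended question and one correct multiple-choice answer scores 0.5.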