Abstract
We integrate a specialized AI course into the Pakistan Judicial Academy's "Technology and Law" initiative. We will design and build an AI-based judge support tool that enables judges to search, cite, and summarize Pakistan's body of precedent as well as newly submitted briefs and other documents. The system, together with the associated training, is designed to guard against plagiarism, hallucination, and false citations. The tool and training will be provided in the context of a randomized field experiment, equipping about one-third of Pakistan's trial court judges with generative AI technology and associated training and support. Our research will evaluate the effect of AI technology and training on judge performance, including AI usage, perceived AI usefulness, and quality measures constructed from written rulings. The findings may shed light on the potential of generative AI to bolster state capabilities and judicial productivity worldwide.
*Update to Trial - Rerandomization in October 2024
Judges were randomly assigned in two distinct waves, corresponding to two rounds of registration for JudgeGPT subscriptions.
In February 2024, in the first wave of registration, 979 judges from Pakistan's lower courts signed up to participate in our experiment. We randomly assigned these 979 judges to two groups: 487 judges were allocated to the treatment group (Batch 1) and given access to the JudgeGPT subscription and the GPT instruction course, while the remaining 492 judges were designated as the control group (Batch 2), scheduled to receive the same access in September 2024. This setup allows for a randomized controlled trial comparing the outcomes of Batch 1 and Batch 2. Following the initial random assignment, the introduction of the password-protected JudgeGPT, designed specifically to prevent spillovers, sparked considerable interest among judges who had not initially registered and therefore could not access GPT or the course, but were nonetheless eager to participate.
An additional 580 judges expressed interest in the course and the JudgeGPT tool. We decided against adding these new applicants to our control group (Batch 2), as they were not randomly assigned. Instead, we conducted a second randomization, which maintains the integrity of the study and increases its statistical power by accommodating a total of 1,559 judges instead of the initially registered 979. This means more than 50% of the trial court judges (courts of first instance) in Pakistan registered to participate in our experiment.
The second wave of randomization took place on October 23, 2024, for these 580 judges. They were randomly assigned to Batch 3 (n = 218), which will take the course in December 2024 and January 2025, and Batch 4 (n = 362). Batch 3 judges will receive the same treatment as Batches 1 and 2: the JudgeGPT course and a JudgeGPT subscription. Batch 4 is further randomized into two subgroups, Batch 4a and Batch 4b. Batch 4a is assigned to receive JudgeGPT training and a placebo course on Technology and Law in December 2024 and January 2025, along with a GPT subscription (with an anti-hallucination warning in GPT). Batch 4b will take the same generic Technology and Law course during the same period but will not receive a GPT subscription. The key difference is thus that Batch 4a will have access to the GPT subscription with a hallucination warning, while Batch 4b will not; both groups will attend the generic Technology and Law classes at the same time that Batch 3 is receiving the JudgeGPT course. This will allow us to assess the impact of access to GPT tools on judges' learning and decision-making. Please see Figure 1 and the pre-analysis plan document for a summary of the experimental design and further details.
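The two-wave allocation described above can be illustrated with a minimal sketch. It assumes simple complete randomization (a seeded shuffle followed by a split into fixed-size groups); the text does not say whether the actual assignment was stratified or blocked, so this is not necessarily the procedure used. The function name `assign_batches`, the integer judge IDs, and the seeds are hypothetical. Only the group sizes (487/492 and 218/362) come from the description above. The split of Batch 4 into 4a and 4b is omitted because the subgroup sizes are not stated.

```python
import random


def assign_batches(judge_ids, batch_sizes, seed):
    """Shuffle judge IDs and split them into batches of the given sizes.

    Hypothetical helper: simple complete randomization, assumed for illustration.
    """
    judge_ids = list(judge_ids)
    if sum(batch_sizes.values()) != len(judge_ids):
        raise ValueError("batch sizes must sum to the number of judges")

    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    rng.shuffle(judge_ids)

    assignment = {}
    start = 0
    for batch, size in batch_sizes.items():
        # The next `size` shuffled judges go to this batch.
        for judge in judge_ids[start:start + size]:
            assignment[judge] = batch
        start += size
    return assignment


# Hypothetical IDs: wave 1 registrants are 0..978, wave 2 registrants are 979..1558.
wave1 = assign_batches(range(979), {"Batch 1": 487, "Batch 2": 492}, seed=2024)
wave2 = assign_batches(range(979, 1559), {"Batch 3": 218, "Batch 4": 362}, seed=2025)
```

Randomizing the second wave separately, rather than appending the late registrants to Batch 2, keeps every comparison between groups that were formed by the same random draw.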