Algorithms & Credit Outcomes

Last registered on December 22, 2020

Pre-Trial

Trial Information

General Information

Title
Algorithms & Credit Outcomes
RCT ID
AEARCTR-0005187
Initial registration date
December 16, 2020

Initial registration date is when the trial was registered.

It corresponds to when the registration was submitted to the Registry to be reviewed for publication.

First published
December 16, 2020, 9:56 AM EST

First published corresponds to when the trial was first made public on the Registry after being reviewed.

Last updated
December 22, 2020, 3:09 AM EST

Last updated is the most recent time when changes to the trial's registration were published.

Locations

Region

Primary Investigator

Affiliation
Princeton University

Other Primary Investigator(s)

Additional Trial Information

Status
In development
Start date
2020-12-14
End date
2021-06-30
Secondary IDs
Abstract
This study compares the relative effectiveness of standard microcredit and digital credit in minimizing borrower default and producing efficient lending outcomes. In partnership with a financial technology company based in Pakistan, I implement a randomized trial that randomizes the agent (i.e., loan officer or machine learning algorithm) as well as the information available to that agent in making a credit decision. The results from this intervention will shed light on machine learning techniques' potential, if any, to reduce borrower default relative to a human-only model of credit approval in informal credit markets.
External Link(s)

Registration Citation

Citation
Kisat, Faizaan. 2020. "Algorithms & Credit Outcomes." AEA RCT Registry. December 22. https://doi.org/10.1257/rct.5187-1.2000000000000002
Sponsors & Partners

Sponsors

There is information in this trial unavailable to the public.
Experimental Details

Interventions

Intervention(s)
This research project compares the effectiveness of standard microcredit and digital credit in producing efficient lending outcomes. In partnership with E, a leading financial technology company based in Pakistan, I will implement a randomized controlled trial (RCT) that randomizes the agent (i.e., loan officer or machine learning algorithm) as well as the information available to that agent in making a credit decision. The results from the intervention will shed light on machine learning techniques' potential, if any, to reduce borrower default relative to a human-only model of credit approval in informal credit markets.

This research project contributes to the existing literature in several meaningful ways. First, most of the research on digital credit's (or, in general, algorithms') effectiveness has centered on developed economies, where credit markets are far more complete than in Pakistan. Second, while there has been research comparing the efficiency of algorithms to humans, relatively little is known about why outcomes differ between the two. Algorithms usually have access to information that humans may not observe, and vice versa. Additionally, conditional on being exposed to the exact same information, algorithms might put a different weight on certain observable characteristics when making a lending determination.

My project's experimental design will disentangle the relative importance of these differences explicitly. Additionally, this study will directly consider the ways in which algorithms and human decision making differ for marginalized groups in Pakistan such as women and ethnic minorities. In doing so, I hope to determine the extent to which the digital credit revolution may improve financial access for the most vulnerable groups within a country.
Intervention Start Date
2021-01-04
Intervention End Date
2021-03-31

Primary Outcomes

Primary Outcomes (end points)
I will consider three main outcome variables, defined as follows:

1. Borrower selection: An indicator variable that equals one if the loan application is approved, zero otherwise.
2. Borrower default probability: Default is defined as an indicator variable that equals one if any part of the loan amount (including interest and relevant late fees) is overdue:
(i) For more than 8 days after the due date.
(ii) For more than 30 days after the due date.
(iii) For more than 60 days after the due date.
(iv) For more than 365 days after the due date.
3. Loan margin: Total accounting and economic margin on the loan. For illustrative purposes, consider a PKR 1,000 loan with PKR 50 in charged interest. The margins are calculated as follows:
(i) Accounting margin: Total margin earned or lost on the loan. If a borrower repays the loan by the due date, then accounting margin equals PKR 50. If the borrower completely defaults on the loan (where default is defined as having the full amount overdue for more than 365 days after the due date), then accounting margin is PKR -1,000.
(ii) Economic margin: Total margin earned or lost on the loan, taking into account forgone interest. If a borrower repays the loan and interest by the due date, then economic margin equals accounting margin at PKR 50. However, in case of complete default, the economic margin is the accounting margin plus forgone interest on the defaulted principal. Forgone interest is calculated as the product of three terms: the defaulted principal, the average APR earned by E on similar principal amounts, and the average repayment probability.
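
The outcome definitions above can be illustrated with a short Python sketch. It is not the study's code: the function names are hypothetical, only the two cases spelled out in the text (on-time repayment and complete default) are covered, and the APR and repayment probability in the usage note are made-up values. The sketch also assumes that forgone interest enters the economic margin as an additional loss, which is one reading of "accounting margin plus forgone interest".

```python
# Overdue thresholds (in days) taken from default definitions (i)-(iv).
DEFAULT_THRESHOLDS_DAYS = (8, 30, 60, 365)


def default_indicators(days_overdue: int) -> dict:
    """Map each threshold to 1 if the loan is overdue for more than that many days, else 0."""
    return {t: int(days_overdue > t) for t in DEFAULT_THRESHOLDS_DAYS}


def accounting_margin(principal: float, interest: float, complete_default: bool) -> float:
    """Accounting margin for the two illustrated cases only (partial default is not modeled)."""
    return -principal if complete_default else interest


def forgone_interest(defaulted_principal: float, avg_apr: float,
                     avg_repayment_prob: float) -> float:
    """Product of defaulted principal, average APR, and average repayment probability."""
    return defaulted_principal * avg_apr * avg_repayment_prob


def economic_margin(principal: float, interest: float, complete_default: bool,
                    avg_apr: float, avg_repayment_prob: float) -> float:
    """Economic margin for the two illustrated cases."""
    margin = accounting_margin(principal, interest, complete_default)
    if not complete_default:
        return margin
    # ASSUMPTION: forgone interest is a further loss, so it is subtracted here.
    return margin - forgone_interest(principal, avg_apr, avg_repayment_prob)
```

For the PKR 1,000 loan with PKR 50 interest, `accounting_margin(1000, 50, False)` gives 50 and `accounting_margin(1000, 50, True)` gives -1000. With a hypothetical APR of 0.5 and repayment probability of 0.9, forgone interest is PKR 450, so the economic margin under complete default would be PKR -1,450 under the sign assumption above.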
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
Loan applications received by E will be randomly assigned to one of four treatment groups.

In the first and second groups, the exact same applicant information will be provided to a human or fed into the algorithm, respectively. In the third and fourth treatment groups, the human and the algorithm, respectively, will also receive the additional information that each type of agent usually observes on its own.
Experimental Design Details
Loan applications have been randomly assigned to one of four treatment groups:

Treatment Group 1 - Loan officer only: Loan officer makes standard credit decision, has no access to algorithm-generated data.
Treatment Group 2 - “Limited Information” Algorithm: An algorithm trained only on questionnaire questions makes the credit decision.
Treatment Group 3 - Loan Officer Only + Identifying Information: Identical to group 1, except that loan officer now has access to applicants' name, age, gender, pictures, and location.
Treatment Group 4 - “Full Information” Algorithm: Algorithm trained on questionnaire questions and cellphone usage data makes a credit decision.

Each loan officer will review the loans assigned to Treatment Group 1 as well as those assigned to Treatment Group 3. The officers will adjudicate loans in each treatment group on separate days, which helps to avoid potential priming concerns. Select parts of the empirical analysis include loan officer fixed effects and exploit loan-level variation within each loan officer to evaluate how loan outcomes differ when identifying information such as a name and a picture is included in a random subset of loans.
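
As a rough illustration of what exploiting variation within loan officer can mean, the Python sketch below computes a within-officer (fixed effects) estimate with a single regressor: the outcome and the identifying-information indicator are both demeaned by officer before the slope is taken. The function name, the data layout, and the example values are hypothetical, and the registration does not state the exact specification that will be estimated.

```python
from collections import defaultdict


def within_officer_effect(rows):
    """Within-officer slope of the outcome on the identifying-information indicator.

    rows: iterable of (officer_id, has_identifying_info, outcome) tuples.
    Demeaning within officer removes officer-level differences such as
    one officer being more lenient than another across all loans.
    """
    by_officer = defaultdict(list)
    for officer, treated, outcome in rows:
        by_officer[officer].append((float(treated), float(outcome)))

    numerator = 0.0
    denominator = 0.0
    for observations in by_officer.values():
        x_mean = sum(x for x, _ in observations) / len(observations)
        y_mean = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            numerator += (x - x_mean) * (y - y_mean)
            denominator += (x - x_mean) ** 2

    if denominator == 0:
        raise ValueError("no within-officer variation in the indicator")
    return numerator / denominator


# Made-up approval decisions for two officers (1 = approved).
example_rows = [
    ("A", 0, 0), ("A", 0, 1), ("A", 1, 1), ("A", 1, 1),
    ("B", 0, 0), ("B", 0, 0), ("B", 1, 0), ("B", 1, 1),
]
effect = within_officer_effect(example_rows)
```

In this made-up example, each officer approves 50 percentage points more loans when identifying information is shown, so `effect` is 0.5 even though officer A approves more loans than officer B overall.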
Randomization Method
The randomization was done by the lead researcher on a computer, using a random number generator.
Randomization Unit
The unit of randomization is individual loan applications.
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
5,172 loans
Sample size: planned number of observations
5,172 observations
Sample size (or number of clusters) by treatment arms
There are 1,293 applications in each treatment arm.

The sampling design over-samples applicants from traditionally disadvantaged groups in Pakistan to determine whether algorithms' and loan officers' lending decisions differ substantially across majority and minority groups. Specifically, I over-sample: 1) women, and 2) men from the Balochistan and Khyber-Pakhtunkhwa (KP) provinces, hereafter referred to as BK men. These two groups are traditionally marginalized in Pakistan and tend to have worse socioeconomic outcomes relative to the rest of the country. The remaining under-sampled group is the "majority" group in Pakistan, consisting of men from Sindh and Punjab provinces, hereafter referred to as SP men.

The sample is evenly split across the three groups in order to increase the precision of any heterogeneity analyses, with 1,724 members from each group in the final sample. Randomization is stratified by these groups to ensure identical gender and ethnic composition across the treatments. By design, therefore, each treatment group contains 431 women, 431 BK men, and 431 SP men.
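
The stratified design described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: the registration only says that a random number generator was used, so the shuffle-and-deal approach, the seed, the function name, and the application IDs here are all assumptions.

```python
import random
from collections import Counter

ARMS = (1, 2, 3, 4)  # the four treatment groups


def stratified_assignment(applications, seed=0):
    """Assign applications to arms separately within each stratum.

    applications: list of (application_id, stratum) pairs.
    Within a stratum, IDs are shuffled and dealt to the arms in turn,
    so arm sizes within a stratum differ by at most one.
    """
    rng = random.Random(seed)  # fixed seed (hypothetical) for reproducibility
    by_stratum = {}
    for app_id, stratum in applications:
        by_stratum.setdefault(stratum, []).append(app_id)

    assignment = {}
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        rng.shuffle(ids)
        for position, app_id in enumerate(ids):
            assignment[app_id] = ARMS[position % len(ARMS)]
    return assignment


# Hypothetical IDs: 1,724 applications in each of the three strata.
applications = [
    (f"{group}-{i}", group)
    for group in ("women", "BK men", "SP men")
    for i in range(1724)
]
assignment = stratified_assignment(applications)
stratum_of = dict(applications)
cell_counts = Counter((stratum_of[a], arm) for a, arm in assignment.items())
arm_counts = Counter(assignment.values())
```

Because 1,724 is divisible by four, every stratum-by-arm cell in `cell_counts` holds 431 applications and every arm in `arm_counts` holds 1,293, matching the sample sizes stated above (5,172 applications in total).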
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
Research Integrity & Assurance, Princeton University
IRB Approval Date
2020-11-18
IRB Approval Number
13283
Analysis Plan

There is information in this trial unavailable to the public.

Post-Trial

Post Trial Information

Study Withdrawal

There is information in this trial unavailable to the public.

Intervention

Is the intervention completed?
No
Data Collection Complete
Data Publication

Data Publication

Is public data available?
No

Program Files

Program Files
Reports, Papers & Other Materials

Relevant Paper(s)

Reports & Other Materials