Revenue Gains from Adopting Machine Learning: Causal Evidence from Tax Audit Selection

Last registered on January 27, 2023

Pre-Trial

Trial Information

General Information

Title
Revenue Gains from Adopting Machine Learning: Causal Evidence from Tax Audit Selection
RCT ID
AEARCTR-0010813
Initial registration date
January 25, 2023


First published
January 27, 2023, 2:29 AM EST


Locations


Primary Investigator

Affiliation
University of Copenhagen

Other Primary Investigator(s)

PI Affiliation
University of Copenhagen

Additional Trial Information

Status
In development
Start date
2023-01-26
End date
2023-11-01
Secondary IDs
Prior work
This trial does not extend or rely on any prior RCTs.
Abstract
The aim of this study is to explore the potential of machine learning (ML) to make tax auditing more efficient and to increase revenue through targeted audits. Together with the Danish Tax Agency, we assess the use of ML for identifying non-compliant dividend withholding tax refund claims by training a model on past audits and comparing claims with different risk scores. Furthermore, we examine the use of ML for prioritizing claims across types of audit (e.g. full-scope and limited-scope) by randomly assigning claims with different risk scores to different types of audit. The goal of this study is to show how machine learning can improve the audit selection process for dividend withholding tax claims.
External Link(s)

Registration Citation

Citation
Bjerre-Nielsen, Andreas and Tobias Gabel Christiansen. 2023. "Revenue Gains from Adopting Machine Learning: Causal Evidence from Tax Audit Selection." AEA RCT Registry. January 27. https://doi.org/10.1257/rct.10813-1.0
Sponsors & Partners

Experimental Details

Interventions

Intervention(s)
The intervention involves selecting dividend withholding tax refund claims for different types of audit (e.g. limited-scope vs. full-scope), and it is performed by the Danish Tax Agency. The Danish Tax Agency may request different or further documentation from the shareholder/applicant about the dividend distributions if it deems this necessary.
Intervention Start Date
2023-01-26
Intervention End Date
2023-11-01

Primary Outcomes

Primary Outcomes (end points)
We use the following outcomes:
i) A binary variable indicating whether a claim is non-compliant
ii) The audit adjustment measured in DKK
iii) The cost of the audit measured in DKK
iv) The net revenue in DKK (i.e. audit adjustment minus audit cost)
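
The four outcomes listed above can be sketched in code as follows. This is only an illustrative sketch: the class name AuditResult, its field names, the summarize helper, and the DKK amounts in the usage lines are hypothetical and are not taken from the Danish Tax Agency's data or from the study itself.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AuditResult:
    """Outcome of one audited claim (hypothetical field names)."""
    non_compliant: bool    # outcome i): was the claim found non-compliant?
    adjustment_dkk: float  # outcome ii): audit adjustment in DKK
    cost_dkk: float        # outcome iii): cost of the audit in DKK

    @property
    def net_revenue_dkk(self) -> float:
        # outcome iv): audit adjustment minus audit cost
        return self.adjustment_dkk - self.cost_dkk


def summarize(results: List[AuditResult]) -> Dict[str, float]:
    """Aggregate the four outcomes over a non-empty group of audited claims."""
    n = len(results)
    return {
        "non_compliance_rate": sum(r.non_compliant for r in results) / n,
        "total_adjustment_dkk": sum(r.adjustment_dkk for r in results),
        "total_cost_dkk": sum(r.cost_dkk for r in results),
        "total_net_revenue_dkk": sum(r.net_revenue_dkk for r in results),
    }


# Usage with made-up amounts, for illustration only.
example = [
    AuditResult(non_compliant=True, adjustment_dkk=10000.0, cost_dkk=2500.0),
    AuditResult(non_compliant=False, adjustment_dkk=0.0, cost_dkk=1500.0),
]
summary = summarize(example)
```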
Primary Outcomes (explanation)

Secondary Outcomes

Secondary Outcomes (end points)
Secondary Outcomes (explanation)

Experimental Design

Experimental Design
We have designed a randomized controlled trial (RCT) in collaboration with the Danish Tax Agency, which is the main agency within the Danish tax authorities. The project focuses exclusively on auditing applications for refunds of dividend withholding tax. Because refund applications can be used to commit tax fraud, the main purpose of the audits is to combat fraud by preventing and deterring it. Currently, the tax agency conducts several types of audit that vary in depth and scope.

Our RCT design consists of selecting claims for different types of audit based on the predicted probability of non-compliance. To select claims based on the probability of non-compliance, we constructed machine learning models using historical data on refund applications, including information about the applications and the audit outcomes (i.e. compliant or non-compliant). We use one of these models to predict the probability of non-compliance for unprocessed claims.
Experimental Design Details
Not available
Randomization Method
Randomization done in office by a computer.
Randomization Unit
Shareholder claim (up to 20 different dividend distributions on one claim form that relate to the same shareholder).
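
A minimal sketch of computer-based randomization at the level of the shareholder claim is shown below. It is an assumption-laden illustration, not the procedure used by the Danish Tax Agency: the function name assign_audit_types, the arm labels, the equal split between arms, the fixed seed, and the claim identifiers are all hypothetical.

```python
import random
from typing import Dict, Iterable, Sequence


def assign_audit_types(
    claim_ids: Iterable[str],
    arms: Sequence[str] = ("full-scope", "limited-scope"),  # hypothetical labels
    seed: int = 0,  # hypothetical seed, fixed so the draw is reproducible
) -> Dict[str, str]:
    """Randomly assign each shareholder claim to one audit type.

    Claims are shuffled with a seeded generator and then dealt to the
    arms in turn, so arm sizes differ by at most one claim.
    """
    rng = random.Random(seed)
    shuffled = list(claim_ids)
    rng.shuffle(shuffled)
    return {cid: arms[i % len(arms)] for i, cid in enumerate(shuffled)}


# Usage with made-up claim identifiers.
claims = [f"claim-{k:04d}" for k in range(10)]
assignment = assign_audit_types(claims, seed=2023)
```

Because each claim is its own randomization unit, no two claims are forced into the same arm, which is consistent with the treatment not being clustered.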
Was the treatment clustered?
No

Experiment Characteristics

Sample size: planned number of clusters
The treatment is not clustered.
Sample size: planned number of observations
1008 shareholder claims.
Sample size (or number of clusters) by treatment arms
1008 claims in total distributed as follows:
258 claims to analyze the effect of up-prioritizing predicted high-risk claims for full-scope audit.
250 claims to analyze the effect of down-prioritizing predicted low-risk claims for limited-scope audit.
500 claims to analyze the model's ability to predict non-compliance.
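
As a quick consistency check, the three group sizes above sum to the planned 1008 claims. The dictionary keys below are shorthand labels chosen for this sketch and do not appear in the registration.

```python
# Planned number of claims per analysis group, as stated in the registration.
# The key names are hypothetical shorthand labels.
planned_claims = {
    "up_prioritize_high_risk_full_scope": 258,
    "down_prioritize_low_risk_limited_scope": 250,
    "model_prediction_evaluation": 500,
}

total_claims = sum(planned_claims.values())

# Share of the planned sample allocated to each group.
shares = {name: n / total_claims for name, n in planned_claims.items()}
```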
Minimum detectable effect size for main outcomes (accounting for sample design and clustering)
IRB

Institutional Review Boards (IRBs)

IRB Name
IRB Approval Date
IRB Approval Number