Challenge The Bias

The Problem

Judges are constantly making decisions about whether defendants should be released or detained while awaiting a trial.

How fair are these rulings?

Human beings are easily biased, and studies have suggested that external factors (like a lunch break or a tough football loss) can sway their decisions. In the age of big data, it’s tempting to imagine that using a computer to make rulings might help counteract our own biases. In fact, “risk scores” generated by algorithms are used nationwide [1][2]. When used as black boxes, however, algorithms propagate biases that are no better (and maybe much worse) than our own. A recent study of defendants in Broward County, Florida, showed that Black defendants are far more likely to be assigned a high-risk score [3].

The Users

Data contains valuable information, but we need to understand how to interpret it, use it, and recognize its consequences. We believe that taking the time to understand the effects of using these risk scores with different thresholds will allow judges, lawyers and policy-makers to use data-driven models to make less biased decisions in the criminal justice system.

Factors

Variables to consider when calculating a risk score

Gender

Age

Race

COMPAS recidivism score

COMPAS violent recidivism score

What is Fair?

“Fair” is a word we often throw around, but deciding which decision is fairest involves tricky tradeoffs. We consider three types of fairness and compare how models can be interpreted under each.

Equal Thresholds: Given an algorithmically generated risk score, any two people with the same risk score receive the same ruling. For example, we could decide that any defendant, regardless of race, gender, or other factor, will be detained if their risk score is above 0.6.
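As a minimal sketch of the equal-thresholds rule, assuming risk scores are numbers between 0 and 1 and using the 0.6 cutoff from the example above: the function name and the sample scores are hypothetical and chosen only for illustration.

```python
def detain_equal_threshold(scores, threshold=0.6):
    """Apply one shared cutoff to every defendant, whatever their group."""
    # A defendant is detained only if their score is strictly above the cutoff.
    return [score > threshold for score in scores]


# Hypothetical risk scores; these are not real defendant data.
example_scores = [0.35, 0.61, 0.80, 0.60]
decisions = detain_equal_threshold(example_scores)
print(decisions)  # [False, True, True, False]
```

Because the rule only looks at the score, two defendants with the same score always get the same decision.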

Equal Detention Rates: Given two populations (e.g. male and female, or Black and white), we want to detain people from both populations at the same rate. Unless the two populations have the same distribution of risk scores, this means using a different threshold for each population.
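One possible way to find such per-population thresholds is sketched below. It assumes each population's risk scores are available as a list, and it picks the cutoff by ranking the scores. The function names, the target rate, and the sample scores are all hypothetical.

```python
def threshold_for_detention_rate(scores, target_rate):
    """Pick a cutoff so that roughly target_rate of this group scores above it."""
    ranked = sorted(scores, reverse=True)
    n_detained = round(target_rate * len(ranked))
    if n_detained == 0:
        return ranked[0]          # nobody scores strictly above the maximum
    if n_detained >= len(ranked):
        return float("-inf")      # everybody scores above negative infinity
    # The cutoff is the highest score that should NOT be detained.
    # Tied scores around the cutoff would make the achieved rate approximate.
    return ranked[n_detained]


def detention_rate(scores, threshold):
    """Fraction of the group whose score is strictly above the cutoff."""
    return sum(s > threshold for s in scores) / len(scores)


# Hypothetical scores for two populations; not real data.
group_a = [0.2, 0.4, 0.5, 0.7, 0.9]
group_b = [0.1, 0.2, 0.3, 0.4, 0.6]

threshold_a = threshold_for_detention_rate(group_a, target_rate=0.4)
threshold_b = threshold_for_detention_rate(group_b, target_rate=0.4)
print(threshold_a, detention_rate(group_a, threshold_a))
print(threshold_b, detention_rate(group_b, threshold_b))
```

In this toy data the two groups end up with different cutoffs but the same detention rate, which is the tradeoff this notion of fairness accepts.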

Equal False Positive Rates: Given two populations, we choose a threshold for each population so that both have the same false positive rate (FPR = the fraction of people who did not go on to reoffend but were nonetheless detained).
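The sketch below illustrates one way this could be computed, assuming that for each person we know both the risk score and whether they later reoffended. It searches the observed scores for the cutoff whose FPR is closest to a target value. The function names, the target FPR, and the sample data are hypothetical.

```python
def false_positive_rate(scores, reoffended, threshold):
    """Fraction of people who did not reoffend whose score is above the cutoff."""
    negatives = [s for s, r in zip(scores, reoffended) if not r]
    if not negatives:
        raise ValueError("need at least one person who did not reoffend")
    return sum(s > threshold for s in negatives) / len(negatives)


def threshold_for_fpr(scores, reoffended, target_fpr):
    """Among the observed scores, pick the cutoff whose FPR is closest to the target."""
    candidates = sorted(set(scores))
    return min(
        candidates,
        key=lambda t: abs(false_positive_rate(scores, reoffended, t) - target_fpr),
    )


# Hypothetical scores and outcomes for two populations; not real data.
scores_a = [0.2, 0.4, 0.5, 0.7, 0.9, 0.8]
reoffended_a = [False, False, False, False, True, True]
scores_b = [0.1, 0.2, 0.3, 0.6, 0.5, 0.7]
reoffended_b = [False, False, False, False, True, True]

cutoff_a = threshold_for_fpr(scores_a, reoffended_a, target_fpr=0.25)
cutoff_b = threshold_for_fpr(scores_b, reoffended_b, target_fpr=0.25)
print(cutoff_a, false_positive_rate(scores_a, reoffended_a, cutoff_a))
print(cutoff_b, false_positive_rate(scores_b, reoffended_b, cutoff_b))
```

With small groups the target FPR can only be matched approximately, so a real analysis would need to decide how close is close enough.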

Solution and Impact

To make an unbiased decision based on a computed risk score, there are three steps to consider:

1.) Data collection
2.) Data modelling (generating a risk score)
3.) Optimal decision-making based on risk score

We address the second and third steps: we show that adding certain variables to an algorithm can make it fairer, and we show how to make optimal decisions under different concepts of fairness. By creating interpretable visualizations of these concepts, we hope to make fair, data-driven models easier to understand and adopt.

Team

Sam Corbett-Davies (Stanford)
Danielle Dean (Microsoft)
Lorenzo Vitale (BU)
Aditthya Ramakrishnan (CMU/Next Tech Lab)
Anshuman Pandey (CMU/Next Tech Lab)
Harini Suresh (MIT)
Yunxin Fan (Harvard)
Yaovi Ayeh (Dell EMC)
Frances Ding (Harvard)
Marina S. (community)
