API

Measure

measure_disparity.measure_disparity(dataset: str, save_file: str = 'df_fairness.csv')[source]

Return prediction measures of disparity with respect to groups in dataset.

Parameters:
dataset: str

A CSV file storing a dataframe with one row per individual. Columns should include:

  1. model prediction: the model's prediction, expressed as a probability

  2. binary outcome: the observed binary outcome (i.e. 0 or 1, where 1 indicates the favorable outcome for the individual being scored)

  3. model label: the model label

  4. sample weights: sample weights

  5. all additional columns are treated as demographic data on protected and reference classes

save_file: str, default: 'df_fairness.csv'

The name of the CSV file to which the disparity measures are saved.
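As a minimal sketch of the expected input, the snippet below builds a toy CSV in the required layout and then shows the call to measure_disparity (commented out, since it requires the library to be installed). The column values and the demographic column name "race" are illustrative assumptions, not names mandated by the API.

```python
import csv

# Build a minimal input CSV with the columns measure_disparity expects.
# The demographic column "race" is an assumed example; any protected/
# reference class columns from your own data would go in its place.
rows = [
    {"model prediction": 0.82, "binary outcome": 1, "model label": 1,
     "sample weights": 1.0, "race": "protected"},
    {"model prediction": 0.35, "binary outcome": 0, "model label": 0,
     "sample weights": 1.0, "race": "reference"},
]
with open("toy_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# With the package available, disparity measures are then computed with:
# from measure_disparity import measure_disparity
# measure_disparity("toy_dataset.csv", save_file="df_fairness.csv")
```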

Mitigate

mitigate_disparity.mitigate_disparity(dataset: str, protected_features: list[str], starting_point: str | None = None, save_file: str = 'estimator.pkl')[source]

mitigate_disparity.mitigate_disparity takes in a model development dataset (training and test data) that the algorithm has not seen before and generates a new, optimally fair/debiased model that can be used to make new predictions.

Parameters:
dataset: str

A CSV file storing a dataframe with one row per individual. Columns should include:

  1. binary outcome: Binary outcome (i.e. 0 or 1, where 1 indicates the favorable outcome for the individual being scored)

  2. sample weights: Sample weights. These are ignored.

  3. All additional columns are treated as features/predictors.

protected_features: list[str]

The columns of the dataset over which we wish to control for fairness.

starting_point: str | None, default: None

Optionally start from a checkpoint file with this name.

save_file: str, default: estimator.pkl

The name of the saved estimator.

Returns:
estimator.pkl: file containing an sklearn-style estimator

Saves a fair/debiased model object, in the form of a pickled, sklearn-style Python object.
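To illustrate the end-to-end flow, the sketch below writes a model development dataset in the layout described above and shows how the saved estimator would be used afterwards. The feature names ("age", "income") and the protected feature ("sex") are illustrative assumptions; the mitigate_disparity call and the pickle reload are shown commented out because they require the library and a completed training run.

```python
import csv

# Minimal model-development dataset: a binary outcome, sample weights
# (ignored by mitigate_disparity), and feature columns. "age", "income",
# and "sex" are assumed example columns, not names required by the API.
rows = [
    {"binary outcome": 1, "sample weights": 1.0, "age": 34, "income": 52000, "sex": "F"},
    {"binary outcome": 0, "sample weights": 1.0, "age": 45, "income": 61000, "sex": "M"},
]
with open("dev_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# Fit a debiased model, controlling for fairness over "sex":
# from mitigate_disparity import mitigate_disparity
# mitigate_disparity("dev_dataset.csv", protected_features=["sex"],
#                    save_file="estimator.pkl")

# The save file is a pickled sklearn-style estimator, so once it exists
# it can be reloaded and used like any other sklearn model:
# import pickle
# with open("estimator.pkl", "rb") as f:
#     estimator = pickle.load(f)
# estimator.predict(X_new)
```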