API

Measure

measure_disparity.measure_disparity(dataset: str, save_file: str = 'df_fairness.csv')[source]

Return prediction measures of disparity with respect to groups in dataset.

Parameters:
dataset: str

A CSV file storing a dataframe with one row per individual. Columns should include:

  1. model prediction: the model's prediction, expressed as a probability

  2. binary outcome: the observed binary outcome (i.e. 0 or 1, where 1 indicates the favorable outcome for the individual being scored)

  3. model label: the model label

  4. sample weights: sample weights

  5. all additional columns are treated as demographic data on protected and reference classes

save_file: str, default: 'df_fairness.csv'

The name of the CSV file to which the disparity measures are saved.
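As a minimal sketch of the expected input, the snippet below builds a toy CSV in the required layout and then shows the call to measure_disparity (commented out, since it requires the library to be installed). The column values and the demographic column name "race" are illustrative assumptions, not names mandated by the API.

```python
import csv

# Build a minimal input CSV with the columns measure_disparity expects.
# The demographic column "race" is an assumed example; any protected/
# reference class columns from your own data would go in its place.
rows = [
    {"model prediction": 0.82, "binary outcome": 1, "model label": 1,
     "sample weights": 1.0, "race": "protected"},
    {"model prediction": 0.35, "binary outcome": 0, "model label": 0,
     "sample weights": 1.0, "race": "reference"},
]
with open("toy_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# With the package available, disparity measures are then computed with:
# from measure_disparity import measure_disparity
# measure_disparity("toy_dataset.csv", save_file="df_fairness.csv")
```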

Mitigate

mitigate_disparity.mitigate_disparity(dataset: str, protected_features: list[str], starting_point: str | None = None, save_file: str = 'estimator.pkl')[source]

mitigate_disparity.mitigate_disparity takes in a model development dataset (training and test data) that the algorithm has not seen before and generates a new, optimally fair/debiased model that can be used to make new predictions.

Parameters:
dataset: str

A CSV file storing a dataframe with one row per individual. Columns should include:

  1. binary outcome: Binary outcome (i.e. 0 or 1, where 1 indicates the favorable outcome for the individual being scored)

  2. sample weights: Sample weights. These are ignored.

  3. All additional columns are treated as features/predictors.

protected_features: list[str]

The columns of the dataset over which we wish to control for fairness.

starting_point: str | None, default: None

Optionally start from a checkpoint file with this name.

save_file: str, default: estimator.pkl

The name of the saved estimator.

Returns:
estimator.pkl: file containing an sklearn-style estimator

Saves a fair/debiased model object, in the form of a pickled, sklearn-style Python object.
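To illustrate the end-to-end flow, the sketch below writes a model development dataset in the layout described above and shows how the saved estimator would be used afterwards. The feature names ("age", "income") and the protected feature ("sex") are illustrative assumptions; the mitigate_disparity call and the pickle reload are shown commented out because they require the library and a completed training run.

```python
import csv

# Minimal model-development dataset: a binary outcome, sample weights
# (ignored by mitigate_disparity), and feature columns. "age", "income",
# and "sex" are assumed example columns, not names required by the API.
rows = [
    {"binary outcome": 1, "sample weights": 1.0, "age": 34, "income": 52000, "sex": "F"},
    {"binary outcome": 0, "sample weights": 1.0, "age": 45, "income": 61000, "sex": "M"},
]
with open("dev_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# Fit a debiased model, controlling for fairness over "sex":
# from mitigate_disparity import mitigate_disparity
# mitigate_disparity("dev_dataset.csv", protected_features=["sex"],
#                    save_file="estimator.pkl")

# The save file is a pickled sklearn-style estimator, so once it exists
# it can be reloaded and used like any other sklearn model:
# import pickle
# with open("estimator.pkl", "rb") as f:
#     estimator = pickle.load(f)
# estimator.predict(X_new)
```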