API¶
Measure¶
- measure_disparity.measure_disparity(dataset: str, save_file: str = 'df_fairness.csv')[source]¶
Return measures of prediction disparity with respect to groups in the dataset.
- Parameters:
- dataset: str
A csv file storing a dataframe with one row per individual. Columns should include:
model prediction: The model's predicted probability for the individual
binary outcome: The observed binary outcome (0 or 1, where 1 indicates the favorable outcome for the individual being scored)
model label: The model's predicted label
sample weights: Sample weights
All additional columns are treated as demographic data on protected and reference classes.
- save_file: str, default: df_fairness.csv
The name of the CSV file to which the disparity measures are saved.
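A minimal sketch of preparing an input file for measure_disparity, using only the standard library. The column headers follow the list above, but their exact spelling and the demographic column names ("gender", "race") are assumptions for illustration:

```python
import csv

# Two illustrative rows; headers follow the documented columns, the
# demographic columns ("gender", "race") are hypothetical examples.
rows = [
    {"model prediction": 0.83, "binary outcome": 1, "model label": 1,
     "sample weights": 1.0, "gender": "F", "race": "protected"},
    {"model prediction": 0.41, "binary outcome": 0, "model label": 0,
     "sample weights": 1.0, "gender": "M", "race": "reference"},
]

with open("fairness_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# With the package installed, disparity measures would then be computed as:
# from measure_disparity import measure_disparity
# measure_disparity("fairness_dataset.csv", save_file="df_fairness.csv")
```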
Mitigate¶
- mitigate_disparity.mitigate_disparity(dataset: str, protected_features: list[str], starting_point: str | None = None, save_file: str = 'estimator.pkl')[source]¶
mitigate_disparity.py takes a model development dataset (training and test data) and fits a new, optimally fair/debiased model that can be used to make new predictions.
- Parameters:
- dataset: str
A csv file storing a dataframe with one row per individual. Columns should include:
binary outcome: Binary outcome (0 or 1, where 1 indicates the favorable outcome for the individual being scored)
sample weights: Sample weights. These are ignored.
All additional columns are treated as features/predictors.
- protected_features: list[str]
The columns of the dataset over which we wish to control for fairness.
- starting_point: str | None, default: None
Optional name of a checkpoint file from which to resume.
- save_file: str, default: estimator.pkl
The name of the file to which the fitted estimator is saved.
- Returns:
- estimator.pkl: file containing an sklearn-style estimator
Saves the fair/debiased model as a pickled sklearn-style Python object.
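The workflow above can be sketched end to end. The input-file preparation is runnable standard-library code; the calls to mitigate_disparity and the pickle load are shown as comments because they require the package and a fitted model. The feature columns ("age", "income", "gender") are hypothetical:

```python
import csv

# Development dataset: "binary outcome" and "sample weights" follow the
# documented columns; the remaining feature columns are hypothetical.
rows = [
    {"binary outcome": 1, "sample weights": 1.0,
     "age": 34, "income": 52000, "gender": "F"},
    {"binary outcome": 0, "sample weights": 1.0,
     "age": 51, "income": 31000, "gender": "M"},
]

with open("dev_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# With the package installed, a debiased model would be fit with:
# from mitigate_disparity import mitigate_disparity
# mitigate_disparity("dev_dataset.csv",
#                    protected_features=["gender"],
#                    save_file="estimator.pkl")
#
# The saved sklearn-style estimator can then be loaded with pickle:
# import pickle
# with open("estimator.pkl", "rb") as fh:
#     model = pickle.load(fh)
```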