API¶
Estimators¶
- class fomo.FomoClassifier(estimator: ~sklearn.base.ClassifierMixin = LogisticRegression(), fairness_metrics=None, accuracy_metrics=None, algorithm: ~pymoo.core.algorithm.Algorithm = <pymoo.algorithms.moo.nsga2.NSGA2 object>, random_state: int = None, verbose: bool = False, n_jobs: int = -1, store_final_models: bool = False, problem_type=<class 'fomo.problem.BasicProblem'>, checkpoint=False, picking_strategy='PseudoWeights')[source]¶
FOMO Classifier.
Trains a population of self.estimator models with random sample weights, updates the sample weights using self.algorithm, and selects one model as best while also saving the full set of models.
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained. The ML model must accept sample_weight as an argument to fit().
- fairness_metrics : list[Callable]
The fairness metrics to try to optimize during fitting.
- accuracy_metrics : list[Callable]
The accuracy metrics to try to optimize during fitting.
- algorithm : pymoo Algorithm
The multi-objective optimizer to use. Should be compatible with pymoo.core.algorithm.Algorithm.
- random_state : int | None
Random seed.
- verbose : bool
Whether to print progress.
- n_jobs : int
Number of parallel processes to use. Parallelizes evaluation.
- store_final_models : bool
If True, the final set of models will be stored in the estimator.
- problem_type : ElementwiseProblem
Determines the evaluation class to be used. Options:
- BasicProblem: loss function weights are directly optimized.
- MLPProblem: weights of a multilayer perceptron are optimized to estimate loss function weights.
- LinearProblem: weights of a logistic model are optimized to estimate loss function weights.
Examples
>>> from fomo import FomoClassifier
>>> from pmlb import pmlb
>>> X,y = pmlb.fetch_data('adult', return_X_y=True)
>>> groups = ['race','sex']
>>> est = FomoClassifier()
>>> est.fit(X,y, protected_features=groups)
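The constructor arguments above can also be set explicitly. A minimal sketch, assuming pymoo's NSGA2 accepts pop_size and that any documented metric callable (e.g. fomo.metrics.FNR) can be passed in fairness_metrics:
>>> from pymoo.algorithms.moo.nsga2 import NSGA2
>>> from sklearn.linear_model import LogisticRegression
>>> from fomo.metrics import FNR
>>> est = FomoClassifier(
...     estimator=LogisticRegression(),   # must accept sample_weight in fit()
...     fairness_metrics=[FNR],           # fairness objective(s)
...     algorithm=NSGA2(pop_size=50),     # multi-objective optimizer
...     store_final_models=True,          # keep the full set of models
... )
>>> est.fit(X, y, protected_features=groups)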
- fit(X, y, grouping='intersectional', abs_val=False, gamma=True, protected_features=None, Xp=None, **kwargs)[source]¶
Train the model.
Trains a population of self.estimator models with random sample weights, updates the sample weights using self.algorithm, and selects one model as best while also saving the full set of models.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y : array-like, shape (n_samples,)
The target values. An array of int.
- protected_features : list[str] | None, default=None
The columns of X to calculate fairness over. If specifying columns, do not also specify Xp.
- Xp : pandas DataFrame, shape (n_samples, n_protected_features), default=None
The protected feature values used to calculate fairness. If Xp is specified, protected_features must be None.
- **kwargs : passed to pymoo.optimize.minimize.
- Returns:
- self : object
Returns self.
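For instance, the two mutually exclusive ways to supply protected attributes look like this (a sketch assuming X is a pandas DataFrame with 'race' and 'sex' columns):
>>> est.fit(X, y, protected_features=['race', 'sex'])  # fairness over columns of X
>>> Xp = X[['race', 'sex']]
>>> est.fit(X, y, Xp=Xp)                               # or pass the values directly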
- predict(X)[source]¶
Predict labels from X.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The input samples.
- Returns:
- y : ndarray, shape (n_samples,)
The predicted class label for each sample.
- set_fit_request(*, Xp: bool | None | str = '$UNCHANGED$', abs_val: bool | None | str = '$UNCHANGED$', gamma: bool | None | str = '$UNCHANGED$', grouping: bool | None | str = '$UNCHANGED$', protected_features: bool | None | str = '$UNCHANGED$') FomoClassifier ¶
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- Xp : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the Xp parameter in fit.
- abs_val : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the abs_val parameter in fit.
- gamma : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the gamma parameter in fit.
- grouping : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the grouping parameter in fit.
- protected_features : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the protected_features parameter in fit.
- Returns:
- self : object
The updated object.
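As a sketch of when this request matters, assuming a recent scikit-learn (>= 1.4) where cross_validate accepts a params argument under metadata routing:
>>> import sklearn
>>> from sklearn.model_selection import cross_validate
>>> sklearn.set_config(enable_metadata_routing=True)
>>> est = FomoClassifier().set_fit_request(protected_features=True)
>>> # protected_features is now routed through to est.fit on every fold
>>> cross_validate(est, X, y, params={'protected_features': ['race', 'sex']})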
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FomoClassifier ¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the sample_weight parameter in score.
- Returns:
- self : object
The updated object.
- class fomo.FomoRegressor(estimator: ~sklearn.base.RegressorMixin = SGDRegressor(), fairness_metrics: list[str] = None, accuracy_metrics: list[str] = None, algorithm: str = 'NSGA2', random_state: int = None, verbose: bool = False, n_jobs: int = -1, store_final_models: bool = False, problem_type=<class 'fomo.problem.BasicProblem'>, checkpoint: bool = False, picking_strategy: str = 'PseudoWeights')[source]¶
Fomo class for regression models.
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained. The ML model must accept sample_weight as an argument to fit().
- fairness_metrics : list[Callable]
The fairness metrics to try to optimize during fitting.
- accuracy_metrics : list[Callable]
The accuracy metrics to try to optimize during fitting.
- algorithm : pymoo Algorithm
The multi-objective optimizer to use. Should be compatible with pymoo.core.algorithm.Algorithm.
- random_state : int | None
Random seed.
- verbose : bool
Whether to print progress.
- n_jobs : int
Number of parallel processes to use. Parallelizes evaluation.
- store_final_models : bool
If True, the final set of models will be stored in the estimator.
- problem_type : ElementwiseProblem
Determines the evaluation class to be used. Options:
- BasicProblem: loss function weights are directly optimized.
- MLPProblem: weights of a multilayer perceptron are optimized to estimate loss function weights.
- LinearProblem: weights of a logistic model are optimized to estimate loss function weights.
Examples
>>> from fomo import FomoRegressor
>>> from pmlb import pmlb
>>> X,y = pmlb.fetch_data('adult', return_X_y=True)
>>> groups = ['race','sex']
>>> est = FomoRegressor()
>>> est.fit(X,y, protected_features=groups)
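A sketch with a different base learner, assuming (per the estimator parameter above) that any sklearn regressor accepting sample_weight in fit() will work:
>>> from sklearn.linear_model import Ridge
>>> est = FomoRegressor(estimator=Ridge(), store_final_models=True)
>>> est.fit(X, y, protected_features=groups)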
- Attributes:
- n_features_ : int
The number of features of the data passed to fit().
- fit(X, y, protected_features=None, Xp=None, **kwargs)[source]¶
Train a set of regressors.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y : array-like, shape (n_samples,)
The target values. An array of real numbers.
- protected_features : list | None, default=None
The columns of X to calculate fairness over. If specifying columns, do not also specify Xp.
- Xp : pandas DataFrame, shape (n_samples, n_protected_features), default=None
The protected feature values used to calculate fairness. If Xp is specified, protected_features must be None.
- Returns:
- self : object
Returns self.
- predict(X)[source]¶
Predict outcome.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The input samples.
- Returns:
- y : ndarray, shape (n_samples,)
The predicted target value for each sample.
- set_fit_request(*, Xp: bool | None | str = '$UNCHANGED$', protected_features: bool | None | str = '$UNCHANGED$') FomoRegressor ¶
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- Xp : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the Xp parameter in fit.
- protected_features : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the protected_features parameter in fit.
- Returns:
- self : object
The updated object.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FomoRegressor ¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the sample_weight parameter in score.
- Returns:
- self : object
The updated object.
Problems¶
- class fomo.problem.BasicProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]¶
The evaluation function for each candidate set of sample weights.
- Attributes:
- n_constr
Methods: bounds, do, evaluate, get_sample_weight, has_bounds, has_constraints, ideal_point, nadir_point, name, pareto_front, pareto_set
- class fomo.problem.InterLinearProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]¶
The evaluation function for each candidate weight vector.
- Attributes:
- n_constr
Methods: bounds, do, evaluate, get_sample_weight, has_bounds, has_constraints, ideal_point, nadir_point, name, pareto_front, pareto_set
- class fomo.problem.LinearProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]¶
The evaluation function for each candidate weight vector.
- Attributes:
- n_constr
Methods: bounds, do, evaluate, get_sample_weight, has_bounds, has_constraints, ideal_point, nadir_point, name, pareto_front, pareto_set
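Any of these problem classes can be passed as the problem_type argument of an estimator; a minimal sketch:
>>> from fomo import FomoClassifier
>>> from fomo.problem import LinearProblem
>>> est = FomoClassifier(problem_type=LinearProblem)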
Metrics¶
- fomo.metrics.FNR(y_true, y_pred)[source]¶
Returns False Negative Rate.
- Parameters:
- y_true : array-like, bool
True labels.
- y_pred : array-like, float or bool
Predicted labels. If y_pred contains floats, this is the “soft” false negative rate (i.e. the average probability estimate for the negative class over the truly positive samples).
- fomo.metrics.FPR(y_true, y_pred)[source]¶
Returns False Positive Rate.
- Parameters:
- y_true : array-like, bool
True labels.
- y_pred : array-like, float or bool
Predicted labels. If y_pred contains floats, this is the “soft” false positive rate (i.e. the average probability estimate for the positive class over the truly negative samples).
- fomo.metrics.TPR(y_true, y_pred)[source]¶
Returns True Positive Rate.
- Parameters:
- y_true : array-like, bool
True labels.
- y_pred : array-like, float or bool
Predicted labels. If y_pred contains floats, this is the “soft” true positive rate (i.e. the average probability estimate for the positive class over the truly positive samples).
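A worked sketch of the hard vs. soft behavior described above; the values follow from the definitions, while the exact array handling is an assumption:
>>> import numpy as np
>>> from fomo.metrics import FNR
>>> y_true = np.array([1, 1, 0, 0])
>>> FNR(y_true, np.array([0, 1, 0, 0]))          # hard: one of two positives missed, i.e. 0.5
>>> FNR(y_true, np.array([0.9, 0.2, 0.1, 0.3]))  # soft: mean of (1 - 0.9) and (1 - 0.2), i.e. 0.45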
- fomo.metrics.differential_calibration_loss(estimator, X, y_true, groups=None, X_protected=None, n_bins=None, bins=None, stratified_categories=None, alpha=0.0, gamma=0.0, rho=0.0)[source]¶
Return the differential calibration of estimator on groups.
- fomo.metrics.flex_loss(estimator, X, y_true, metric, **kwargs)[source]¶
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained.
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y_true : array-like, bool
True labels.
- metric : string or function
The loss function, e.g. FPR or FNR.
- flag : bool
If flag = 1, marginal grouping is used; if flag = 0, intersectional grouping.
- Returns:
- fn : the overall loss over all samples
- fng : the loss over each protected group in the training data
- samples_fnr : the false negative rate of every sample in the training data
- gp_lens : the size of each protected group
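A usage sketch, assuming a fitted estimator est and that any grouping options are supplied via **kwargs:
>>> from fomo.metrics import flex_loss, FNR
>>> fn, fng, samples_fnr, gp_lens = flex_loss(est, X, y, metric=FNR)
>>> fn    # overall loss
>>> fng   # one loss value per protected group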
- fomo.metrics.mce(estimator, X, y_true, num_bins=10)[source]¶
The metric to use if fairness is calibration-based. Returns maximum calibration error among the bins.
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained.
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y_true : array-like, bool
True labels.
- num_bins : int
Number of bins that the predictions are sorted and partitioned into.
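A usage sketch, assuming est is a fitted probabilistic classifier:
>>> from fomo.metrics import mce
>>> mce(est, X, y, num_bins=10)  # maximum calibration error across 10 bins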
- fomo.metrics.multicalibration_loss(estimator, X, y_true, groups=None, X_protected=None, grouping='intersectional', n_bins=None, bins=None, categories=None, proportional=False, alpha=0.01, gamma=0.01, rho=0.1, **kwargs)[source]¶
Custom scoring function for multicalibration. Calculates the current loss in terms of (proportional) multicalibration.
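As with the other metrics, this can serve as a fairness objective; a sketch assuming the metric's extra arguments are acceptable at their defaults:
>>> from fomo import FomoClassifier
>>> from fomo.metrics import multicalibration_loss
>>> est = FomoClassifier(fairness_metrics=[multicalibration_loss])
>>> est.fit(X, y, protected_features=['race', 'sex'])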