API

Estimators

class fomo.FomoClassifier(estimator: ~sklearn.base.ClassifierMixin = LogisticRegression(), fairness_metrics=None, accuracy_metrics=None, algorithm: ~pymoo.core.algorithm.Algorithm = <pymoo.algorithms.moo.nsga2.NSGA2 object>, random_state: int = None, verbose: bool = False, n_jobs: int = -1, store_final_models: bool = False, problem_type=<class 'fomo.problem.BasicProblem'>, checkpoint=False, picking_strategy='PseudoWeights')[source]

FOMO Classifier.

  1. Train a population of self.estimator models with random weights.

  2. Update sample weights using self.algorithm.

  3. Select a given model as best, but also save the set of models.
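Step 3 keeps a set of models rather than a single one. As a rough illustration only, and not FOMO's actual code, the sketch below filters candidate models down to the non-dominated (Pareto) set under two objectives that are both minimized. The helper names `dominates` and `non_dominated` and the example scores are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def non_dominated(scores):
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (error, unfairness) pairs for four candidate models.
scores = [(0.10, 0.30), (0.20, 0.10), (0.25, 0.25), (0.15, 0.20)]
front = non_dominated(scores)
print(front)
```

The third candidate is dropped because the second is better on both objectives; picking one "best" model from the remaining set is a separate decision (see picking_strategy).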

Parameters:
estimator: sklearn-like estimator

The underlying ML model to be trained. The ML model must accept sample_weight as an argument to fit().

fairness_metrics: list[Callable]

The fairness metrics to try to optimize during fitting.

accuracy_metrics: list[Callable]

The accuracy metrics to try to optimize during fitting.

algorithm: pymoo Algorithm

The multi-objective optimizer to use. Should be compatible with pymoo.core.algorithm.Algorithm.

random_state: int | None

Random seed.

verbose: bool

Whether to print progress.

n_jobs: int

Number of parallel processes to use. Parallelizes evaluation.

store_final_models: bool

If True, the final set of models will be stored in the estimator.

problem_type: ElementwiseProblem

Determines the evaluation class to be used. Options:

  • BasicProblem

  • MLPProblem

  • LinearProblem
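The estimator parameter requires that fit() accept sample_weight. A minimal way to check this before passing a model in is sketched below using only the standard library. The helper `accepts_sample_weight` and the two toy classes are hypothetical and not part of fomo.

```python
import inspect


def accepts_sample_weight(estimator):
    """Return True if estimator.fit declares a sample_weight parameter."""
    params = inspect.signature(estimator.fit).parameters
    return "sample_weight" in params


class WeightedModel:
    """Toy model whose fit() takes per-sample weights."""

    def fit(self, X, y, sample_weight=None):
        return self


class UnweightedModel:
    """Toy model whose fit() has no sample_weight argument."""

    def fit(self, X, y):
        return self


print(accepts_sample_weight(WeightedModel()))
print(accepts_sample_weight(UnweightedModel()))
```

This check only inspects the declared signature, so a fit() that takes sample_weight through **kwargs would be reported as unsupported. scikit-learn provides a comparable helper, sklearn.utils.validation.has_fit_parameter.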

Examples

>>> from fomo import FomoClassifier
>>> from pmlb import pmlb
>>> X,y = pmlb.fetch_data('adult', return_X_y=True)
>>> groups = ['race','sex']
>>> est = FomoClassifier()
>>> est.fit(X,y, protected_features=groups)

fit(X, y, grouping='intersectional', abs_val=False, gamma=True, protected_features=None, Xp=None, **kwargs)[source]

Train the model.

  1. Train a population of self.estimator models with random weights.

  2. Update sample weights using self.algorithm.

  3. Select a given model as best, but also save the set of models.

Parameters:
X: array-like, shape (n_samples, n_features)

The training input samples.

y: array-like, shape (n_samples,)

The target values. An array of int.

protected_features: list[str]|None, default = None

The columns of X to calculate fairness over. If specifying columns, do not also specify Xp.

Xp: pandas DataFrame, shape (n_samples, n_protected_features), default=None

The protected feature values used to calculate fairness. If Xp is specified, protected_features must be None.

**kwargs: passed to pymoo.optimize.minimize.

Returns:
self: object

Returns self.
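The protected_features and Xp arguments are alternatives: one names columns of X, the other supplies the protected values directly. The sketch below shows one way such an either/or check could look on plain lists. The helper `resolve_protected` is hypothetical, and its behavior when neither argument is given is an assumption, since the documentation does not say what fit() does in that case.

```python
def resolve_protected(columns, rows, protected_features=None, Xp=None):
    """Return protected-feature values from column names or an explicit table."""
    if protected_features is not None and Xp is not None:
        raise ValueError("Specify either protected_features or Xp, not both.")
    if Xp is not None:
        # The caller supplied the protected values directly.
        return Xp
    if protected_features is None:
        # Assumption for this sketch only: nothing to resolve, return None.
        return None
    # Select the named columns from every row of X.
    idx = [columns.index(name) for name in protected_features]
    return [[row[i] for i in idx] for row in rows]


columns = ["age", "race", "sex"]
rows = [[30, "A", "F"], [40, "B", "M"]]
selected = resolve_protected(columns, rows, protected_features=["race", "sex"])
print(selected)
```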

predict(X)[source]

Predict labels from X.

Parameters:
X: array-like, shape (n_samples, n_features)

The input samples.

Returns:
y: ndarray, shape (n_samples,)

The predicted label for each sample.

predict_proba(X)[source]

Return prediction probabilities.

predict_proba_archive(X)[source]

Return a list of predictions from the archive models.

set_fit_request(*, Xp: bool | None | str = '$UNCHANGED$', abs_val: bool | None | str = '$UNCHANGED$', gamma: bool | None | str = '$UNCHANGED$', grouping: bool | None | str = '$UNCHANGED$', protected_features: bool | None | str = '$UNCHANGED$') FomoClassifier

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
Xp: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for Xp parameter in fit.

abs_val: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for abs_val parameter in fit.

gamma: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for gamma parameter in fit.

grouping: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for grouping parameter in fit.

protected_features: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for protected_features parameter in fit.

Returns:
self: object

The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FomoClassifier

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
sample_weight: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self: object

The updated object.

class fomo.FomoRegressor(estimator: ~sklearn.base.RegressorMixin = SGDRegressor(), fairness_metrics: list[str] = None, accuracy_metrics: list[str] = None, algorithm: str = 'NSGA2', random_state: int = None, verbose: bool = False, n_jobs: int = -1, store_final_models: bool = False, problem_type=<class 'fomo.problem.BasicProblem'>, checkpoint: bool = False, picking_strategy: str = 'PseudoWeights')[source]

Fomo class for regression models.

Parameters:
estimator: sklearn-like estimator

The underlying ML model to be trained. The ML model must accept sample_weight as an argument to fit().

fairness_metrics: list[Callable]

The fairness metrics to try to optimize during fitting.

accuracy_metrics: list[Callable]

The accuracy metrics to try to optimize during fitting.

algorithm: pymoo Algorithm

The multi-objective optimizer to use. Should be compatible with pymoo.core.algorithm.Algorithm.

random_state: int | None

Random seed.

verbose: bool

Whether to print progress.

n_jobs: int

Number of parallel processes to use. Parallelizes evaluation.

store_final_models: bool

If True, the final set of models will be stored in the estimator.

problem_type: ElementwiseProblem

Determines the evaluation class to be used. Options:

  • BasicProblem: loss function weights are directly optimized.

  • MLPProblem: weights of a multilayer perceptron are optimized to estimate loss function weights.

  • LinearProblem: weights of a logistic model are optimized to estimate loss function weights.

Examples

>>> from fomo import FomoRegressor
>>> from pmlb import pmlb
>>> X,y = pmlb.fetch_data('adult', return_X_y=True)
>>> groups = ['race','sex']
>>> est = FomoRegressor()
>>> est.fit(X,y, protected_features=groups)

Attributes:
n_features_: int

The number of features of the data passed to fit().

fit(X, y, protected_features=None, Xp=None, **kwargs)[source]

Train a set of regressors.

Parameters:
X: array-like, shape (n_samples, n_features)

The training input samples.

y: array-like, shape (n_samples,)

The target values. An array of int.

protected_features: list|None, default = None

The columns of X to calculate fairness over. If specifying columns, do not also specify Xp.

Xp: pandas DataFrame, shape (n_samples, n_protected_features), default=None

The protected feature values used to calculate fairness. If Xp is specified, protected_features must be None.

Returns:
self: object

Returns self.

predict(X)[source]

Predict outcome.

Parameters:
X: array-like, shape (n_samples, n_features)

The input samples.

Returns:
y: ndarray, shape (n_samples,)

The predicted value for each sample.

set_fit_request(*, Xp: bool | None | str = '$UNCHANGED$', protected_features: bool | None | str = '$UNCHANGED$') FomoRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
Xp: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for Xp parameter in fit.

protected_features: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for protected_features parameter in fit.

Returns:
self: object

The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FomoRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
sample_weight: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self: object

The updated object.

Problems

Fairness Oriented Multiobjective Optimization (Fomo)

BSD 3-Clause License

Copyright (c) 2023, William La Cava

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

class fomo.problem.BasicProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]

The evaluation function for each candidate set of sample weights.

Attributes:
n_constr

Methods

bounds

do

evaluate

get_sample_weight

has_bounds

has_constraints

ideal_point

nadir_point

name

pareto_front

pareto_set

class fomo.problem.InterLinearProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]

The evaluation function for each candidate set of weights.

Attributes:
n_constr

Methods

bounds

do

evaluate

get_sample_weight

has_bounds

has_constraints

ideal_point

nadir_point

name

pareto_front

pareto_set

class fomo.problem.LinearProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]

The evaluation function for each candidate set of weights.

Attributes:
n_constr

Methods

bounds

do

evaluate

get_sample_weight

has_bounds

has_constraints

ideal_point

nadir_point

name

pareto_front

pareto_set

class fomo.problem.MLPProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]

The evaluation function for each candidate set of weights.

Attributes:
n_constr

Methods

bounds

do

evaluate

get_sample_weight

has_bounds

has_constraints

ideal_point

nadir_point

name

pareto_front

pareto_set

class fomo.problem.SurrogateProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]

The evaluation function for each candidate set of weights.

Attributes:
n_constr

Methods

bounds

do

evaluate

get_sample_weight

has_bounds

has_constraints

ideal_point

nadir_point

name

pareto_front

pareto_set

Metrics

fomo.metrics.FNR(y_true, y_pred)[source]

Returns False Negative Rate.

Parameters:
y_true: array-like, bool

True labels.

y_pred: array-like, float or bool

Predicted labels.

If y_pred contains floats, this is the “soft” false negative rate
(i.e. the average probability estimate for the negative class)

fomo.metrics.FPR(y_true, y_pred)[source]

Returns False Positive Rate.

Parameters:
y_true: array-like, bool

True labels.

y_pred: array-like, float or bool

Predicted labels.

If y_pred contains floats, this is the “soft” false positive rate
(i.e. the average probability estimate for the negative class)

fomo.metrics.TPR(y_true, y_pred)[source]

Returns True Positive Rate.

Parameters:
y_true: array-like, bool

True labels.

y_pred: array-like, float or bool

Predicted labels.

If y_pred contains floats, this is the “soft” true positive rate
(i.e. the average probability estimate for the positive class)
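To make the hard/soft distinction concrete, here is a minimal pure-Python sketch of the three rates under one common reading: TPR and FNR average over the samples whose true label is positive, and FPR averages over those whose true label is negative. This is an assumption about the definitions, not a copy of fomo.metrics. The lowercase function names are hypothetical, as is the choice to return 0.0 when the relevant group is empty.

```python
def _mean(values):
    """Average of an iterable, or 0.0 if it is empty (assumption for this sketch)."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def tpr(y_true, y_pred):
    """Average prediction over samples whose true label is positive."""
    return _mean(float(p) for t, p in zip(y_true, y_pred) if t)


def fnr(y_true, y_pred):
    """Average of (1 - prediction) over samples whose true label is positive."""
    return _mean(1.0 - float(p) for t, p in zip(y_true, y_pred) if t)


def fpr(y_true, y_pred):
    """Average prediction over samples whose true label is negative."""
    return _mean(float(p) for t, p in zip(y_true, y_pred) if not t)


y_true = [True, True, False, False]
hard = [True, False, True, False]   # boolean predictions
soft = [0.75, 0.75, 0.5, 0.0]       # predicted probabilities of the positive class

print(tpr(y_true, hard), fnr(y_true, hard), fpr(y_true, hard))
print(tpr(y_true, soft), fnr(y_true, soft), fpr(y_true, soft))
```

With boolean predictions each function reduces to the usual count-based rate; with probabilities it returns the corresponding average, which is what makes the rate "soft".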
fomo.metrics.differential_calibration_loss(estimator, X, y_true, groups=None, X_protected=None, n_bins=None, bins=None, stratified_categories=None, alpha=0.0, gamma=0.0, rho=0.0)[source]

Return the differential calibration of estimator on groups.

fomo.metrics.flex_loss(estimator, X, y_true, metric, **kwargs)[source]

Parameters:
estimator: sklearn-like estimator

The underlying ML model to be trained.

X: array-like, shape (n_samples, n_features)

The training input samples.

y_true: array-like, bool

True labels.

metric: string or function

The loss function. Could be FPR or FNR.

flag: bool

flag = 1 means marginal grouping and flag = 0 means intersectional grouping

Returns:
fn: overall loss of all samples
fng: loss over group for every group in the training data
samples_fnr: False negative rate of every sample in the training data
gp_lens: length of each protected group
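The marginal and intersectional groupings named in the flag description differ in how protected groups are formed. The sketch below illustrates the usual meaning of the two terms on a toy table: marginal groups are defined by one feature value at a time, intersectional groups by a combination of values across all protected features. The helper names and the data are hypothetical, and the sketch does not reproduce how fomo builds its groups internally.

```python
from collections import defaultdict


def marginal_groups(rows, features):
    """Map each (feature, value) pair to the indices of the rows that have it."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        for f in features:
            groups[(f, row[f])].append(i)
    return dict(groups)


def intersectional_groups(rows, features):
    """Map each combination of feature values to the indices of matching rows."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[f] for f in features)].append(i)
    return dict(groups)


rows = [
    {"race": "A", "sex": "F"},
    {"race": "A", "sex": "M"},
    {"race": "B", "sex": "F"},
    {"race": "A", "sex": "F"},
]
marginal = marginal_groups(rows, ["race", "sex"])
intersectional = intersectional_groups(rows, ["race", "sex"])
print(marginal)
print(intersectional)
```

Under marginal grouping every row belongs to one group per protected feature, so groups overlap; under intersectional grouping every row belongs to exactly one group.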
fomo.metrics.mce(estimator, X, y_true, num_bins=10)[source]

The metric to use if fairness is calibration-based. Returns maximum calibration error among the bins.

Parameters:
estimator: sklearn-like estimator

The underlying ML model to be trained.

X: array-like, shape (n_samples, n_features)

The training input samples.

y_true: array-like, bool

True labels.

num_bins: int

Number of bins that the predictions are sorted and partitioned into.
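As a rough sketch of a maximum calibration error, the function below sorts predicted probabilities, splits them into num_bins roughly equal-sized bins, and returns the largest gap between the mean prediction and the observed positive rate in any bin. It takes probabilities directly instead of an estimator and X, and the binning rule is an assumption based on the num_bins description, so it may differ from fomo.metrics.mce in detail. The name `max_calibration_error` is hypothetical.

```python
def max_calibration_error(y_true, y_prob, num_bins=10):
    """Largest |mean prediction - observed positive rate| over equal-sized bins."""
    # Sort samples by predicted probability so each bin covers a contiguous range.
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    worst = 0.0
    for b in range(num_bins):
        chunk = pairs[b * n // num_bins:(b + 1) * n // num_bins]
        if not chunk:
            continue  # more bins than samples leaves some bins empty
        mean_prob = sum(p for p, _ in chunk) / len(chunk)
        mean_true = sum(float(t) for _, t in chunk) / len(chunk)
        worst = max(worst, abs(mean_prob - mean_true))
    return worst


y_true = [0, 0, 1, 1]
y_prob = [0.0, 0.5, 0.5, 1.0]
error = max_calibration_error(y_true, y_prob, num_bins=2)
print(error)
```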

fomo.metrics.multicalibration_loss(estimator, X, y_true, groups=None, X_protected=None, grouping='intersectional', n_bins=None, bins=None, categories=None, proportional=False, alpha=0.01, gamma=0.01, rho=0.1, **kwargs)[source]

Custom scoring function for multicalibration. Calculates the current loss in terms of (proportional) multicalibration.

fomo.metrics.pairwise(iterable)[source]

s -> (s0,s1), (s1,s2), (s2, s3), …
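The mapping above matches the well-known pairwise recipe from the itertools documentation. A minimal sketch of that recipe follows; it is named `pairwise_sketch` here to make clear it is an illustration and not necessarily the code in fomo.metrics.

```python
from itertools import tee


def pairwise_sketch(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)


edges = [0.0, 0.25, 0.5, 1.0]
intervals = list(pairwise_sketch(edges))
print(intervals)
```

Pairing consecutive values like this is a convenient way to turn a list of bin edges into (low, high) intervals.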

fomo.metrics.stratify_groups(X, y, groups, n_bins=10, bins=None, alpha=0.0, gamma=0.0)[source]

Map data to an existing set of groups, stratified by risk interval.

fomo.metrics.subgroup_scorer(estimator, X, y_true, metric, grouping, abs_val, gamma, groups=None, X_protected=None, weights=None)[source]

Calculate the subgroup fairness of estimator on X according to `metric`. TODO: handle use case when Xp is passed.