API¶
Estimators¶
- class fomo.FomoClassifier(estimator: ~sklearn.base.ClassifierMixin = LogisticRegression(), fairness_metrics=None, accuracy_metrics=None, algorithm: ~pymoo.core.algorithm.Algorithm = <pymoo.algorithms.moo.nsga2.NSGA2 object>, random_state: int = None, verbose: bool = False, n_jobs: int = -1, store_final_models: bool = False, problem_type=<class 'fomo.problem.BasicProblem'>, checkpoint=False, picking_strategy='PseudoWeights')[source]¶
FOMO Classifier.
Trains a population of self.estimator models with random sample weights, updates the sample weights using self.algorithm, and selects one model as best while also saving the full set of models.
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained. The ML model must accept sample_weight as an argument to fit().
- fairness_metrics : list[Callable]
The fairness metrics to try to optimize during fitting.
- accuracy_metrics : list[Callable]
The accuracy metrics to try to optimize during fitting.
- algorithm : pymoo Algorithm
The multi-objective optimizer to use. Should be compatible with pymoo.core.algorithm.Algorithm.
- random_state : int | None
Random seed.
- verbose : bool
Whether to print progress.
- n_jobs : int
Number of parallel processes to use. Parallelizes evaluation.
- store_final_models : bool
If True, the final set of models will be stored in the estimator.
- problem_type : ElementwiseProblem
Determines the evaluation class to be used. Options:
- BasicProblem: loss function weights are directly optimized.
- MLPProblem: weights of a multilayer perceptron are optimized to estimate loss function weights.
- LinearProblem: weights of a logistic model are optimized to estimate loss function weights.
Examples
>>> from fomo import FomoClassifier
>>> from pmlb import pmlb
>>> X,y = pmlb.fetch_data('adult', return_X_y=True)
>>> groups = ['race','sex']
>>> est = FomoClassifier()
>>> est.fit(X,y, protected_features=groups)
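The constructor arguments above can also be set explicitly. A minimal sketch, assuming pymoo's NSGA2 accepts pop_size and that any documented metric callable (e.g. fomo.metrics.FNR) can be passed in fairness_metrics:
>>> from pymoo.algorithms.moo.nsga2 import NSGA2
>>> from sklearn.linear_model import LogisticRegression
>>> from fomo.metrics import FNR
>>> est = FomoClassifier(
...     estimator=LogisticRegression(),   # must accept sample_weight in fit()
...     fairness_metrics=[FNR],           # fairness objective(s)
...     algorithm=NSGA2(pop_size=50),     # multi-objective optimizer
...     store_final_models=True,          # keep the full set of models
... )
>>> est.fit(X, y, protected_features=groups)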
- fit(X, y, grouping='intersectional', abs_val=False, gamma=True, protected_features=None, Xp=None, **kwargs)[source]¶
Train the model.
Trains a population of self.estimator models with random sample weights, updates the sample weights using self.algorithm, and selects one model as best while also saving the full set of models.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y : array-like, shape (n_samples,)
The target values. An array of int.
- protected_features : list[str] | None, default=None
The columns of X to calculate fairness over. If specifying columns, do not also specify Xp.
- Xp : pandas DataFrame, shape (n_samples, n_protected_features), default=None
The protected feature values used to calculate fairness. If Xp is specified, protected_features must be None.
- **kwargs : passed to pymoo.optimize.minimize.
- Returns:
- self : object
Returns self.
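For instance, the two mutually exclusive ways to supply protected attributes look like this (a sketch assuming X is a pandas DataFrame with 'race' and 'sex' columns):
>>> est.fit(X, y, protected_features=['race', 'sex'])  # fairness over columns of X
>>> Xp = X[['race', 'sex']]
>>> est.fit(X, y, Xp=Xp)                               # or pass the values directly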
- predict(X)[source]¶
Predict labels from X.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The input samples.
- Returns:
- y : ndarray, shape (n_samples,)
The predicted class label for each sample.
- set_fit_request(*, Xp: bool | None | str = '$UNCHANGED$', abs_val: bool | None | str = '$UNCHANGED$', gamma: bool | None | str = '$UNCHANGED$', grouping: bool | None | str = '$UNCHANGED$', protected_features: bool | None | str = '$UNCHANGED$') FomoClassifier ¶
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- Xp : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the Xp parameter in fit.
- abs_val : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the abs_val parameter in fit.
- gamma : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the gamma parameter in fit.
- grouping : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the grouping parameter in fit.
- protected_features : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the protected_features parameter in fit.
- Returns:
- self : object
The updated object.
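As a sketch of when this request matters, assuming a recent scikit-learn (>= 1.4) where cross_validate accepts a params argument under metadata routing:
>>> import sklearn
>>> from sklearn.model_selection import cross_validate
>>> sklearn.set_config(enable_metadata_routing=True)
>>> est = FomoClassifier().set_fit_request(protected_features=True)
>>> # protected_features is now routed through to est.fit on every fold
>>> cross_validate(est, X, y, params={'protected_features': ['race', 'sex']})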
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FomoClassifier ¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the sample_weight parameter in score.
- Returns:
- self : object
The updated object.
- class fomo.FomoRegressor(estimator: ~sklearn.base.RegressorMixin = SGDRegressor(), fairness_metrics: list[str] = None, accuracy_metrics: list[str] = None, algorithm: str = 'NSGA2', random_state: int = None, verbose: bool = False, n_jobs: int = -1, store_final_models: bool = False, problem_type=<class 'fomo.problem.BasicProblem'>, checkpoint: bool = False, picking_strategy: str = 'PseudoWeights')[source]¶
Fomo class for regression models.
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained. The ML model must accept sample_weight as an argument to fit().
- fairness_metrics : list[Callable]
The fairness metrics to try to optimize during fitting.
- accuracy_metrics : list[Callable]
The accuracy metrics to try to optimize during fitting.
- algorithm : pymoo Algorithm
The multi-objective optimizer to use. Should be compatible with pymoo.core.algorithm.Algorithm.
- random_state : int | None
Random seed.
- verbose : bool
Whether to print progress.
- n_jobs : int
Number of parallel processes to use. Parallelizes evaluation.
- store_final_models : bool
If True, the final set of models will be stored in the estimator.
- problem_type : ElementwiseProblem
Determines the evaluation class to be used. Options:
- BasicProblem: loss function weights are directly optimized.
- MLPProblem: weights of a multilayer perceptron are optimized to estimate loss function weights.
- LinearProblem: weights of a logistic model are optimized to estimate loss function weights.
Examples
>>> from fomo import FomoRegressor
>>> from pmlb import pmlb
>>> X,y = pmlb.fetch_data('adult', return_X_y=True)
>>> groups = ['race','sex']
>>> est = FomoRegressor()
>>> est.fit(X,y, protected_features=groups)
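A sketch with a different base learner, assuming (per the estimator parameter above) that any sklearn regressor accepting sample_weight in fit() will work:
>>> from sklearn.linear_model import Ridge
>>> est = FomoRegressor(estimator=Ridge(), store_final_models=True)
>>> est.fit(X, y, protected_features=groups)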
- Attributes:
- n_features_ : int
The number of features of the data passed to fit().
- fit(X, y, protected_features=None, Xp=None, **kwargs)[source]¶
Train a set of regressors.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y : array-like, shape (n_samples,)
The target values. An array of real numbers.
- protected_features : list | None, default=None
The columns of X to calculate fairness over. If specifying columns, do not also specify Xp.
- Xp : pandas DataFrame, shape (n_samples, n_protected_features), default=None
The protected feature values used to calculate fairness. If Xp is specified, protected_features must be None.
- Returns:
- self : object
Returns self.
- predict(X)[source]¶
Predict outcome.
- Parameters:
- X : array-like, shape (n_samples, n_features)
The input samples.
- Returns:
- y : ndarray, shape (n_samples,)
The predicted target value for each sample.
- set_fit_request(*, Xp: bool | None | str = '$UNCHANGED$', protected_features: bool | None | str = '$UNCHANGED$') FomoRegressor ¶
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- Xp : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the Xp parameter in fit.
- protected_features : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the protected_features parameter in fit.
- Returns:
- self : object
The updated object.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FomoRegressor ¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the sample_weight parameter in score.
- Returns:
- self : object
The updated object.
Problems¶
- class fomo.problem.BasicProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]¶
The evaluation function for each candidate set of sample weights.
- Attributes:
- n_constr
Methods: bounds, do, evaluate, get_sample_weight, has_bounds, has_constraints, ideal_point, nadir_point, name, pareto_front, pareto_set
- class fomo.problem.InterLinearProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]¶
The evaluation function for each candidate weight vector.
- Attributes:
- n_constr
Methods: bounds, do, evaluate, get_sample_weight, has_bounds, has_constraints, ideal_point, nadir_point, name, pareto_front, pareto_set
- class fomo.problem.LinearProblem(fomo_estimator, metric_kwargs={}, **kwargs)[source]¶
The evaluation function for each candidate weight vector.
- Attributes:
- n_constr
Methods: bounds, do, evaluate, get_sample_weight, has_bounds, has_constraints, ideal_point, nadir_point, name, pareto_front, pareto_set
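Any of these problem classes can be passed as the problem_type argument of an estimator; a minimal sketch:
>>> from fomo import FomoClassifier
>>> from fomo.problem import LinearProblem
>>> est = FomoClassifier(problem_type=LinearProblem)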
Metrics¶
- fomo.metrics.FNR(y_true, y_pred)[source]¶
Returns False Negative Rate.
- Parameters:
- y_true : array-like, bool
True labels.
- y_pred : array-like, float or bool
Predicted labels. If y_pred contains floats, this is the “soft” false negative rate (i.e. the average probability estimate for the negative class over the truly positive samples).
- fomo.metrics.FPR(y_true, y_pred)[source]¶
Returns False Positive Rate.
- Parameters:
- y_true : array-like, bool
True labels.
- y_pred : array-like, float or bool
Predicted labels. If y_pred contains floats, this is the “soft” false positive rate (i.e. the average probability estimate for the positive class over the truly negative samples).
- fomo.metrics.TPR(y_true, y_pred)[source]¶
Returns True Positive Rate.
- Parameters:
- y_true : array-like, bool
True labels.
- y_pred : array-like, float or bool
Predicted labels. If y_pred contains floats, this is the “soft” true positive rate (i.e. the average probability estimate for the positive class over the truly positive samples).
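A worked sketch of the hard vs. soft behavior described above; the values follow from the definitions, while the exact array handling is an assumption:
>>> import numpy as np
>>> from fomo.metrics import FNR
>>> y_true = np.array([1, 1, 0, 0])
>>> FNR(y_true, np.array([0, 1, 0, 0]))          # hard: one of two positives missed, i.e. 0.5
>>> FNR(y_true, np.array([0.9, 0.2, 0.1, 0.3]))  # soft: mean of (1 - 0.9) and (1 - 0.2), i.e. 0.45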
- fomo.metrics.differential_calibration_loss(estimator, X, y_true, groups=None, X_protected=None, n_bins=None, bins=None, stratified_categories=None, alpha=0.0, gamma=0.0, rho=0.0)[source]¶
Return the differential calibration of estimator on groups.
- fomo.metrics.flex_loss(estimator, X, y_true, metric, **kwargs)[source]¶
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained.
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y_true : array-like, bool
True labels.
- metric : string or function
The loss function, e.g. FPR or FNR.
- flag : bool
If flag = 1, marginal grouping is used; if flag = 0, intersectional grouping.
- Returns:
- fn : the overall loss over all samples
- fng : the loss over each protected group in the training data
- samples_fnr : the false negative rate of every sample in the training data
- gp_lens : the size of each protected group
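A usage sketch, assuming a fitted estimator est and that any grouping options are supplied via **kwargs:
>>> from fomo.metrics import flex_loss, FNR
>>> fn, fng, samples_fnr, gp_lens = flex_loss(est, X, y, metric=FNR)
>>> fn    # overall loss
>>> fng   # one loss value per protected group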
- fomo.metrics.mce(estimator, X, y_true, num_bins=10)[source]¶
The metric to use if fairness is calibration-based. Returns maximum calibration error among the bins.
- Parameters:
- estimator : sklearn-like estimator
The underlying ML model to be trained.
- X : array-like, shape (n_samples, n_features)
The training input samples.
- y_true : array-like, bool
True labels.
- num_bins : int
Number of bins that the predictions are sorted and partitioned into.
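A usage sketch, assuming est is a fitted probabilistic classifier:
>>> from fomo.metrics import mce
>>> mce(est, X, y, num_bins=10)  # maximum calibration error across 10 bins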
- fomo.metrics.multicalibration_loss(estimator, X, y_true, groups=None, X_protected=None, grouping='intersectional', n_bins=None, bins=None, categories=None, proportional=False, alpha=0.01, gamma=0.01, rho=0.1, **kwargs)[source]¶
Custom scoring function for multicalibration. Calculates the current loss in terms of (proportional) multicalibration.
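As with the other metrics, this can serve as a fairness objective; a sketch assuming the metric's extra arguments are acceptable at their defaults:
>>> from fomo import FomoClassifier
>>> from fomo.metrics import multicalibration_loss
>>> est = FomoClassifier(fairness_metrics=[multicalibration_loss])
>>> est.fit(X, y, protected_features=['race', 'sex'])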