Python API

class feat.Feat(pop_size=100, gens=100, ml='LinearRidgeRegression', classification=False, verbosity=0, max_stall=0, sel='lexicase', surv='nsga2', cross_rate=0.5, root_xo_rate=0.5, otype='a', functions=['+', '-', '*', '/', '^2', '^3', 'sqrt', 'sin', 'cos', 'exp', 'log', '^', 'logit', 'tanh', 'gauss', 'relu', 'split', 'split_c', 'b2f', 'c2f', 'and', 'or', 'not', 'xor', '=', '<', '<=', '>', '>=', 'if', 'ite'], max_depth=3, max_dim=10, random_state=0, erc=False, objectives=['fitness', 'complexity'], shuffle=True, split=0.75, fb=0.5, scorer='', feature_names='', backprop=False, iters=10, lr=0.1, batch_size=0, n_jobs=1, hillclimb=False, logfile='', max_time=-1, residual_xo=False, stagewise_xo=False, stagewise_xo_tol=False, softmax_norm=False, save_pop=0, normalize=True, val_from_arch=True, corr_delete_mutate=False, simplify=0.0, protected_groups='', tune_initial=False, tune_final=True, starting_pop='')

Feature Engineering Automation Tool
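As a rough illustration of the constructor, the sketch below collects a few keyword arguments from the signature and builds an estimator only if the feat package can be imported. The chosen values are arbitrary examples, not recommendations, and the import guard is an assumption made so the snippet also runs where feat is not installed.

```python
# Illustrative keyword arguments; the names come from the constructor
# signature, the values are arbitrary examples.
params = dict(
    pop_size=200,
    gens=50,
    ml="LinearRidgeRegression",
    functions=["+", "-", "*", "/", "sqrt", "log"],
    max_depth=3,
    max_dim=10,
    random_state=42,
    max_time=60,
    verbosity=1,
)

try:
    from feat import Feat  # assumes the feat package is installed
except ImportError:
    Feat = None  # fall back so the sketch still runs without feat

# Build the estimator only when the import succeeded.
est = Feat(**params) if Feat is not None else None
```

When feat is available, est can then be passed to fit(X, y) like any other estimator on this page.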

Parameters :
  • pop_size ( int , optional ( default: 100 ) ) – Size of the population of models

  • gens ( int , optional ( default: 100 ) ) – Number of iterations to train for

  • ml ( str , optional ( default: "LinearRidgeRegression" ) ) – ML pairing. Choices: LinearRidgeRegression, Lasso, L1_LR, L2_LR. FeatRegressor sets this to “LinearRidgeRegression”; FeatClassifier sets it to L2-penalized logistic regression (“LR”).

  • classification ( boolean , optional ( default: False ) ) – Whether to do classification or regression. Set explicitly in FeatRegressor and FeatClassifier accordingly.

  • verbosity ( int , optional ( default: 0 ) ) – How much to print out (0, 1, 2)

  • max_stall ( int , optional ( default: 0 ) ) – How many generations to continue after the validation loss has stalled. If 0, not used.

  • sel ( str , optional ( default: "lexicase" ) ) – Selection algorithm to use.

  • surv ( str , optional ( default: "nsga2" ) ) – Survival algorithm to use.

  • cross_rate ( float , optional ( default: 0.5 ) ) – How often to do crossover for variation versus mutation.

  • root_xo_rate ( float , optional ( default: 0.5 ) ) – When performing crossover, how often to choose from the roots of the trees, rather than within the tree. Root crossover essentially swaps features in models.

  • otype ( string , optional ( default: 'a' ) ) – Feature output types: ‘a’: all; ‘b’: boolean only; ‘f’: floating point only.

  • functions ( list [ string ] , optional ( default: [ ] ) ) – A list of operators to use to build features. If functions=[], all the available functions are used. Options: +, -, *, /, ^2, ^3, sqrt, sin, cos, exp, log, ^, logit, tanh, gauss, relu, split, split_c, b2f, c2f, and, or, not, xor, =, <, <=, >, >=, if, ite

  • max_depth ( int , optional ( default: 3 ) ) – Maximum depth of a feature’s tree representation.

  • max_dim ( int , optional ( default: 10 ) ) – Maximum dimension of a model. The dimension of a model is how many independent features it has. Controls the number of trees in each individual.

  • random_state ( int , optional ( default: 0 ) ) – Random seed. If -1, a seed is chosen at random.

  • erc ( boolean , optional ( default: False ) ) – If true, ephemeral random constants are included as nodes in trees.

  • objectives ( list [ str ] , optional ( default: ['fitness', 'complexity'] ) ) – Objectives to use for multi-objective optimization.

  • shuffle ( boolean , optional ( default: True ) ) – Whether to shuffle the training data before beginning training.

  • split ( float , optional ( default: 0.75 ) ) – Fraction of the training data used internally for training. The remaining fraction, 1 - split, forms the validation fold.

  • fb ( float , optional ( default: 0.5 ) ) – Controls the amount of feedback from the ML weights used during variation. Higher values make variation less random.

  • scorer ( str , optional ( default: '' ) ) – Scoring function to use internally.

  • feature_names ( str , optional ( default: '' ) ) – Optionally provide comma-separated feature names. The number of names should equal the number of features in your data. This will be set automatically if a Pandas dataframe is passed to fit().

  • backprop ( boolean , optional ( default: False ) ) – Perform gradient descent on feature weights using backpropagation.

  • iters ( int , optional ( default: 10 ) ) – Controls the number of iterations of backprop as well as hillclimbing for learning weights.

  • lr ( float , optional ( default: 0.1 ) ) – Learning rate used for gradient descent. This is the initial rate, which is scheduled to decrease exponentially with generations.

  • batch_size ( int , optional ( default: 0 ) ) – Number of samples to train on each generation. 0 means train on all the samples.

  • n_jobs ( int , optional ( default: 1 ) ) – Number of parallel threads to use. If 0, this will be automatically determined by OMP.

  • hillclimb ( boolean , optional ( default: False ) ) – Applies stochastic hillclimbing to feature weights.

  • logfile ( str , optional ( default: "" ) ) – If specified, writes statistics to a log file. “” means don’t log.

  • max_time ( int , optional ( default: -1 ) ) – Maximum time termination criterion, in seconds. If -1, not used.

  • residual_xo ( boolean , optional ( default: False ) ) – Use residual crossover.

  • stagewise_xo ( boolean , optional ( default: False ) ) – Use stagewise crossover.

  • stagewise_xo_tol ( boolean , optional ( default: False ) ) – Terminates stagewise crossover based on an error value rather than dimensionality.

  • softmax_norm ( boolean , optional ( default: False ) ) – Uses softmax normalization of probabilities of variation across the features.

  • save_pop ( int , optional ( default: 0 ) ) – Saves the population of models. 0: don’t save; 1: save final population; 2: save every generation.

  • normalize ( boolean , optional ( default: True ) ) – Normalizes the floating point input variables using z-scores.

  • val_from_arch ( boolean , optional ( default: True ) ) – Validates the final model using the archive rather than the whole population.

  • corr_delete_mutate ( boolean , optional ( default: False ) ) – Replaces root deletion mutation with a deterministic deletion operator that deletes the feature with highest collinearity.

  • simplify ( float , optional ( default: 0 ) ) – Runs post-run simplification to try to shrink the final model without changing its output more than the simplify tolerance. This tolerance is the norm of the difference in outputs, divided by the norm of the output. If simplify=0, it is ignored.

  • protected_groups ( list , optional ( default: [ ] ) ) – Defines protected attributes in the data. Used for adding fairness constraints.

  • tune_initial ( boolean , optional ( default: False ) ) – Tune the initial linear model’s penalization parameter.

  • tune_final ( boolean , optional ( default: True ) ) – Tune the final linear model’s penalization parameter.

  • starting_pop ( str , optional ( default: "" ) ) – Provide a starting population in JSON format.
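A few of the parameters above describe simple numeric operations. The sketch below illustrates three of them in plain Python: the split fraction, z-score normalization (normalize), and the relative output change that simplify is compared against. The helper names are hypothetical, and the choices of an L2 norm and a population standard deviation are assumptions; the parameter descriptions do not say which variants Feat uses internally.

```python
import math
import random
import statistics


def train_val_split(rows, split=0.75, shuffle=True, seed=0):
    """Hypothetical helper: keep a `split` fraction for training, rest for validation."""
    rows = list(rows)
    if shuffle:
        random.Random(seed).shuffle(rows)
    cut = int(len(rows) * split)
    return rows[:cut], rows[cut:]


def zscore(column):
    """Hypothetical helper: z-score a column (population std is an assumption)."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


def relative_change(original, simplified):
    """Norm of the output difference divided by the norm of the output (L2 assumed)."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, simplified)))
    base = math.sqrt(sum(a ** 2 for a in original))
    return diff / base


train, val = train_val_split(range(8), split=0.75)
scaled = zscore([1.0, 2.0, 3.0])
change = relative_change([3.0, 4.0], [3.0, 4.5])
```

Under this reading, a simplification step would be accepted only while the relative change stays at or below the simplify tolerance.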

fit(X, y, Z=None)

Fit a model.

fit_predict(X, y, Z=None)

Convenience method that runs fit(X,y) then predict(X)

fit_transform(X, y, Z=None)

Convenience method that runs fit(X,y) then transform(X)
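The two convenience methods follow the usual fit-then-apply pattern. The toy class below is a hypothetical stand-in, not the Feat implementation, and only shows how the calls compose. Forwarding Z to both steps is an assumption, since the descriptions mention only X and y.

```python
class ToyEstimator:
    """Hypothetical stand-in used only to show how the convenience methods compose."""

    def fit(self, X, y, Z=None):
        # "Learn" the means of X and y so the later steps have some state.
        self.x_mean_ = sum(X) / len(X)
        self.y_mean_ = sum(y) / len(y)
        return self

    def transform(self, X, Z=None):
        # Center X using the mean learned in fit.
        return [x - self.x_mean_ for x in X]

    def predict(self, X, Z=None):
        # Predict the training mean of y for every sample.
        return [self.y_mean_ for _ in X]

    def fit_predict(self, X, y, Z=None):
        # fit(X, y) followed by predict(X)
        self.fit(X, y, Z)
        return self.predict(X, Z)

    def fit_transform(self, X, y, Z=None):
        # fit(X, y) followed by transform(X)
        self.fit(X, y, Z)
        return self.transform(X, Z)


X = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
transformed = ToyEstimator().fit_transform(X, y)
predicted = ToyEstimator().fit_predict(X, y)
```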

load(filename)

Load a saved Feat state from file.

predict(X, Z=None)

Predict on X.

predict_archive(X, Z=None, front=False)

Returns a list of dictionary predictions for all models.

save(filename)

Save a Feat state to file.

score(X, y, Z=None)

Returns a score for the predictions of Feat on X versus true labels y

set_fit_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config() ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True : metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False : metadata is not requested and the meta-estimator will not pass it to fit .

  • None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.
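The four request options can be pictured with a small toy resolver. This is only a model of the rules listed above, not scikit-learn's routing code: route_metadata is a hypothetical function, and ValueError stands in for whatever error the real meta-estimator raises.

```python
def route_metadata(request, name, provided):
    """Toy model: decide which keyword arguments reach the sub-estimator's method.

    request  -- True, False, None, or an alias string
    name     -- the parameter name on the sub-estimator (for example "Z")
    provided -- metadata the user passed to the meta-estimator
    """
    if request is True:
        # Requested: pass it along if the user supplied it, otherwise ignore.
        return {name: provided[name]} if name in provided else {}
    if request is False:
        # Not requested: never passed on.
        return {}
    if request is None:
        # Not requested: supplying it anyway is treated as an error.
        if name in provided:
            raise ValueError(f"{name!r} was provided but not requested")
        return {}
    if isinstance(request, str):
        # Alias: the user supplies the value under the alias name instead.
        return {name: provided[request]} if request in provided else {}
    raise TypeError("request must be True, False, None, or a str alias")


passed_true = route_metadata(True, "Z", {"Z": [1, 2]})
passed_false = route_metadata(False, "Z", {"Z": [1, 2]})
passed_alias = route_metadata("long_data", "Z", {"long_data": [3, 4]})
```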

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline . Otherwise it has no effect.

Parameters :

Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for Z parameter in fit .

Returns :

self – The updated object.

Return type :

object

set_predict_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config() ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True : metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False : metadata is not requested and the meta-estimator will not pass it to predict .

  • None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline . Otherwise it has no effect.

Parameters :

Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for Z parameter in predict .

Returns :

self – The updated object.

Return type :

object

set_score_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config() ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True : metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False : metadata is not requested and the meta-estimator will not pass it to score .

  • None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline . Otherwise it has no effect.

Parameters :

Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for Z parameter in score .

Returns :

self – The updated object.

Return type :

object

set_transform_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat

Request metadata passed to the transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config() ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True : metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False : metadata is not requested and the meta-estimator will not pass it to transform .

  • None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline . Otherwise it has no effect.

Parameters :

Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for Z parameter in transform .

Returns :

self – The updated object.

Return type :

object

transform(X, Z=None)

Return the representation’s transformation of X