Python API
- class feat.Feat(pop_size=100, gens=100, ml='LinearRidgeRegression', classification=False, verbosity=0, max_stall=0, sel='lexicase', surv='nsga2', cross_rate=0.5, root_xo_rate=0.5, otype='a', functions=['+', '-', '*', '/', '^2', '^3', 'sqrt', 'sin', 'cos', 'exp', 'log', '^', 'logit', 'tanh', 'gauss', 'relu', 'split', 'split_c', 'b2f', 'c2f', 'and', 'or', 'not', 'xor', '=', '<', '<=', '>', '>=', 'if', 'ite'], max_depth=3, max_dim=10, random_state=0, erc=False, objectives=['fitness', 'complexity'], shuffle=True, split=0.75, fb=0.5, scorer='', feature_names='', backprop=False, iters=10, lr=0.1, batch_size=0, n_jobs=1, hillclimb=False, logfile='', max_time=-1, residual_xo=False, stagewise_xo=False, stagewise_xo_tol=False, softmax_norm=False, save_pop=0, normalize=True, val_from_arch=True, corr_delete_mutate=False, simplify=0.0, protected_groups='', tune_initial=False, tune_final=True, starting_pop='')
-
Feature Engineering Automation Tool
- Parameters :
-
-
pop_size ( int , optional ( default: 100 ) ) – Size of the population of models
-
gens ( int , optional ( default: 100 ) ) – Number of iterations to train for
-
ml ( str , optional ( default: "LinearRidgeRegression" ) ) – ML method to pair with the engineered features. Choices: LinearRidgeRegression, Lasso, L1_LR, L2_LR. FeatRegressor sets this to “LinearRidgeRegression”; FeatClassifier sets it to L2-penalized LR (“LR”).
-
classification ( boolean or None , optional ( default: False ) ) – Whether to do classification (True) or regression (False). Set explicitly in FeatRegressor and FeatClassifier accordingly.
-
verbosity ( int , optional ( default: 0 ) ) – How much to print out (0, 1, 2)
-
max_stall ( int , optional ( default: 0 ) ) – How many generations to continue after the validation loss has stalled. If 0, not used.
-
sel ( str , optional ( default: "lexicase" ) ) – Selection algorithm to use.
-
surv ( str , optional ( default: "nsga2" ) ) – Survival algorithm to use.
-
cross_rate ( float , optional ( default: 0.5 ) ) – How often to do crossover for variation versus mutation.
-
root_xo_rate ( float , optional ( default: 0.5 ) ) – When performing crossover, how often to choose from the roots of the trees, rather than within the tree. Root crossover essentially swaps features in models.
-
otype ( string , optional ( default: 'a' ) ) – Feature output types: ‘a’: all; ‘b’: boolean only; ‘f’: floating point only.
-
functions ( list [ string ] , optional ( default: all available functions ) ) – A list of operators to use to build features. If functions=[], all the available functions are used. Options: +, -, *, /, ^2, ^3, sqrt, sin, cos, exp, log, ^, logit, tanh, gauss, relu, split, split_c, b2f, c2f, and, or, not, xor, =, <, <=, >, >=, if, ite
-
max_depth ( int , optional ( default: 3 ) ) – Maximum depth of a feature’s tree representation.
-
max_dim ( int , optional ( default: 10 ) ) – Maximum dimension of a model. The dimension of a model is how many independent features it has. Controls the number of trees in each individual.
-
random_state ( int , optional ( default: 0 ) ) – Random seed. If -1, a random seed is chosen.
-
erc ( boolean , optional ( default: False ) ) – If true, ephemeral random constants are included as nodes in trees.
-
objectives ( list [ str ] , optional ( default: ['fitness', 'complexity'] ) ) – Objectives to use for multi-objective optimization.
-
shuffle ( boolean , optional ( default: True ) ) – Whether to shuffle the training data before beginning training.
-
split ( float , optional ( default: 0.75 ) ) – Fraction of the training data used internally for training. The remaining 1 - split fraction of the data forms the validation fold.
-
fb ( float , optional ( default: 0.5 ) ) – Controls the amount of feedback from the ML weights used during variation. Higher values make variation less random.
-
scorer ( str , optional ( default: '' ) ) – Scoring function to use internally.
-
feature_names ( str , optional ( default: '' ) ) – Optionally provide comma-separated feature names. The number of names should equal the number of features in your data. This is set automatically if a Pandas dataframe is passed to fit().
-
backprop ( boolean , optional ( default: False ) ) – Perform gradient descent on feature weights using backpropagation.
-
iters ( int , optional ( default: 10 ) ) – Controls the number of iterations of backprop as well as hillclimbing for learning weights.
-
lr ( float , optional ( default: 0.1 ) ) – Learning rate used for gradient descent. This is the initial rate, which is scheduled to decrease exponentially over generations.
-
batch_size ( int , optional ( default: 0 ) ) – Number of samples to train on each generation. 0 means train on all the samples.
-
n_jobs ( int , optional ( default: 1 ) ) – Number of parallel threads to use. If 0, this will be automatically determined by OMP.
-
hillclimb ( boolean , optional ( default: False ) ) – Applies stochastic hillclimbing to feature weights.
-
logfile ( str , optional ( default: "" ) ) – If specified, writes run statistics to this log file. “” means don’t log.
-
max_time ( int , optional ( default: -1 ) ) – Maximum run time in seconds, used as a termination criterion. If -1, not used.
-
residual_xo ( boolean , optional ( default: False ) ) – Use residual crossover.
-
stagewise_xo ( boolean , optional ( default: False ) ) – Use stagewise crossover.
-
stagewise_xo_tol ( boolean , optional ( default: False ) ) – Terminates stagewise crossover based on an error value rather than dimensionality.
-
softmax_norm ( boolean , optional ( default: False ) ) – Uses softmax normalization of probabilities of variation across the features.
-
save_pop ( int , optional ( default: 0 ) ) – Saves the population of models. 0: don’t save; 1: save final population; 2: save every generation.
-
normalize ( boolean , optional ( default: True ) ) – Normalizes the floating point input variables using z-scores.
-
val_from_arch ( boolean , optional ( default: True ) ) – Validates the final model using the archive rather than the whole population.
-
corr_delete_mutate ( boolean , optional ( default: False ) ) – Replaces root deletion mutation with a deterministic deletion operator that deletes the feature with highest collinearity.
-
simplify ( float , optional ( default: 0 ) ) – Runs post-run simplification to try to shrink the final model without changing its output more than the simplify tolerance. This tolerance is the norm of the difference in outputs, divided by the norm of the output. If simplify=0, it is ignored.
-
protected_groups ( list , optional ( default: [ ] ) ) – Defines protected attributes in the data. Used for adding fairness constraints.
-
tune_initial ( boolean , optional ( default: False ) ) – Tune the initial linear model’s penalization parameter.
-
tune_final ( boolean , optional ( default: True ) ) – Tune the final linear model’s penalization parameter.
-
starting_pop ( str , optional ( default: "" ) ) – A starting population, provided in JSON format.
-
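Three of the parameter descriptions above (split, normalize, and simplify) describe simple arithmetic. The following is a minimal pure-Python sketch of that arithmetic, not Feat's internal code. The helper names are hypothetical, and several details are assumptions: rounding of the training-set size, use of the population standard deviation for z-scores, the Euclidean norm, and dividing by the norm of the original (unsimplified) output.

```python
import math
import statistics


def split_sizes(n_samples, split=0.75):
    """Return (n_train, n_validation) for an internal split fraction.

    Assumption: the training size is rounded to the nearest integer.
    """
    n_train = int(round(n_samples * split))
    return n_train, n_samples - n_train


def z_scores(values):
    """Standardize one column: subtract the mean, divide by the std deviation.

    Assumption: population standard deviation; constant columns map to zeros.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def relative_change(original, simplified):
    """Norm of the output difference divided by the norm of the original output.

    Assumption: Euclidean norm; `original` is the unsimplified model's output.
    """
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, simplified)))
    base = math.sqrt(sum(a ** 2 for a in original))
    return diff / base
```

Under this reading, a simplification step would be accepted only while relative_change(...) stays at or below the simplify tolerance.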
- fit_transform(X, y, Z=None)
-
Convenience method that runs fit(X, y) and then transform(X).
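The fit-then-transform pattern can be illustrated with a toy stand-in. ToyTransformer below is hypothetical and is not Feat: it only centers columns, and it omits the optional Z argument. It is meant only to show that fit_transform gives the same result as calling fit and then transform on the same X.

```python
class ToyTransformer:
    """Hypothetical stand-in with fit/transform; not Feat itself."""

    def fit(self, X, y):
        # "Learn" one offset per column: the column mean. y is unused here.
        n_rows = len(X)
        n_cols = len(X[0])
        self.means_ = [sum(row[j] for row in X) / n_rows for j in range(n_cols)]
        return self

    def transform(self, X):
        # Apply the learned offsets to every row.
        return [[v - m for v, m in zip(row, self.means_)] for row in X]

    def fit_transform(self, X, y):
        # Convenience: fit on (X, y), then transform the same X.
        return self.fit(X, y).transform(X)


X = [[1.0, 10.0], [3.0, 20.0]]
y = [0, 1]
one_step = ToyTransformer().fit_transform(X, y)
two_step = ToyTransformer().fit(X, y).transform(X)
```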
- predict_archive(X, Z=None, front=False)
-
Returns a list of dictionary predictions for all models.
- score(X, y, Z=None)
-
Returns a score for the predictions of Feat on X versus the true labels y.
- set_fit_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat
-
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters :
-
Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for the Z parameter in fit.
- Returns :
-
self – The updated object.
- Return type :
-
object
-
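The four request options described above (True, False, None, or a string alias) can be modeled in a few lines. This is a pure-Python sketch of the documented semantics only, not scikit-learn's routing implementation. The function name resolve_request and the use of ValueError are assumptions made for illustration.

```python
def resolve_request(request, name, provided):
    """Decide what a meta-estimator would forward to a sub-estimator method.

    request  -- True, False, None, or a str alias (the value given to set_*_request)
    name     -- the parameter name on the sub-estimator, e.g. "Z"
    provided -- dict of metadata the user passed to the meta-estimator
    """
    if request is True:
        # Requested: forward it if present, silently skip it otherwise.
        return {name: provided[name]} if name in provided else {}
    if request is False:
        # Not requested: never forwarded.
        return {}
    if request is None:
        # Not requested, and providing it is treated as an error.
        if name in provided:
            raise ValueError(f"{name!r} was provided but not requested")
        return {}
    if isinstance(request, str):
        # Alias: the user supplies it under `request`, the method receives `name`.
        return {name: provided[request]} if request in provided else {}
    raise TypeError("request must be True, False, None, or a str alias")
```

For example, with the alias "groups", metadata supplied to the meta-estimator as groups=... would reach the sub-estimator as Z=... in this model.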
- set_predict_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat
-
Request metadata passed to the predict method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters :
-
Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for the Z parameter in predict.
- Returns :
-
self – The updated object.
- Return type :
-
object
-
- set_score_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat
-
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters :
-
Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for the Z parameter in score.
- Returns :
-
self – The updated object.
- Return type :
-
object
-
- set_transform_request(*, Z: bool | None | str = '$UNCHANGED$') → Feat
-
Request metadata passed to the transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters :
-
Z ( str , True , False , or None , default=sklearn.utils.metadata_routing.UNCHANGED ) – Metadata routing for the Z parameter in transform.
- Returns :
-
self – The updated object.
- Return type :
-
object
-