Brush C++ API
A flexible interpretable machine learning framework
Loading...
Searching...
No Matches
Brush::SearchSpace Struct Reference

Holds a search space, consisting of operations and terminals and functions, and methods to sample that space to create programs. More...

#include <search_space.h>

Collaboration diagram for Brush::SearchSpace:

Public Types

using ArgsHash = std::size_t
 
template<typename T >
using Map
 

Public Member Functions

template<typename PT >
PT make_program (const Parameters &params, int max_d=0, int max_size=0)
 Makes a random program.
 
RegressorProgram make_regressor (int max_d=0, int max_size=0, const Parameters &params=Parameters())
 Makes a random regressor program. Convenience wrapper for make_program.
 
ClassifierProgram make_classifier (int max_d=0, int max_size=0, const Parameters &params=Parameters())
 Makes a random classifier program. Convenience wrapper for make_program.
 
MulticlassClassifierProgram make_multiclass_classifier (int max_d=0, int max_size=0, const Parameters &params=Parameters())
 Makes a random multiclass classifier program. Convenience wrapper for make_program.
 
RepresenterProgram make_representer (int max_d=0, int max_size=0, const Parameters &params=Parameters())
 Makes a random representer program. Convenience wrapper for make_program.
 
 SearchSpace ()=default
 
 SearchSpace (const Dataset &d, const unordered_map< string, float > &user_ops={}, bool weights_init=true)
 Construct a search space.
 
void init (const Dataset &d, const unordered_map< string, float > &user_ops={}, bool weights_init=true)
 Called by the constructor to initialize the search space.
 
bool check (DataType R) const
 check if a return type is in the node map
 
bool check (DataType R, size_t sig_hash) const
 check if a function signature is in the search space
 
bool check (DataType R, size_t sig_hash, NodeType type) const
 check if a typed Node is in the search space
 
template<typename Iter >
bool has_solution_space (Iter start, Iter end) const
 Takes iterators to weight vectors and checks if they have a non-empty solution space. An empty solution space is defined as having no non-zero, positive values.
 
template<typename F >
Node get (const string &name)
 
Node get (NodeType type, DataType R, size_t sig_hash)
 get a typed node
 
template<typename S >
Node get (NodeType type, DataType R, S sig)
 get a typed node.
 
vector< floatget_weights () const
 get weights of the return types
 
vector< floatget_weights (DataType ret) const
 get weights of the argument types matching return type ret.
 
vector< floatget_weights (DataType ret, ArgsHash sig_hash) const
 get the weights of nodes matching a signature.
 
std::optional< Nodesample_terminal (bool force_return=false) const
 Get a random terminal.
 
std::optional< Nodesample_terminal (DataType R, bool force_return=false) const
 Get a random terminal with return type R
 
std::optional< Nodesample_op (DataType ret) const
 get an operator matching return type ret.
 
std::optional< Nodesample_op (NodeType type, DataType R)
 Get a specific node type that matches a return value.
 
std::optional< Nodesample_op_with_arg (DataType ret, DataType arg, bool terminal_compatible=true, int max_args=0) const
 get operator with at least one argument matching arg
 
std::optional< Nodeget_node_like (Node node) const
 get a node with a signature matching node
 
std::optional< tree< Node > > sample_subtree (Node root, int max_d, int max_size) const
 create a subtree with maximum size and depth restrictions and root of type root_type
 
void print () const
 prints the search space map.
 
template<typename P >
P make_program (const Parameters &params, int max_d, int max_size)
 

Public Attributes

Map< Nodenode_map
 Maps return types to argument types to node types.
 
Map< floatnode_map_weights
 A map of weights corresponding to elements in node_map, used to weight probabilities of each node being sampled from the map.
 
unordered_map< DataType, vector< Node > > terminal_map
 Maps return types to terminals.
 
unordered_map< DataType, vector< float > > terminal_weights
 A map of weights corresponding to elements in terminal_map, used to weight probabilities of each terminal being sampled from the map.
 
vector< DataTypeterminal_types
 A vector storing the available return types of terminals.
 

Private Member Functions

tree< Node > & PTC2 (tree< Node > &Tree, tree< Node >::iterator root, int max_d, int max_size) const
 
template<NodeType NT, typename S >
constexpr void AddNode (const unordered_map< string, float > &user_ops, const vector< DataType > &unique_data_types)
 
template<NodeType NT, typename Sigs , std::size_t... Is>
constexpr void AddNodes (const unordered_map< string, float > &user_ops, const vector< DataType > &unique_data_types, std::index_sequence< Is... >)
 
template<NodeType NT>
void MakeNodes (const unordered_map< string, float > &user_ops, const vector< DataType > &unique_data_types)
 
template<std::size_t... Is>
void GenerateNodeMap (const unordered_map< string, float > &user_ops, const vector< DataType > &unique_data_types, std::index_sequence< Is... >)
 

Static Private Member Functions

template<NodeType NT, typename S >
requires (!is_in_v<NT, NodeType::Terminal, NodeType::Constant, NodeType::MeanLabel>)
static constexpr std::optional< NodeCreateNode (const auto &unique_data_types, bool use_all, bool weighted)
 

Detailed Description

Holds a search space, consisting of operations and terminals and functions, and methods to sample that space to create programs.

The set of operators is a user controlled parameter; however, we can automate, to some extent, the set of possible operators based on the data types in the problem. Constraints on operators based on data types:

  • only user specified operators are included.
  • operators whose arguments are covered by terminal types are included first. Then, a second pass includes any operators whose arguments are covered by terminal_types + return types of the current set of operators. One could imagine this continuing ad infinitum, but we just do two passes for simplicity.
  • assertion check to make sure there is at least one operator that returns the output type of the model.

When sampling in the search space (using any of the sampling functions sample_op or sample_terminal), some methods can fail to return a value — given a specific set of parameters to a function, the candidate solutions set may be empty — and, for these methods, the return type is either a valid value, or a std::nullopt. This is controlled wrapping the return type with std::optional.

Parameters

Definition at line 83 of file search_space.h.

Member Typedef Documentation

◆ ArgsHash

Definition at line 85 of file search_space.h.

◆ Map

Initial value:
unordered_map<DataType,
unordered_map<ArgsHash,
unordered_map<NodeType,
T>>>
NodeType
Definition nodetype.h:31
DataType
data types.
Definition types.h:143
std::size_t ArgsHash

Definition at line 88 of file search_space.h.

Constructor & Destructor Documentation

◆ SearchSpace() [1/2]

Brush::SearchSpace::SearchSpace ( )
default

◆ SearchSpace() [2/2]

Brush::SearchSpace::SearchSpace ( const Dataset & d,
const unordered_map< string, float > & user_ops = {},
bool weights_init = true )
inline

Construct a search space.

Parameters
dA dataset containing terminal definitions
user_opsOptional user-provided dictionary of operators with their probability of being chosen
weights_initwhether the terminal prob_change should be estimated from correlations with the target value

Definition at line 181 of file search_space.h.

Member Function Documentation

◆ AddNode()

template<NodeType NT, typename S >
constexpr void Brush::SearchSpace::AddNode ( const unordered_map< string, float > & user_ops,
const vector< DataType > & unique_data_types )
inlineconstexprprivate

Definition at line 604 of file search_space.h.

Here is the call graph for this function:

◆ AddNodes()

template<NodeType NT, typename Sigs , std::size_t... Is>
constexpr void Brush::SearchSpace::AddNodes ( const unordered_map< string, float > & user_ops,
const vector< DataType > & unique_data_types,
std::index_sequence< Is... >  )
inlineconstexprprivate

Definition at line 628 of file search_space.h.

Here is the call graph for this function:

◆ check() [1/3]

bool Brush::SearchSpace::check ( DataType R) const
inline

check if a return type is in the node map

Parameters
Rdata type
Returns
true if it exists

Definition at line 194 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ check() [2/3]

bool Brush::SearchSpace::check ( DataType R,
size_t sig_hash ) const
inline

check if a function signature is in the search space

Parameters
Rreturn type
sig_hashsignature hash
Returns
true if it exists

Definition at line 206 of file search_space.h.

Here is the call graph for this function:

◆ check() [3/3]

bool Brush::SearchSpace::check ( DataType R,
size_t sig_hash,
NodeType type ) const
inline

check if a typed Node is in the search space

Parameters
Rreturn type
sig_hashsignature hash
typethe node type
Returns
true if it exists

Definition at line 222 of file search_space.h.

Here is the call graph for this function:

◆ CreateNode()

static constexpr std::optional< Node > Brush::SearchSpace::CreateNode ( const auto & unique_data_types,
bool use_all,
bool weighted )
inlinestaticconstexprprivate

Definition at line 586 of file search_space.h.

Here is the call graph for this function:

◆ GenerateNodeMap()

template<std::size_t... Is>
void Brush::SearchSpace::GenerateNodeMap ( const unordered_map< string, float > & user_ops,
const vector< DataType > & unique_data_types,
std::index_sequence< Is... >  )
inlineprivate

Definition at line 659 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ get() [1/3]

template<typename F >
Node Brush::SearchSpace::get ( const string & name)
Here is the caller graph for this function:

◆ get() [2/3]

template<typename S >
Node Brush::SearchSpace::get ( NodeType type,
DataType R,
S sig )
inline

get a typed node.

Template Parameters
Sthe signature of the node, inferred.
Parameters
typethe node type
Rthe return type of the node
sigthe signature of the node
Returns
the matching Node

Definition at line 265 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ get() [3/3]

Node Brush::SearchSpace::get ( NodeType type,
DataType R,
size_t sig_hash )
inline

get a typed node

Parameters
typethe node type
Rthe return type of the node
sig_hashthe signature hash of the node
Returns
the matching Node

Definition at line 252 of file search_space.h.

Here is the call graph for this function:

◆ get_node_like()

std::optional< Node > Brush::SearchSpace::get_node_like ( Node node) const
inline

get a node with a signature matching node

Parameters
nodethe node to match
Returns
std::optional that may contain a Node

Definition at line 550 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ get_weights() [1/3]

vector< float > Brush::SearchSpace::get_weights ( ) const
inline

get weights of the return types

Returns
a weight vector, each element corresponding to a return type.

Definition at line 269 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ get_weights() [2/3]

vector< float > Brush::SearchSpace::get_weights ( DataType ret) const
inline

get weights of the argument types matching return type ret.

Parameters
retreturn type
Returns
a weight vector, each element corresponding to an args type.

Definition at line 289 of file search_space.h.

Here is the call graph for this function:

◆ get_weights() [3/3]

vector< float > Brush::SearchSpace::get_weights ( DataType ret,
ArgsHash sig_hash ) const
inline

get the weights of nodes matching a signature.

Parameters
retreturn type
sig_hashsignature hash
Returns
a weight vector, each element corresponding to a node.

Definition at line 308 of file search_space.h.

Here is the call graph for this function:

◆ has_solution_space()

template<typename Iter >
bool Brush::SearchSpace::has_solution_space ( Iter start,
Iter end ) const
inline

Takes iterators to weight vectors and checks if they have a non-empty solution space. An empty solution space is defined as having no non-zero, positive values.

Template Parameters
Ttype of iterator.
Parameters
startStart iterator
endEnd iterator
Returns
true if at least one weight is positive

Definition at line 241 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ init()

void Brush::SearchSpace::init ( const Dataset & d,
const unordered_map< string, float > & user_ops = {},
bool weights_init = true )

Called by the constructor to initialize the search space.

Parameters
dA dataset containing terminal definitions
user_opsOptional user-provided dictionary of operators with their probability of being chosen
weights_initwhether the terminal prob_change should be estimated from correlations with the target value

Definition at line 166 of file search_space.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ make_classifier()

ClassifierProgram Brush::SearchSpace::make_classifier ( int max_d = 0,
int max_size = 0,
const Parameters & params = Parameters() )

Makes a random classifier program. Convenience wrapper for make_program.

Parameters
max_dmax depth of the program
max_sizemax size of the program
Returns
a classifier program

Definition at line 407 of file search_space.cpp.

Here is the call graph for this function:

◆ make_multiclass_classifier()

MulticlassClassifierProgram Brush::SearchSpace::make_multiclass_classifier ( int max_d = 0,
int max_size = 0,
const Parameters & params = Parameters() )

Makes a random multiclass classifier program. Convenience wrapper for make_program.

Parameters
max_dmax depth of the program
max_sizemax size of the program
Returns
a multiclass classifier program

Definition at line 412 of file search_space.cpp.

Here is the call graph for this function:

◆ make_program() [1/2]

template<typename P >
P Brush::SearchSpace::make_program ( const Parameters & params,
int max_d,
int max_size )

Definition at line 681 of file search_space.h.

Here is the call graph for this function:

◆ make_program() [2/2]

template<typename PT >
PT Brush::SearchSpace::make_program ( const Parameters & params,
int max_d = 0,
int max_size = 0 )

Makes a random program.

We use an implementation of PTC2 for strongly typed GP from

Sean Luke. "Two fast tree-creation algorithms for genetic programming" (https://doi.org/10.1109/4235.873237)

Template Parameters
PTprogram type
Parameters
max_dmax depth of the program
max_sizemax size of the programd
Returns
a program of type PTsize
Here is the caller graph for this function:

◆ make_regressor()

RegressorProgram Brush::SearchSpace::make_regressor ( int max_d = 0,
int max_size = 0,
const Parameters & params = Parameters() )

Makes a random regressor program. Convenience wrapper for make_program.

Parameters
max_dmax depth of the program
max_sizemax size of the program
Returns
a regressor program

Definition at line 402 of file search_space.cpp.

Here is the call graph for this function:

◆ make_representer()

RepresenterProgram Brush::SearchSpace::make_representer ( int max_d = 0,
int max_size = 0,
const Parameters & params = Parameters() )

Makes a random representer program. Convenience wrapper for make_program.

Parameters
max_dmax depth of the program
max_sizemax size of the program
Returns
a representer program

Definition at line 418 of file search_space.cpp.

Here is the call graph for this function:

◆ MakeNodes()

template<NodeType NT>
void Brush::SearchSpace::MakeNodes ( const unordered_map< string, float > & user_ops,
const vector< DataType > & unique_data_types )
inlineprivate

Definition at line 636 of file search_space.h.

Here is the call graph for this function:

◆ print()

void Brush::SearchSpace::print ( ) const

prints the search space map.

Definition at line 162 of file search_space.cpp.

◆ PTC2()

tree< Node > & Brush::SearchSpace::PTC2 ( tree< Node > & Tree,
tree< Node >::iterator root,
int max_d,
int max_size ) const
private

Definition at line 268 of file search_space.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ sample_op() [1/2]

std::optional< Node > Brush::SearchSpace::sample_op ( DataType ret) const
inline

get an operator matching return type ret.

Parameters
retreturn type
Returns
std::optional that may contain a randomly chosen operator matching return type ret

Definition at line 420 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ sample_op() [2/2]

std::optional< Node > Brush::SearchSpace::sample_op ( NodeType type,
DataType R )
inline

Get a specific node type that matches a return value.

Parameters
typethe node type
Rthe return type
Returns
std::optional that may contain a Node of type type with return type R.

Definition at line 454 of file search_space.h.

Here is the call graph for this function:

◆ sample_op_with_arg()

std::optional< Node > Brush::SearchSpace::sample_op_with_arg ( DataType ret,
DataType arg,
bool terminal_compatible = true,
int max_args = 0 ) const
inline

get operator with at least one argument matching arg

Parameters
retreturn type
argargument type to match
terminal_compatibleif true, the other args the returned operator takes must exist in the terminal types.
max_argsif zero, there is no limit on number of arguments of the operator. If not, the operator can have at most max_args arguments.
Returns
std::optional that may contain a matching operator respecting all restrictions.

Definition at line 492 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ sample_subtree()

std::optional< tree< Node > > Brush::SearchSpace::sample_subtree ( Node root,
int max_d,
int max_size ) const

create a subtree with maximum size and depth restrictions and root of type root_type

Parameters
root_typereturn type
max_dthe maximum depth
max_sizethe maximum size of the tree (will be sampled between [1, max_size])
Returns
std::optional that may contain a tree

Definition at line 235 of file search_space.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ sample_terminal() [1/2]

std::optional< Node > Brush::SearchSpace::sample_terminal ( bool force_return = false) const
inline

Get a random terminal.

Returns
std::optional that may contain a terminal Node.

Definition at line 319 of file search_space.h.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ sample_terminal() [2/2]

std::optional< Node > Brush::SearchSpace::sample_terminal ( DataType R,
bool force_return = false ) const
inline

Get a random terminal with return type R

Returns
std::optional that may contain a terminal Node of type R.

Definition at line 380 of file search_space.h.

Here is the call graph for this function:

Member Data Documentation

◆ node_map

Map<Node> Brush::SearchSpace::node_map

Maps return types to argument types to node types.

schema:

{ return_type : { arguments_type : {node_type : node } }}

Definition at line 100 of file search_space.h.

◆ node_map_weights

Map<float> Brush::SearchSpace::node_map_weights

A map of weights corresponding to elements in node_map, used to weight probabilities of each node being sampled from the map.

Definition at line 103 of file search_space.h.

◆ terminal_map

unordered_map<DataType, vector<Node> > Brush::SearchSpace::terminal_map

Maps return types to terminals.

schema:

 { return_type : vector of Nodes } 

Definition at line 113 of file search_space.h.

◆ terminal_types

vector<DataType> Brush::SearchSpace::terminal_types

A vector storing the available return types of terminals.

Definition at line 119 of file search_space.h.

◆ terminal_weights

unordered_map<DataType, vector<float> > Brush::SearchSpace::terminal_weights

A map of weights corresponding to elements in terminal_map, used to weight probabilities of each terminal being sampled from the map.

Definition at line 116 of file search_space.h.


The documentation for this struct was generated from the following files: