Brush C++ API
A flexible interpretable machine learning framework
Brush::Data::Dataset holds variable type data.
#include <data.h>
Public Member Functions
Dataset | operator() (const vector< size_t > &idx) const
Returns a slice of the data using the indices idx.
void | init ()
Call init at the end of constructors to define metafeatures of the data.
map< string, State > | make_features (const ArrayXXf &X, const map< string, State > &Z={}, const vector< string > &vn={})
Turns input data into a feature map.
map< string, State > | copy_and_make_features (const ArrayXXf &X, const Dataset &ref_dataset, const vector< string > &vn={})
Turns input into a feature map, with feature types copied from a reference dataset.
Dataset (std::map< string, State > &d, const Ref< const ArrayXf > &y_=ArrayXf(), bool c=false, float validation_size=0.0, float batch_size=1.0)
Dataset (const ArrayXXf &X, const Ref< const ArrayXf > &y_=ArrayXf(), const vector< string > &vn={}, const map< string, State > &Z={}, bool c=false, float validation_size=0.0, float batch_size=1.0)
Dataset (const ArrayXXf &X, const vector< string > &vn, bool c=false, float validation_size=0.0, float batch_size=1.0)
Dataset (const ArrayXXf &X, const Dataset &ref_dataset, const vector< string > &vn, bool c=false)
void | print () const
auto | get_X () const
Dataset | get_training_data () const
Dataset | get_validation_data () const
int | get_n_samples () const
int | get_n_features () const
Dataset | get_batch () const
Selects a random subset of the data for training weights.
float | get_batch_size ()
void | set_batch_size (float new_size)
std::array< Dataset, 2 > | split (const ArrayXb &mask) const
State | operator[] (std::string name) const
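The two index-based operations listed above, operator() (slice by indices) and split (partition by a boolean mask), can be illustrated with a minimal standalone sketch. It does not use Brush or Eigen: `Column`, `slice_rows` and `split_by_mask` are hypothetical stand-ins, and the ordering of the two halves returned by `split_by_mask` (rows where the mask is true first) is an assumption of this sketch, not something this page states.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one feature column. Brush itself stores
// Eigen arrays inside State; a std::vector keeps this sketch standalone.
using Column = std::vector<float>;

// Gather the rows named in idx, in the order given.
// This mirrors the idea of Dataset::operator()(idx).
Column slice_rows(const Column& col, const std::vector<std::size_t>& idx) {
    Column out;
    out.reserve(idx.size());
    for (std::size_t i : idx) {
        assert(i < col.size() && "index out of range");
        out.push_back(col[i]);
    }
    return out;
}

// Partition rows by a boolean mask, mirroring the idea of Dataset::split(mask).
// Assumption: element 0 holds rows where the mask is true, element 1 the rest.
std::array<Column, 2> split_by_mask(const Column& col,
                                    const std::vector<bool>& mask) {
    assert(mask.size() == col.size());
    std::array<Column, 2> parts;
    for (std::size_t i = 0; i < col.size(); ++i) {
        parts[mask[i] ? 0 : 1].push_back(col[i]);
    }
    return parts;
}
```

For a column {10, 20, 30, 40}, `slice_rows(col, {3, 0})` gives {40, 10}, and `split_by_mask(col, {true, false, true, false})` gives {10, 30} and {20, 40}.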
Public Attributes
std::vector< DataType > | unique_data_types
Keeps track of the unique data types in the dataset.
std::vector< DataType > | feature_types
Types of data in the features.
std::unordered_map< DataType, vector< string > > | features_of_type
Map from data types to the features having that type.
std::map< string, State > | features
Dataset features, as key-value pairs.
ArrayXf | y
The target labels, an array of length N.
bool | classification
Whether this is a classification problem.
std::optional< std::reference_wrapper< const ArrayXXf > > | Xref
float | validation_size
Percentage of the original data used for training. If 0.0, all data is used for both training and validation.
bool | use_validation
float | batch_size
Percentage of the training data to use in each batch. If 1.0, all of the training data is used.
bool | use_batch
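The three type-related attributes above (feature_types, unique_data_types, features_of_type) are all derivable from the feature map. The sketch below shows one way such an index could be built. It is standalone and illustrative only: the `DataType` enum values, the `TypeIndex` struct and `index_by_type` are hypothetical, and the map values here are bare type tags rather than Brush's State.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical type tag for this sketch; Brush defines its own DataType.
enum class DataType { ArrayB, ArrayI, ArrayF };

// Field names mirror the attributes documented above, for readability only.
struct TypeIndex {
    std::vector<DataType> feature_types;      // one entry per feature, in map order
    std::vector<DataType> unique_data_types;  // each type once, in first-seen order
    std::unordered_map<DataType, std::vector<std::string>> features_of_type;
};

// Walk the name -> type map once and fill all three views.
TypeIndex index_by_type(const std::map<std::string, DataType>& features) {
    TypeIndex out;
    for (const auto& kv : features) {
        const std::string& name = kv.first;
        const DataType type = kv.second;
        out.feature_types.push_back(type);
        const auto& seen = out.unique_data_types;
        if (std::find(seen.begin(), seen.end(), type) == seen.end()) {
            out.unique_data_types.push_back(type);
        }
        out.features_of_type[type].push_back(name);
    }
    // Postcondition: exactly one type entry per feature.
    assert(out.feature_types.size() == features.size());
    return out;
}
```

Because std::map iterates in key order, the names stored under each type come out sorted alphabetically in this sketch.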
Private Attributes
vector< size_t > | training_data_idx
vector< size_t > | validation_data_idx
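The index vectors above, together with validation_size and batch_size, suggest an index-based train/validation split followed by random batch sampling. The following standalone sketch shows that idea under stated assumptions; it is not Brush's implementation. It assumes validation_size is the share of samples held out for validation, as the name suggests (the attribute description words this differently, so check the source before relying on it), and that a batch is drawn from the training indices. `Indices`, `partition_indices` and `sample_batch` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

using Indices = std::vector<std::size_t>;

// Shuffle 0..n-1 and cut it into {training, validation}.
// Assumption: validation_size in (0, 1) is the share held out for validation.
// At 0.0 no split is made and both lists cover every sample.
std::pair<Indices, Indices> partition_indices(std::size_t n,
                                              float validation_size,
                                              std::mt19937& rng) {
    assert(validation_size >= 0.0f && validation_size < 1.0f);
    Indices idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::shuffle(idx.begin(), idx.end(), rng);
    if (validation_size == 0.0f) {
        return {idx, idx};
    }
    const auto n_valid = static_cast<std::size_t>(n * validation_size);
    const auto cut = idx.begin() + static_cast<std::ptrdiff_t>(n - n_valid);
    return {Indices(idx.begin(), cut), Indices(cut, idx.end())};
}

// Draw a random batch holding a batch_size share of the training indices
// (at least one index when the training set is non-empty).
Indices sample_batch(Indices train, float batch_size, std::mt19937& rng) {
    assert(batch_size > 0.0f && batch_size <= 1.0f);
    std::shuffle(train.begin(), train.end(), rng);
    const auto n_batch = std::max<std::size_t>(
        1, static_cast<std::size_t>(train.size() * batch_size));
    train.resize(std::min(n_batch, train.size()));
    return train;
}
```

With 8 samples and a validation share of 0.25, the sketch yields 6 training and 2 validation indices; a batch share of 0.5 then draws 3 of the 6 training indices.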
Dataset Brush::Data::Dataset::get_batch () const
Dataset Brush::Data::Dataset::get_training_data () const
Dataset Brush::Data::Dataset::get_validation_data () const
void Brush::Data::Dataset::init ()
float Brush::Data::Dataset::batch_size |
bool Brush::Data::Dataset::classification |
std::vector<DataType> Brush::Data::Dataset::feature_types |
std::map<string, State> Brush::Data::Dataset::features |
std::unordered_map<DataType,vector<string> > Brush::Data::Dataset::features_of_type
std::vector<DataType> Brush::Data::Dataset::unique_data_types
float Brush::Data::Dataset::validation_size |
std::optional<std::reference_wrapper<const ArrayXXf> > Brush::Data::Dataset::Xref |
ArrayXf Brush::Data::Dataset::y |