# The archive

When you fit a Brush estimator, two new attributes are created: `best_estimator_` and `archive_`.

If you set `use_arch` to `True` when instantiating the estimator, it stores the Pareto front as a list in `archive_`. This Pareto front is always built from the individuals in the final population that are not dominated in the **error** and **complexity** objectives.

If you need more flexibility, set `use_arch` to `False`: the archive will then contain the entire final population, and you can iterate through this list to select individuals by other criteria. Note that Brush also supports different optimization objectives through the `objectives` argument.

Each element of the archive is a serialized individual (a JSON object).
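
Since the archive is a plain list, selecting by a custom criterion amounts to filtering and sorting it. The sketch below is only an illustration: `select_from_archive` and `toy_archive` are hypothetical names that are not part of pybrush, and the code assumes each entry carries a `fitness` dictionary with `loss`, `complexity` and `size` keys, as in the output printed further down this page.

```python
def select_from_archive(archive, max_value, measure="complexity"):
    """Return the lowest-loss individual whose fitness[measure] is at most
    max_value, or None if no individual qualifies."""
    candidates = [ind for ind in archive if ind["fitness"][measure] <= max_value]
    if not candidates:
        return None
    return min(candidates, key=lambda ind: ind["fitness"]["loss"])


# Hypothetical stand-in for est.archive_, keeping only the keys used above.
toy_archive = [
    {"id": 1, "fitness": {"loss": 0.51, "complexity": 80, "size": 12}},
    {"id": 2, "fitness": {"loss": 0.62, "complexity": 20, "size": 5}},
    {"id": 3, "fitness": {"loss": 0.70, "complexity": 4, "size": 2}},
]

chosen = select_from_archive(toy_archive, max_value=30)
print(chosen["id"])  # prints 2
```

With a fitted estimator, the same function could be called on `est.archive_` instead of `toy_archive`, provided the entries have the assumed keys.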

In [1]:
import pandas as pd
from pybrush import BrushClassifier

# load data
df = pd.read_csv('../examples/datasets/d_analcatdata_aids.csv')
X = df.drop(columns='target')
y = df['target']

In [2]:
est = BrushClassifier(
    functions=['SplitBest','Add','Mul','Sin','Cos','Exp','Logabs'],
    use_arch=True,
    max_gens=100,
    verbosity=1
)

est.fit(X,y)
y_pred = est.predict(X)
print('score:', est.score(X,y))

score: 0.7


You can access individuals in the archive by index:

In [3]:
print(len(est.archive_[0]))

est.archive_[0]

5


{'fitness': {'complexity': 80,
  'crowding_dist': 0.0,
  'dcounter': 0,
  'depth': 3,
  'dominated': [],
  'loss': 0.5091069936752319,
  'loss_v': 0.5091069936752319,
  'rank': 1,
  'size': 12,
  'values': [0.5091069936752319, 12.0],
  'weights': [-1.0, -1.0],
  'wvalues': [-0.5091069936752319, -12.0]},
 'id': 10060,
 'objectives': ['error', 'size'],
 'parent_id': [9628],
 'program': {'Tree': [{'W': 15890.5,
    'arg_types': ['ArrayF', 'ArrayF'],
    'center_op': True,
    'feature': 'AIDS',
    'fixed': False,
    'is_weighted': False,
    'name': 'SplitBest',
    'node_type': 'SplitBest',
    'prob_change': 1.0,
    'ret_type': 'ArrayF',
    'sig_dual_hash': 9996486434638833164,
    'sig_hash': 10001460114883919497},
   {'W': 1.0,
    'arg_types': ['ArrayF'],
    'center_op': True,
    'feature': '',
    'fixed': False,
    'is_weighted': False,
    'name': 'Logabs',
    'node_type': 'Logabs',
    'prob_change': 1.0,
    'ret_type': 'ArrayF',
    'sig_dual_hash': 10617925524997611780

You can also get predictions from every individual in the archive at once by calling `predict_archive` (or `predict_proba_archive`, if your `est` is an instance of `BrushClassifier`):

In [4]:
est.predict_archive(X)


[{'id': 10060,
  'y_pred': array([False,  True,  True,  True,  True, False,  True,  True,  True,
         False,  True,  True,  True,  True, False,  True,  True,  True,
          True,  True,  True,  True,  True,  True,  True, False, False,
         False, False, False, False, False, False, False, False, False,
         False, False,  True, False,  True,  True,  True,  True,  True,
          True,  True,  True,  True,  True])},
 {'id': 9789,
  'y_pred': array([False,  True,  True,  True,  True, False,  True,  True,  True,
         False,  True,  True,  True,  True, False,  True,  True,  True,
          True,  True,  True,  True,  True,  True,  True, False, False,
         False, False, False, False, False, False, False, False, False,
         False, False,  True, False,  True,  True,  True,  True,  True,
          True,  True,  True,  True,  True])},
 {'id': 10049,
  'y_pred': array([False,  True,  True,  True,  True, False,  True,  True,  True,
         False, False,  True,  True, Fal
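
The list returned by `predict_archive` pairs each individual's `id` with its predictions under `y_pred`, which makes it straightforward to score every individual against the true labels. The following is a minimal sketch under that assumption; `accuracy_per_individual` and the toy data are hypothetical and not part of pybrush, and the comparison assumes binary labels.

```python
def accuracy_per_individual(archive_predictions, y_true):
    """Map each individual's id to the fraction of labels it predicts correctly.

    Labels and predictions are compared as booleans, so this assumes a
    binary classification problem.
    """
    y_true = list(y_true)
    scores = {}
    for entry in archive_predictions:
        hits = sum(bool(p) == bool(t) for p, t in zip(entry["y_pred"], y_true))
        scores[entry["id"]] = hits / len(y_true)
    return scores


# Hypothetical stand-in for the output of est.predict_archive(X).
toy_predictions = [
    {"id": 10060, "y_pred": [False, True, True, True]},
    {"id": 9789, "y_pred": [False, True, False, True]},
]
toy_labels = [0, 1, 1, 0]

scores = accuracy_per_individual(toy_predictions, toy_labels)
print(scores)  # prints {10060: 0.75, 9789: 0.5}
```

Because the function only iterates over `y_pred`, it should work equally well when the predictions are NumPy arrays, as in the output above.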

In [6]:
est.predict_proba_archive(X)

[{'id': 10060,
  'y_pred': array([0.22222222, 0.9999999 , 0.9999999 , 0.9999999 , 0.9999999 ,
         0.22222222, 0.9999999 , 0.9999999 , 0.9999999 , 0.22222222,
         0.5217871 , 0.9999999 , 0.9999999 , 0.5217871 , 0.22222222,
         0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 ,
         0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 ,
         0.22222222, 0.22222222, 0.22222222, 0.22222222, 0.22222222,
         0.22222222, 0.22222222, 0.22222222, 0.22222222, 0.22222222,
         0.22222222, 0.22222222, 0.22222222, 0.5217871 , 0.22222222,
         0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 ,
         0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 , 0.5217871 ],
        dtype=float32)},
 {'id': 9789,
  'y_pred': array([0.22222222, 0.99994993, 0.99994993, 0.99994993, 0.99994993,
         0.22222222, 0.99994993, 0.99994993, 0.99994993, 0.22222222,
         0.5217871 , 0.99994993, 0.99994993, 0.5217871 , 0.22222222,
         0.5217871 , 0.52178
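
The probabilities returned by `predict_proba_archive` can be turned into class labels with a decision threshold of your choosing. The sketch below assumes the output structure shown above (a list of dictionaries with `id` and `y_pred` keys); `labels_from_probabilities` is a hypothetical helper, and the default threshold of 0.5 is an assumption rather than a documented Brush setting.

```python
def labels_from_probabilities(archive_probabilities, threshold=0.5):
    """Map each individual's id to boolean labels: True where p > threshold."""
    return {
        entry["id"]: [float(p) > threshold for p in entry["y_pred"]]
        for entry in archive_probabilities
    }


# Hypothetical stand-in for est.predict_proba_archive(X), using a few of the
# probability values printed above.
toy_probabilities = [
    {"id": 10060, "y_pred": [0.22222222, 0.9999999, 0.5217871]},
    {"id": 9789, "y_pred": [0.22222222, 0.99994993, 0.5217871]},
]

default_labels = labels_from_probabilities(toy_probabilities)
strict_labels = labels_from_probabilities(toy_probabilities, threshold=0.6)
print(default_labels[10060])  # prints [False, True, True]
print(strict_labels[10060])   # prints [False, True, False]
```

Raising the threshold trades recall for precision on the positive class, which can be useful when comparing individuals along the Pareto front.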