SRBench: A Living Benchmark for Symbolic Regression
Methods for symbolic regression (SR) have come a long way since the days of Koza-style genetic programming (GP).
Our goal with this project is to maintain a living benchmark of modern symbolic regression methods, evaluated in the context of state-of-the-art ML methods.
As we see it, the current challenges are:
- Aggregated results obscure the current state-of-the-art.
- A one-size-fits-all approach is not ideal for benchmarking SR methods.
- The benchmark needs continuous updates with new datasets and algorithms.
We introduce new visualizations that highlight where each algorithm excels or struggles. We nearly double the number of SR methods evaluated under a unified experimental setup. We propose a deprecation scheme to guide the selection of methods for future editions. Finally, we issue a call for action for the community to think about what it takes to provide a good SR benchmark.
As before, we are making these comparisons open source, reproducible, and public, and we hope to share them widely with the entire ML research community.
When SRBench started, its challenges were:
- Lack of cross-pollination between the GP community and the ML community (different conferences, journals, societies, etc.)
- Lack of strong benchmarks in SR literature (small problems, toy datasets, weak comparator methods)
- Lack of a unified framework for SR or GP
Benchmarked Methods
This benchmark currently comprises 25 symbolic regression methods: the original 14 methods from the previous SRBench edition, all methods staged for benchmarking, and recently published methods that reported SRBench results in their papers. We use 24 datasets from PMLB, including real-world and synthetic datasets from processes with and without ground-truth models. We perform 30 independent runs of each experiment for robust comparisons of the results.
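As an illustration (the dataset name below is an example, not necessarily one of the 24 used here), PMLB datasets can be loaded with the pmlb Python package:

```python
# Illustrative only: load a PMLB regression dataset as (features, target) arrays.
# '1027_ESL' is an example name; see the PMLB documentation for the full list.
from pmlb import fetch_data

X, y = fetch_data('1027_ESL', return_X_y=True)
print(X.shape, y.shape)
```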
There are 25 methods currently benchmarked:
| Method | | |
|---|---|---|
| AFP - paper | AFP_fe | AFP_ehc - paper |
| Bingo - paper | Brush - code | BSR - paper |
| E2E - paper | EPLEX - paper | EQL - paper |
| FEAT - paper | FFX - paper | Genetic Engine - paper |
| GPGomea - paper | GPlearn - paper | GPZGD - paper |
| ITEA - paper | NeSymRes - paper | Operon - paper |
| Ps-Tree - paper | PySR - paper | Qlattice - paper |
| Rils-rols - paper | TIR - paper | TPSR - paper |
| uDSR - paper |
Benchmark results
We make all of our experiments’ results available as feather files inside /results/.
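As a quick sketch (the file name below is a placeholder, not an actual file in the repository), the feather files can be read with pandas:

```python
# Sketch: read one of the result files with pandas (requires pyarrow).
# 'results/example_results.feather' is a placeholder path.
import pandas as pd

df = pd.read_feather('results/example_results.feather')
print(df.columns.tolist())
```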
Contributing and running it locally
Check out the CONTRIBUTING.md file to see how to set up your algorithm. That guide details all the requirements for submitting a pull request with an interface compatible with SRBench.
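As a rough sketch only (the actual file layout, required variables, and metadata are specified in CONTRIBUTING.md), a submitted method is expected to wrap its implementation in a scikit-learn-compatible regressor, along these lines:

```python
# Illustrative sketch of a scikit-learn-compatible regressor wrapper.
# The exact interface SRBench expects is defined in CONTRIBUTING.md.
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin


class MySymbolicRegressor(BaseEstimator, RegressorMixin):
    """Hypothetical wrapper around a symbolic regression backend."""

    def __init__(self, population_size=100, generations=50):
        self.population_size = population_size
        self.generations = generations

    def fit(self, X, y):
        # Placeholder "search": a real method would evolve/fit an expression here.
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)
```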
The analyze script is the main entry point: it parses the flags and creates the specific Python commands that run each experiment independently. Examples of how to invoke the experiments are available in docs/user_guide.md.
Reproducing the experiments
A detailed guide on how to reproduce the experiments yourself is provided in docs/user_guide.md.
Once you have all the results, you need to collate them using the collate scripts in ./postprocessing/scripts.
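Conceptually, collation concatenates the per-experiment feather files into a single table; the scripts in ./postprocessing/scripts are the supported way to do this, but the idea is roughly the following sketch (paths are illustrative):

```python
# Conceptual sketch of collation: concatenate per-experiment feather files.
# Use the provided scripts in ./postprocessing/scripts for the real workflow.
from glob import glob

import pandas as pd

frames = [pd.read_feather(path) for path in sorted(glob('results/*.feather'))]
collated = pd.concat(frames, ignore_index=True)
collated.to_feather('results/collated.feather')
```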
References
A paper containing the results from this repository is under review at the GECCO 2025 Symbolic Regression Workshop.
The call for action was reported in the GECCO 2025 paper:
Imai Aldeia, G. S., Zhang, H., Bomarito, G., Cranmer, M., Fonseca, A., Burlacu, B., La Cava, W., & de França, F. (2025). Call for Action: towards the next generation of symbolic regression benchmark. Proceedings of the Genetic and Evolutionary Computation Conference Companion. DOI, preprint
SRBench was reported in the NeurIPS 2021 paper:
La Cava, W., Orzechowski, P., Burlacu, B., de França, F. O., Virgolin, M., Jin, Y., Kommenda, M., & Moore, J. H. (2021). Contemporary Symbolic Regression Methods and their Relative Performance. NeurIPS Track on Datasets and Benchmarks. arXiv, neurips.cc
v1.0 was reported in the GECCO 2018 paper:
Orzechowski, P., La Cava, W., & Moore, J. H. (2018). Where are we now? A large benchmark study of recent symbolic regression methods. GECCO 2018. DOI, preprint