Intelligible Predictive Health Models
We study both blackbox and glassbox ML methods to improve the intelligibility and/or explainability of models that are trained for clinical prediction tasks using electronic health record (EHR) data. EHR data offer a promising opportunity for advancing the understanding of how clinical decisions and patient conditions interact over time to influence patient health. However, EHR data are difficult to use for predictive modeling due to the various data types they contain (continuous, categorical, text, etc.), their longitudinal nature, the high amount of nonrandom missingness for certain measurements, and other concerns. Furthermore, patient outcomes often have heterogeneous causes and require information to be synthesized from several clinical lab measures and patient visits. Researchers often resort to using complex, blackbox predictive models to overcome these challenges, thereby introducing additional concerns of accountability, transparency and intelligibility.
Can’t we just explain blackbox models?
Although blackbox models are typically accurate, it is often difficult to explain how they arrive at their predictions, and very similar models may disagree about which factors drive their predictive ability [1].
Feature importance biclustering across diseases and predictors [3].
Symbolic Regression for Interpretable Machine Learning
An alternative and promising approach is to use glassbox ML methods such as symbolic regression, which can capture complex relationships in data and yet produce an intelligible final model. Symbolic regression methods jointly optimize the structure of a model and its parameters, usually with the goal of finding a simple and accurate symbolic model.
However, intelligibility is complicated to define, and is both context- and user-dependent. In general, the intelligibility of a model depends heavily on its representation, i.e., how it defines its feature space.
An example representation from the Feat docs.
What makes a representation good? At a minimum, a good representation produces a model that generalizes better than a model trained only on the raw data attributes. In addition, a good representation teases apart the factors of variation in the data into independent components. Finally, an ideal representation is succinct so as to promote intelligibility. This means a representation should have only as many features as there are independent factors in the process, and each of those features should be digestible by the user. Many of our research projects center around these three motivations when designing novel algorithms for interpretable machine learning.
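To make the idea of jointly optimizing structure and parameters concrete, here is a minimal sketch in plain Python. It is not the method used in our work: instead of genetic programming over expression trees, it exhaustively enumerates a tiny, hand-picked set of candidate structures (the `CANDIDATES` table, its complexity values, and the `penalty` weight are all assumptions made for illustration), fits the parameters of each by closed-form least squares, and keeps the candidate with the best error-plus-complexity score.

```python
import math
import random

# Hypothetical candidate structures: name -> (feature function, complexity).
# The complexity values are hand-assigned node counts, for illustration only.
CANDIDATES = {
    "x": (lambda x: x, 1),
    "x^2": (lambda x: x * x, 2),
    "x^3": (lambda x: x ** 3, 2),
    "sin(x)": (lambda x: math.sin(x), 2),
    "exp(x)": (lambda x: math.exp(x), 2),
}


def fit_linear(phi, y):
    """Closed-form least squares for y ~ a * phi + b; returns (a, b, mse)."""
    n = len(y)
    mean_p = sum(phi) / n
    mean_y = sum(y) / n
    var_p = sum((p - mean_p) ** 2 for p in phi)
    cov = sum((p - mean_p) * (t - mean_y) for p, t in zip(phi, y))
    a = cov / var_p if var_p > 0 else 0.0
    b = mean_y - a * mean_p
    mse = sum((a * p + b - t) ** 2 for p, t in zip(phi, y)) / n
    return a, b, mse


def search(xs, ys, penalty=1e-3):
    """Pick the structure (and its fitted parameters) with the lowest score."""
    best = None
    for name, (feature, complexity) in CANDIDATES.items():
        phi = [feature(x) for x in xs]          # structure: choose the feature
        a, b, mse = fit_linear(phi, ys)         # parameters: fit a and b
        score = mse + penalty * complexity      # prefer simple, accurate models
        if best is None or score < best["score"]:
            best = {"name": name, "a": a, "b": b, "mse": mse, "score": score}
    return best


# Synthetic data generated from a known expression plus noise.
random.seed(0)
xs = [i / 10 for i in range(-30, 31)]
ys = [2.0 * math.sin(x) + 0.5 + random.gauss(0, 0.05) for x in xs]

best = search(xs, ys)
print(f"y = {best['a']:.2f} * {best['name']} + {best['b']:.2f}  (mse={best['mse']:.4f})")
```

On this synthetic data the search recovers the `sin(x)` structure with coefficients close to the generating values. Real symbolic regression methods search a vastly larger space of expressions, which is why they rely on heuristic search rather than enumeration.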
Can a simple symbolic model be accurate?
Researchers often see the complexity of a model as a tradeoff with its error: more complex models should give better predictions than simple ones. However, very rarely is the nature of the tradeoff actually characterized in a robust way.
In fact, we have found that for many tasks, symbolic regression approaches can perform as well as or better than state-of-the-art blackbox approaches, and still produce simpler expressions.
Symbolic regression algorithms (marked with an asterisk) benchmarked against blackbox ML on hundreds of regression problems. See more at https://github.com/EpistasisLab/srbench.
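One common way to characterize the tradeoff between complexity and error is to compute the Pareto front: the set of models for which no other model is at least as simple and at least as accurate, and strictly better on one of the two. The sketch below shows this in plain Python. The model names and numbers are made up for illustration and are not benchmark results, and the `pareto_front` helper is hypothetical rather than part of any library.

```python
def pareto_front(models):
    """Return the models not dominated in (complexity, error); lower is better."""
    front = []
    for m in models:
        dominated = any(
            o["complexity"] <= m["complexity"]
            and o["error"] <= m["error"]
            and (o["complexity"] < m["complexity"] or o["error"] < m["error"])
            for o in models
        )
        if not dominated:
            front.append(m)
    return sorted(front, key=lambda m: m["complexity"])


# Made-up values for illustration only; not taken from any experiment.
models = [
    {"name": "A", "complexity": 3, "error": 0.40},
    {"name": "B", "complexity": 7, "error": 0.22},
    {"name": "C", "complexity": 12, "error": 0.25},
    {"name": "D", "complexity": 20, "error": 0.21},
    {"name": "E", "complexity": 500, "error": 0.21},
]

front = pareto_front(models)
print([m["name"] for m in front])  # prints ['A', 'B', 'D']
```

In this toy example, model E matches the error of D at 25 times the complexity, so it falls off the front. Extra complexity does not always buy lower error.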
Do they work in clinical care?
Our preliminary work on symbolic regression approaches to patient phenotyping has shown success in producing accurate and interpretable models of treatment-resistant hypertension. More work is needed to scale and study these algorithms in routine clinical care.
A symbolic regression model of treatment-resistant hypertension [2].
Relevant work:

La Cava, W., Bauer, C. R., Moore, J. H., & Pendergrass, S. A. (2019). Interpretation of machine learning predictions for patient outcomes in electronic health records. AMIA 2019 Annual Symposium. arXiv

La Cava, W., Lee, P. C., Ajmal, I., Ding, X., Cohen, J. B., Solanki, P., Moore, J. H., & Herman, D. S. (2021). Application of concise machine learning to construct accurate and interpretable EHR computable phenotypes. medRxiv.

La Cava, W. & Moore, J.H. (2020). Learning feature spaces for regression with genetic programming. Genetic Programming and Evolvable Machines (GPEM). link, pdf

La Cava, W., & Moore, J. H. (2019). Semantic variation operators for multidimensional genetic programming. GECCO 2019. https://doi.org/10.1145/3321707.3321776. arXiv

La Cava, W., & Moore, J. H. (2019). Learning concise representations for regression by evolving networks of trees. ICLR 2019. arXiv

La Cava, W., & Moore, J. (2017). A General Feature Engineering Wrapper for Machine Learning Using epsilon-Lexicase Survival. European Conference on Genetic Programming. link, preprint

La Cava, W., & Moore, J. H. (2017). Ensemble representation learning: an analysis of fitness and survival for wrapper-based genetic programming methods. GECCO ’17 (pp. 961–968). Berlin, Germany: ACM. link, arXiv

La Cava, W., Silva, S., Vanneschi, L., Spector, L., & Moore, J. (2017). Genetic Programming Representations for Multidimensional Feature Learning in Biomedical Classification. Applications of Evolutionary Computation (pp. 158–173). Springer, Cham. link, preprint

La Cava, W., Silva, S., Danai, K., Spector, L., Vanneschi, L., & Moore, J. H. (2018). Multidimensional genetic programming for multiclass classification. Swarm and Evolutionary Computation. link, preprint

La Cava, W., Orzechowski, P., Burlacu, B., França, F. O. de, Virgolin, M., Jin, Y., Kommenda, M., & Moore, J. H. (2021). Contemporary Symbolic Regression Methods and their Relative Performance. Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (Accepted). arXiv, repo