When models are deployed in healthcare settings, it is important that they are fair, i.e., that they do not harm or unjustly benefit specific subgroups of a population. Otherwise, models deployed to assist in patient triage, for example, could exacerbate existing unfairness in the health system. There are many ways in which predictive health models can recapitulate or exacerbate systemic biases in treatment and outcomes. The field of fair ML provides a framework for requiring that a notion of fairness be maintained in models learned from data containing protected attributes (e.g., race and sex). What fairness means (perhaps equivalent error rates across groups, or similar treatment of similar individuals) varies considerably by application, and inherent conflicts can arise when asking for multiple types of fairness at once. Furthermore, there is a fundamental trade-off between the overall error rate of a model and its fairness (cf. Fig. 1B here), and it is an open question how best to characterize and present these trade-offs to stakeholders in the health system. For example, we might want to prioritize fairness heavily in an algorithm used in patient triage, but weigh error rates more heavily when predicting individual treatment plans and outcomes. Fair models are also hard to learn and audit when considering intersections of protected attributes (e.g., Black men over 65), because the number of subgroups grows combinatorially with the number of attributes. Thus, two open questions are how best to define metrics for assessing intersectional fairness, and how to approximately satisfy them.
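To make the "equivalent error rates across groups" notion and the intersectional audit problem concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method from the work described here: the function names (`subgroup_error_rates`, `audit`, `max_gap`), the record layout, and the toy records are all hypothetical, and the "largest gap between any two subgroups" summary is just one of many possible disparity measures.

```python
from collections import defaultdict
from itertools import combinations


def subgroup_error_rates(records, attrs):
    """Return {subgroup: (fpr, fnr, n)} for each observed combination of `attrs`."""
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for r in records:
        key = tuple((a, r[a]) for a in attrs)
        c = counts[key]
        if r["y_true"] == 1:
            c["pos"] += 1
            c["fn"] += int(r["y_pred"] == 0)
        else:
            c["neg"] += 1
            c["fp"] += int(r["y_pred"] == 1)
    out = {}
    for key, c in counts.items():
        # A rate is undefined (None) when the subgroup has no such cases.
        fpr = c["fp"] / c["neg"] if c["neg"] else None
        fnr = c["fn"] / c["pos"] if c["pos"] else None
        out[key] = (fpr, fnr, c["pos"] + c["neg"])
    return out


def audit(records, protected):
    """Audit every non-empty subset of the protected attributes.

    The number of subsets is 2**len(protected) - 1, and each subset can
    contain many subgroups: this is the combinatorial challenge.
    """
    report = {}
    for k in range(1, len(protected) + 1):
        for attrs in combinations(protected, k):
            report.update(subgroup_error_rates(records, attrs))
    return report


def max_gap(report, idx):
    """Largest difference in one rate (0 = FPR, 1 = FNR) between any two subgroups."""
    vals = [v[idx] for v in report.values() if v[idx] is not None]
    return max(vals) - min(vals)


# Made-up toy records, for illustration only (not real patient data).
records = [
    {"race": "A", "sex": "F", "y_true": 1, "y_pred": 1},
    {"race": "A", "sex": "F", "y_true": 0, "y_pred": 0},
    {"race": "A", "sex": "M", "y_true": 1, "y_pred": 0},
    {"race": "A", "sex": "M", "y_true": 0, "y_pred": 0},
    {"race": "B", "sex": "F", "y_true": 1, "y_pred": 1},
    {"race": "B", "sex": "F", "y_true": 0, "y_pred": 1},
    {"race": "B", "sex": "M", "y_true": 1, "y_pred": 1},
    {"race": "B", "sex": "M", "y_true": 0, "y_pred": 0},
]

report = audit(records, ["race", "sex"])
marginal_only = {k: v for k, v in report.items() if len(k) == 1}
```

In this toy data the false positive rate differs by 0.5 between groups when race or sex is considered alone (`max_gap(marginal_only, 0)`), but by 1.0 once the race-by-sex intersections are included (`max_gap(report, 0)`). A model can therefore look moderately fair on each attribute separately while being much less fair for a specific intersection, and the small subgroup sizes make such estimates noisy in practice.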
Providing a set of models (e.g., above) that vary in fairness and accuracy is one way to help a decision maker understand how an algorithm will affect the people it interacts with once it is deployed (1, 2). After deployment, we are interested in understanding the downstream impacts on healthcare that arise (3).
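One simple way to assemble such a set of models is to keep only those that are not dominated on both criteria, i.e., the Pareto front. The sketch below assumes each candidate has already been scored with an error rate and some unfairness measure (lower is better for both). The function name `pareto_front`, the model names, and the scores are hypothetical and chosen only for illustration.

```python
def pareto_front(models):
    """Return the models not dominated on (error, unfairness), sorted by error.

    `models` is a list of (name, error, unfairness) tuples. A model is
    dominated if another model is at least as good on both criteria and
    strictly better on at least one.
    """
    front = []
    for name, err, unf in models:
        dominated = any(
            (e2 <= err and u2 <= unf) and (e2 < err or u2 < unf)
            for _, e2, u2 in models
        )
        if not dominated:
            front.append((name, err, unf))
    return sorted(front, key=lambda m: m[1])


# Made-up scores, for illustration only.
candidates = [
    ("m1", 0.10, 0.30),
    ("m2", 0.12, 0.20),
    ("m3", 0.15, 0.25),  # dominated by m2 (worse on both criteria)
    ("m4", 0.20, 0.05),
    ("m5", 0.11, 0.35),  # dominated by m1 (worse on both criteria)
]

front = pareto_front(candidates)
for name, err, unf in front:
    print(f"{name}: error={err:.2f}, unfairness={unf:.2f}")
```

Here `m1`, `m2`, and `m4` remain. Each represents a different balance between error and fairness, and choosing among them is a judgment for the decision maker rather than something the filter can settle.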