Artificial Tradeoffs in Artificial Intelligence

This submission has open access
Submission Summary
A central problem for machine learning (ML) models is that they are epistemically opaque "black boxes": how a model internally represents the data to reach a given decision is inaccessible even to experts. This is concerning in health-care settings, where such models are increasingly used autonomously for high-stakes decision making. These concerns have led to a growing legal and ethical demand that ML models be explainable when used in safety-critical domains. Explanations often require describing how the model represents the data, or what the machine "sees" when it uses data to make a prediction. However, it is widely accepted that ML models are subject to an inherent and general tradeoff between predictive performance and explainability. The argument for this Tradeoff Thesis is based on model complexity. A more complex model is more accurate because of its complexity: it can train on, represent, and learn from a larger body of complex data. A more complex model (like a neural network) is less explainable because it combines that data using nonlinear functions over multiple layers and iteratively updates its outputs to optimize predictive skill. In contrast, a simpler model (like a decision tree) is more explainable in virtue of the rules encoded by human scientists, but exhibits poorer predictive performance because of its rigidity. The Tradeoff Thesis reflects a long-standing philosophical position that treats prediction and explanation as two distinct, and often competing, theoretical virtues or epistemic goals. I challenge the Tradeoff Thesis using a case study of two deep learning systems that diagnose eye disease from retinal images. I then use my study of how explanation facilitates improved predictions in medical AI to support Heather Douglas's (2009) argument for the tight practical and functional relation between prediction and explanation.
In the case study, I demonstrate that improvements in the explainability of DeepDR, a deep learning system that uses representations of retinal lesions to detect diabetic retinopathy, lead to improvements in predictive skill when compared with earlier studies that used simpler and more opaque models. I argue that the improved explainability facilitates improved predictive performance and that increased complexity is compatible with explainability. Furthermore, I compare explanations of DeepDR and its predictions with those of human ophthalmologists, and I show that the explainability of DeepDR is on par with the medical explanations provided by human doctors. An important consequence of my findings is that the Tradeoff Thesis must be shown to hold within a circumscribed set of models and cannot be presumed to hold generically "for all current and most likely future approaches to using ML for medical decision-making" (Heinrichs & Eickhoff 2020, 1437). Furthermore, this case illustrates how, in practice, prediction and explanation are deeply connected. This poses a challenge for philosophical models that construe the relation between prediction and explanation as one of epistemic rivals. Complex ML algorithms may therefore still hold promise for reliable and ethical deployment in safety-critical fields like medicine.

Similar Abstracts by Type

Submission Topic | Primary Author
Philosophy of Climate Science | Prof. Michael Weisberg
Philosophy of Physics - space and time | Helen Meskhidze
Philosophy of Physics - general / other | Prof. Jill North
Philosophy of Social Science | Dr. Mikio Akagi
Values in Science | Dr. Kevin Elliott
Philosophy of Biology - general / other | Mr. Charles Beasley
Philosophy of Psychology | Ms. Sophia Crüwell
| Zee Perry