Abstract
One way machine learning (ML) models differ from more traditional models is that they are data-driven rather than what Knüsel and Baumberger (2020) call process-driven. Moreover, ML models suffer from a higher degree of model opacity than more traditional models. Despite these differences, modelers and philosophers (e.g. Sullivan 2020; Meskhidze 2021) have claimed that ML models can still provide understanding of phenomena. However, before the epistemic consequences of opacity become salient, there is an underexplored prior question of representation: if ML models do not represent their targets in any meaningful sense, how can they provide understanding?

The problem is that ML models do in fact seem not to represent their targets in any meaningful sense. The similarity view of representation, for example, appears to exclude the possibility that ML models can represent phenomena. ML models find feature relationships by methods that are highly divorced from their target systems, relying on decision rules and loose correlations rather than causal relationships. Moreover, the data the models are trained on can be manipulated by modelers in ways that reduce similarity; the well-known melanoma-detection ML model (Esteva et al. 2017), for example, augments the RGB spectrum of dermatologist images (Tamir and Shech 2022). Thus, if the similarity view is right, then even if model opacity qua opacity does not get in the way of understanding, ML models may still fail to enable understanding of phenomena because they fail to represent phenomena.

Contra the similarity view, I argue that ML models are in fact able to represent phenomena under specific conditions. Drawing on the literature on how highly idealized models represent their targets and on the interpretative view of representation (Nguyen 2020), I make a strong case that ML models can accurately represent their targets. Even though ML models seem to be the opposite of highly idealized simple models, there are a number of representational similarities between them. Thus, if we accept that highly idealized models can represent phenomena, then so can ML models.

References

Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., and Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115–118. DOI: 10.1038/nature21056.
Knüsel, B., and Baumberger, C. (2020). Understanding climate phenomena with data-driven models. Studies in History and Philosophy of Science Part A, 84, 46–56.
Meskhidze, H. (2021). Can Machine Learning Provide Understanding? How Cosmologists Use Machine Learning to Understand Observations of the Universe. Erkenntnis, 1–15.
Nguyen, J. (2020). It's not a game: Accurate representation with toy models. The British Journal for the Philosophy of Science, 71(3), 1013–1041.
Sullivan, E. (2020). Understanding from Machine Learning Models. The British Journal for the Philosophy of Science. DOI: 10.1093/bjps/axz035.