Idealization and Explainable AI

Abstract
AI systems are being used for a rapidly increasing number of important decisions. Many of these systems are “black boxes”: their functioning is opaque both to the people affected by them and to those who develop them. This opacity is often due to the complexity of the model an AI system uses, and to the fact that such models are produced by machine learning techniques (Burrell 2016; Sullivan 2020). Black box AI systems are difficult to evaluate for accuracy and fairness, seem less trustworthy, and make it harder for affected individuals to seek recourse for undesirable decisions. Explainable AI (XAI) methods aim to alleviate the opacity of complex AI systems (Lakkaraju et al. 2020). These methods typically involve approximating the original black box system with a distinct “explanation model”: the original opaque model is used for the actual recommendation or decision, and the explanation model is then used to explain that output. However, there is debate about whether such methods can provide adequate explanations of the behavior of black box AI systems, and this debate is made difficult by a lack of agreement in the literature about what it means to give an adequate explanation.

I argue that the goal of XAI methods should be to produce explanations that promote understanding for stakeholders. That is, a good explanation of an AI system is one that places relevant stakeholders in a position to understand why the system made a particular decision or recommendation. Moreover, I suggest that XAI methods can achieve this goal because (when things go well) the explanation models they produce serve as idealized representations of the original black box model. An idealization is an aspect of a scientific model that deliberately misrepresents its target in order to enable better understanding of that target (Elgin 2017). Even though idealizations are false, they can promote understanding by conferring a variety of benefits on a model (Potochnik 2017). An idealized model can be simpler, can leave out unimportant information, and can highlight specific causal patterns that would otherwise be obscured by the complexity of the system being represented.

Recognizing that XAI methods produce idealized models helps illuminate how these methods function, and it can guide decisions about when and whether specific methods should be employed. Certain kinds of idealizations will be apt for explaining a particular black box model to a particular audience, and this in turn helps determine which XAI methods should be used to provide those explanations. Whether an idealization is appropriate depends on what benefits it confers on the resulting explanation model. For instance, consider feature importance methods that use linear equation models, such as LIME (Ribeiro et al. 2016). These XAI methods employ idealizations that confer simplicity and legibility on the explanation model: they eliminate information about causally unimportant features while highlighting the causal patterns that are important for determining the original model’s output. These idealizations serve to promote understanding for non-technical stakeholders affected by an AI system.
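To make the LIME-style example above more concrete, the following is a minimal, illustrative sketch of how a local linear “explanation model” can be fit to a black box classifier. This is not Ribeiro et al.’s implementation: the dataset, black box model, perturbation scale, and Ridge surrogate are all illustrative assumptions, chosen only to show how a simple linear idealization can highlight the features that matter for one particular decision while leaving out the rest of the black box’s complexity.

```python
# Sketch (assumed setup, not the official LIME code) of an idealized,
# local linear explanation model for a black box classifier.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# 1. Train an opaque "black box" model (illustrative choice).
data = load_breast_cancer()
X, y = data.data, data.target
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# 2. Pick one decision to explain.
instance = X[0]

# 3. Perturb the instance to probe the black box's behavior near it.
rng = np.random.default_rng(0)
perturbations = instance + rng.normal(scale=X.std(axis=0) * 0.5,
                                      size=(500, X.shape[1]))
predictions = black_box.predict_proba(perturbations)[:, 1]

# 4. Weight perturbed samples by proximity to the original instance,
#    so the surrogate is faithful locally rather than globally.
distances = np.linalg.norm((perturbations - instance) / X.std(axis=0), axis=1)
weights = np.exp(-(distances ** 2) / 2.0)

# 5. Fit a simple, legible linear model: the idealized explanation model.
surrogate = Ridge(alpha=1.0).fit(perturbations, predictions,
                                 sample_weight=weights)

# 6. Report the most influential features for this particular decision.
importance = sorted(zip(data.feature_names, surrogate.coef_),
                    key=lambda pair: abs(pair[1]), reverse=True)
for name, coef in importance[:5]:
    print(f"{name}: {coef:+.4f}")
```

The printed coefficients are the idealization at work: a short, linear summary of which features pushed this one prediction up or down, deliberately omitting the rest of the forest’s internal structure.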
Abstract ID: PSA202261
Assistant Professor of Philosophy, Georgetown University

Abstracts With Same Type

Abstract ID | Abstract Topic                           | Submission Type | Primary Author
PSA2022227  | Philosophy of Climate Science            | Symposium       | Prof. Michael Weisberg
PSA2022211  | Philosophy of Physics - space and time   | Symposium       | Helen Meskhidze
PSA2022165  | Philosophy of Physics - general / other  | Symposium       | Prof. Jill North
PSA2022218  | Philosophy of Social Science             | Symposium       | Dr. Mikio Akagi
PSA2022263  | Values in Science                        | Symposium       | Dr. Kevin Elliott
PSA202234   | Philosophy of Biology - general / other  | Symposium       | Mr. Charles Beasley
PSA20226    | Philosophy of Psychology                 | Symposium       | Ms. Sophia Crüwell
PSA2022216  | Measurement                              | Symposium       | Zee Perry