Exploratory analysis and the expected value of experimentation

Abstract
Astonishingly large datasets are now relatively easy to come by in many scientific fields. The availability of open datasets means that it is possible to acquire data on a problem without formulating any hypothesis whatsoever. The idea of exploratory data analysis (EDA) predates this situation, but many researchers find themselves appealing to EDA as an explanation of what they are doing with these new resources. Yet there has been relatively little explicit work on what EDA is or why it might be important. I canvass several positions in the literature, find them wanting, and suggest an alternative: exploratory data analysis, when done well, shows the expected value of experimentation for a particular hypothesis. There are three main positions on EDA in the literature. The first identifies EDA with a set of techniques that can be applied to data in order to suggest hypotheses. This view is associated with Tukey (1969, 1977, 1993), who emphasized the “procedure-oriented” nature of exploratory analysis and the extent to which these techniques were “things that can be tried, rather than things that ‘must’ be done” (1993, 7). Hartwig and Dearing (2011, 10) similarly speak of EDA as a “state of mind” or a “certain perspective” that one brings to the data. Yet this does not suggest any sort of success conditions for EDA, either in particular cases or for new techniques in general, and therefore offers little guidance on EDA as such. Second, EDA is sometimes treated as simply confirmatory data analysis done sloppily, with looser parameters and more freedom. Authors who suggest this view do so primarily to denigrate EDA (Wagenmakers et al., 2012). This is too pessimistic: charity demands that we prefer a model where authors who appeal to EDA are not simply covering up their sins as researchers. Third, EDA is sometimes linked to so-called exploratory experiments (Steinle, 1997; Franklin, 2005; Feest and Steinle, 2016). 
Exploratory experimentation is no doubt important, and the techniques of EDA can shed light on particular kinds of exploratory experimentation. Yet EDA also finds use in mature fields where phenomena have been stabilized and the basic theoretical menu is complete, suggesting EDA is related to but distinct from exploratory experimentation. I suggest instead that EDA is primarily concerned with finding hypotheses that would be easy to confirm or disconfirm if a proper experiment were to be done. The techniques associated with EDA are geared towards showing unexpected or striking effects. Whether these effects actually hold cannot be determined from the dataset: EDA also picks up artifacts of undirected data collection. Nevertheless, proper confirmatory experiments are often costly and time-consuming, and a good EDA shows where those costs are best spent. Importantly, EDA tells us whether a hypothesis is worth testing without telling us whether it is likely to be true: rather, it tells us that we are likely to get an answer for a suitably low cost. I link this idea to related work on tradeoffs between information costs in political economics (Stigler, 1961) and Bayesian search theory (Stone, 1976). The resulting position shows why previous positions have the plausibility they do, while providing a principled framework for developing and evaluating EDA techniques.
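The contrast between "worth testing" and "likely to be true" can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions, not the formalism of the abstract: the `Candidate` fields, the numbers, and the ratio "chance of a decisive answer per unit cost" are all hypothetical, loosely modeled on the way search theory allocates effort by detection probability relative to cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothesis flagged by EDA (every field is an illustrative assumption)."""
    name: str
    p_true: float      # prior credence that the hypothesis is true
    p_decisive: float  # chance a proper experiment settles it either way
    cost: float        # cost of running that experiment, arbitrary units


def answer_per_cost(c: Candidate) -> float:
    """Expected decisive answers per unit cost; deliberately ignores p_true."""
    return c.p_decisive / c.cost


def rank_for_testing(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates so that cheap, likely-to-be-settled tests come first."""
    return sorted(candidates, key=answer_per_cost, reverse=True)


# Hypothetical candidates: names and numbers are made up for illustration.
candidates = [
    Candidate("striking effect, cheap test", p_true=0.3, p_decisive=0.9, cost=10.0),
    Candidate("plausible effect, noisy test", p_true=0.8, p_decisive=0.2, cost=10.0),
    Candidate("striking effect, costly test", p_true=0.3, p_decisive=0.9, cost=100.0),
]

ranked = rank_for_testing(candidates)
for c in ranked:
    print(f"{c.name}: {answer_per_cost(c):.3f} decisive answers per unit cost")
```

In this toy setup the hypothesis with the highest credence (`p_true=0.8`) is not ranked first, because its test is unlikely to settle the question. That mirrors the claim that a good EDA indicates where an answer is cheaply available, whatever that answer turns out to be.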
Abstract ID: PSA2022767
Submission Type
Topic 1
Australian National University
