Measuring The Human: New Developments
Sterlings 3
Nov 12, 2022, 09:00 AM - 11:45 AM (America/New_York)

Although measurement is widespread across the human sciences, the reliability of measurement in these disciplines is often contested. Philosophers of science have developed conceptual models for how measurement practice progresses in the natural sciences, highlighting in particular the virtuous co-development of theoretical understanding and measurement procedures. The extent to which these accounts of measurement are applicable outside the natural sciences, however, remains unclear. Measurement in the human sciences faces a number of specific challenges related to the peculiarities of the phenomena under study. For instance, since nomological networks do not abound in the human sciences, measurement has fewer theoretical resources to draw on. Moreover, human scientists and philosophers of science debate the very measurability of the complex and multidimensional properties of interest in these disciplines. Finally, many of the properties measured in the human sciences are value-laden and context-dependent, which raises questions about the possibility of standardized measurements that remain valid across different contexts and distinct ethical grounds. In an attempt to enrich philosophical accounts of measurement practice in the human sciences, this symposium addresses these challenges and evaluates scientists' strategies for dealing with them. The symposium's participants range from early-career and mid-career scholars to senior researchers, and include major contributors to the philosophy of measurement. Professor David Sherry will give brief comments after each talk.

Measurement, Hermeneutics and Standardization
Symposium | Measurement | 09:00 AM - 11:45 AM (America/New_York)
In contemporary philosophy of measurement, prominent philosophers (van Fraassen 2008; Chang 2004; Tal 2011) have explicitly or implicitly recognized the role the hermeneutic circle plays in measurement. Specifically, they have recognized its role in what is sometimes referred to as the “coordination problem”. Yet in these accounts the hermeneutic aspect of measurement is often minimized, giving way to standardization, modeling, and other concerns. In this essay I discuss the tension between the hermeneutics and the standardization of measurement and offer an alternative account. On my account, the hermeneutic circle is the constant companion of measurement, with standardization making only time-limited appearances. The coordination problem asks how we imbue our measuring instruments with empirical significance. In other words, how do we coordinate our measuring instruments with the phenomena we want them to assess? In the empirical literature on measurement, the coordination problem is sometimes discussed in terms of validity, i.e., ensuring that a measuring instrument measures what it is intended to measure. The problem associated with coordination (or validity) is that it confronts a circle: if I want to know whether my measuring instrument does a good job of capturing the phenomenon of interest--say temperature, humidity, or quality of life--then it seems that I already need to know a great deal about temperature, humidity, or quality of life. I need to know, for instance, how temperature fluctuates across locations or people at a single point in time, or how quality of life changes with disease trajectory. Yet this information is precisely what the measuring instrument is designed to provide. So how can we ever coordinate our instruments? To answer this question, I examine Hasok Chang’s discussion of coherentism in measurement. As I will illustrate, his proposal has much in common with philosophical hermeneutics (Gadamer 2004); nonetheless, it emphasizes the stabilization of the hermeneutic circle over time. We might think of this stabilization as a point in time when we know enough about the phenomenon of interest that all the questions we want to ask (for a particular purpose) are answered by the measuring instrument. Once we reach stabilization, if the measuring instrument gives us an answer we don’t expect, we tend to call it error or bias. Achieving stability usually means that the phenomenon of interest can be standardized, and at least for some metrologists, measurement has been achieved. Yet when we look closer, standards get revised, some phenomena are never standardized, some measures are never stabilized, and questions of coordination continue to haunt measurement well beyond their sell-by date. What is going on? I suggest that the quintessence of measurement is not standardization, but rather hermeneutic dialogue. Sometimes this dialogue becomes stagnant, and stability and standardization ensue. But this is the exception, not the rule. Indeed, scientific progress relies on this dialogue.
Presenters
Leah McClimans
Speaker, University of South Carolina
Concepts of inequality and their measurement
Symposium | Measurement | 09:00 AM - 11:45 AM (America/New_York)
Inequality measurements are widely used by scientists and policy makers. Social scientists use them to analyze the global distribution of income and its trends over time. In policymaking, inequality measurements help inform redistributive policies at the national level and set the agenda for international development and foreign aid. Inequality measurements are expected to arbitrate objectively in the design, selection, and implementation of policy in these areas. The measurement of inequality, however, is far from straightforward, and scientists disagree about the best way to conceptualize inequality and the most appropriate method for measuring it. As a result, the policies based on these measurements are also called into question. One of the main questions is what exactly should be measured. While measurements typically focus on income or wealth inequality, there is increasing awareness that inequality is multidimensional and that other aspects of people’s well-being (such as health, education, and political freedoms) should be measured too. Moreover, scientists have stressed the importance of measuring inequality of opportunities rather than looking merely at inequality of outcomes, and have highlighted the relevance of investigating people’s subjective perception of economic disparities for designing successful inequality-reducing policies. The problem is that no measurement can take into account all aspects of inequality at the same time, and scientists disagree about which aspects should be taken into account and why. Measurement practice requires scientists to find context-dependent compromises between conceptual and procedural desiderata. As a consequence, scientific practice relies on a variety of narrow, contextual concepts, and this raises questions about using the outcomes of these measurements outside the narrow scope for which they were initially designed. This paper investigates how these narrow concepts of inequality are related to each other and to the broader, multidimensional notions that implementers are interested in. By looking in particular at the relations between subjective and objective measurements of inequality, I highlight the challenges that arise when investigating the relations between inequality parameters measured using different methodologies. However, I also defend the idea that conceptual analogies can be used to establish higher-order relations between distinct dimensions of inequality. While inequality is measured differently across contexts, there is a sense in which these measurements are all related to a common underlying concept of inequality, which can provide the basis for comparison and aggregation. This highlights the need for a deeper theoretical understanding of how the multiple dimensions of inequality are related to each other.
Presenters
Alessandra Basso
Speaker, University of Cambridge
Is Measurement in the Social Sciences Doomed? A Response to Joel Michell
Symposium | Measurement | 09:00 AM - 11:45 AM (America/New_York)
Whether widely used measures in the human sciences--e.g., measures of intelligence, happiness, empowerment, depression, etc.--count as quantitative remains a battleground. Practitioners commonly analyze their data assuming that their measures are quantitative, but many methodologists reject this presupposition. Other authors acknowledge that current measures might not be strictly quantitative but, taking inspiration from recent philosophy of measurement, express optimism about future human science measurements. Is the optimism of the latter camp warranted? Joel Michell’s more recent work (2012) provides reasons to the contrary. He argues not only that current measures aren’t quantitative, but that the attributes at stake (intelligence, etc.) are themselves not quantitative. Hence, these attributes cannot (and thus will not) afford quantitative measurement. Michell’s influential argument draws from a long tradition (including von Kries and Keynes), but I focus on Michell’s argument because its scope is wider. My goal is to demonstrate that his argument fails to show that common human science attributes are not quantitative. Michell argues that the key feature indicating that attributes are not quantitative is their lack of “pure homogeneity.” When we consider the different degrees of some quantity--e.g., 3 cm, 4 cm, and 6 cm--we realize that they are all degrees of the very same kind; they differ only quantitatively, not qualitatively. The same is true for the differences between these degrees: the interval between 6 cm and 4 cm and the interval between 4 cm and 3 cm don’t differ in kind; their only difference is that the former is twice the latter. In contrast, in non-quantitative (but ordinal) attributes, says Michell, we don’t observe pure homogeneity: although we can order different degrees, we cannot order the differences between these degrees. Crucially, Michell’s point (in this more recent work) is not epistemic. His claim is that these differences do not stand in ordering relations because the attribute is not purely homogeneous (i.e., the differences between degrees are qualitatively different). Michell believes common attributes in the human sciences are heterogeneous in this sense. He illustrates the argument with the attribute ‘functional independence’: he considers a typical scale for measuring functional independence and concludes that functional independence is merely ordinal, since the differences between the degrees indicated on the scale are qualitatively different. However, Michell’s argument misses the mark. We should distinguish between the actual target of our measurements--the theoretical attribute to be measured, the ‘measurand’--and the (empirically accessible) measuring attribute we use to infer values of the measurand. This distinction, together with the working assumption that (some) measurands are quantitative, lies behind the psychometricians’ understanding of measurement that Michell targets. These assumptions are also part of influential contemporary accounts of measurement, such as Eran Tal’s and David Sherry’s. Yet Michell’s argument overlooks the distinction, conflating the measurand with the measuring attribute: it demonstrates heterogeneity only in the scale for measuring functional independence, leaving open whether functional independence itself is heterogeneous. I show that the former doesn’t entail the latter and suggest that this conclusion generalizes to other attributes.
Presenters
Cristian Larroulet Philippi
University of Cambridge
Theory and measurement in psychology
Symposium | Measurement | 09:00 AM - 11:45 AM (America/New_York)
In recent years, more and more authors have called attention to the fact that the theoretical foundations of psychology are shaky. This has led to a lively debate on the “theory crisis” in psychology, which is argued to be more fundamental than the replication crisis that has received far more attention. In this talk, I first consider why there are so few good theories in psychology and why psychology differs in this respect from other fields, and then argue that the lack of good psychological theories also creates fundamental challenges for psychological measurement. First, there has been insufficient attention to the conceptual clarity of psychological constructs. The same construct is often operationalized in wildly divergent ways in different fields, or different constructs are created for the same underlying phenomenon. For example, there are over 30 different constructs related to “perceived control”. The result is that psychology is permeated with numerous constructs and concepts of insufficient clarity, which is a problem for theory construction, as concepts are the building blocks of theories. Moreover, this lack of conceptual clarity is closely linked to problems of psychological measurement: it is hard to provide valid measurements of constructs that are not well defined, as the discussion of the measurement of happiness and well-being illustrates. Strikingly, most studies in psychology report little or no validity evidence for the constructs used. Second, psychological states are difficult to intervene on directly, and the effects of interventions are hard to track reliably, which poses great challenges for establishing psychological causes or mechanisms. More specifically, interventions on psychological variables such as affective states or symptoms are not “surgical” but “fat-handed” in the sense that they change several variables at once. This makes it extremely difficult to infer causal relationships between psychological variables, and insofar as theories should track causal relationships, it hinders the development of good psychological theories. In addition, it is widely thought that valid measurement requires establishing a causal relationship between the attribute that is measured (e.g., temperature) and the measurement outcome (e.g., thermometer readings). Insofar as this is the case, the problem of psychological interventions is also directly a problem for psychological measurement. In light of these issues, it is understandable that psychological theories tend to come and go without much cumulative progress, and that the very possibility of psychological measurement continues to be debated. However, I will end the talk on a positive note by considering some ways of making progress in psychology: focusing more on conceptual clarification instead of just statistics and experiments, and embracing a holistic and pragmatic approach in which measurement, theorizing, and conceptual clarification are seen as necessary parts of an ongoing iterative cycle.
Presenters
Markus Eronen
University of Groningen
Fitness for purpose in psychometrics
Symposium | Measurement | 09:00 AM - 11:45 AM (America/New_York)
Much recent philosophical attention has been given to the concept of validity in psychometrics (Alexandrova 2017; Angner 2013; McClimans 2010). By contrast, the question of whether and when a psychometric instrument is fit for its intended purpose has been largely neglected. Here we argue that fitness for purpose is a distinct feature of a psychometric measure that does not automatically follow from its validity and is established by distinct sources of evidence. We focus on applications of psychometrics in healthcare, and specifically on the use of patient-reported outcome measures (PROMs) in mental healthcare. PROMs such as the Patient Health Questionnaire (PHQ-9) and the Kessler Psychological Distress Scale (K-10) are routinely used by mental health service providers for various purposes, including screening patients, assisting with diagnosis, recommending treatment plans, tracking patient progress, and assessing overall quality of care. Health outcomes researchers acknowledge that a PROM designed and validated for one purpose and population, such as screening in adults, may not be fit to serve another, such as tracking patient progress in youth. This context-sensitivity is partially due to differences in patient characteristics and to the fact that different clinical decisions can require different kinds of evidence. Health outcomes researchers typically deal with this context-sensitivity by ‘re-validating’ PROMs against ‘gold standards’ of evidence, e.g., by adjusting the severity thresholds of a screening tool against the outcomes of clinical interviews in new settings. This paper argues that ‘re-validation’ techniques are inadequate for establishing fitness-for-purpose across contexts, because they are based on an overly narrow concept of fitness-for-purpose. Fitness-for-purpose in psychometrics is not only an epistemic criterion but also an ethical one, namely, the condition of fit between the meanings and uses of a measure and the values and aims of stakeholders. Consequently, evaluating fitness-for-purpose requires a thorough examination of the ethics of measurement. We substantiate our claims with the results of a recent project in which we collaborated with psychometricians, clinicians, and young people. As part of this collaboration, philosophers of science helped develop a training program in measurement for clinicians working at Foundry, a network of integrated mental health clinics for people aged 12-24 in British Columbia. Our research revealed a gap between psychometric evaluation techniques, which focus on statistical properties, and the need of clinicians and patients to identify measures that promote ethical and social values, such as inclusiveness, empowerment, and collaboration. Our analysis highlights the need for a normative theory of measurement as a foundation for measure evaluation in psychometrics. Although some validation theorists have paid close attention to the ethics of measurement, they have overemphasized the importance of avoiding negative social consequences. Building on McClimans (2010), we show that fitness-for-purpose is a stronger requirement than Messick’s ‘consequential validity’ and involves using measurement as a tool for genuine dialogue between clinician and patient.
Presenters
Sebastian Rodriguez Duque
McGill University
Co-Authors
Eran Tal
McGill University