Appreciating the role of measurement (and its difficulties) in the validity of scientific research claims
Rochelle Tractenberg
“Big Data” is emerging as a priority for research funders around the world, and the ability to mine large, multidimensional datasets is at a premium. In genomics, proteomics, metabolomics, and other informatics approaches to biomedical data, one feature is critical yet almost entirely understudied: the accuracy of the endpoint or outcome against which informatics results are compared, benchmarked, or validated. Many outcomes are assessed highly accurately (e.g., up-regulation, genotype, the presence or absence of biomarkers for heart attack, cancer, and other diseases), and informatics applications to these outcomes are straightforward (although highly complex). However, other outcomes of equal interest are assessed far less accurately, and scientific claims based on these less accurately determined outcomes are far less valid than might be hoped.
This talk will explore the implications of measurement weaknesses that complicate the application of mining and modeling techniques to biomedical challenges arising in psychology, neurology, pain, and other medical domains, including the study of Alzheimer’s disease (AD) and traumatic brain injury (TBI). In conditions like these, phenotypes evolve (in TBI on a very short time scale; in AD on a much longer one), and assessing symptoms, or even establishing a diagnosis, requires highly sophisticated, multidimensional, and essentially unscalable team-based approaches. Detecting changes in key symptoms is extremely difficult: in TBI because of the highly variable time scale of injury evolution (physiology, injury parameters, and individual factors), and in both TBI and AD because of high levels of measurement error in symptom detection and instrumentation. This phenotypic variability and these measurement weaknesses affect the potential utility of Big Data, as well as the applicability of methods and techniques developed specifically for Big Data. This talk will outline these measurement weaknesses and discuss how they can undermine the validity of claims based on current techniques; the same concerns extend to the application of Big Data techniques.
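As a minimal illustration of the core point (a sketch appended here, not part of the talk itself), the simulation below models classical measurement error in an outcome: a genuinely informative predictor appears progressively weaker as the measured outcome grows noisier, so associations estimated against an unreliably measured endpoint understate the true relationship. The variable names, effect size, and noise levels are illustrative assumptions, not values from the talk.

    # Sketch: classical measurement error in an outcome attenuates the
    # observed association with any predictor (illustrative values only).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    true_outcome = rng.normal(size=n)                           # latent phenotype
    predictor = 0.6 * true_outcome + 0.8 * rng.normal(size=n)   # true correlation 0.6

    for noise_sd in (0.0, 0.5, 1.0, 2.0):                       # increasing measurement error
        measured = true_outcome + noise_sd * rng.normal(size=n) # observed (noisy) endpoint
        reliability = 1.0 / (1.0 + noise_sd**2)                 # Var(true) / Var(measured)
        r = np.corrcoef(predictor, measured)[0, 1]
        print(f"noise_sd={noise_sd:.1f}  reliability={reliability:.2f}  observed r={r:.3f}")

Under these assumptions the observed correlation shrinks from roughly 0.60 toward 0.27 as noise_sd rises to 2.0, tracking the true correlation multiplied by the square root of the outcome's reliability; a model validated against such an endpoint would look correspondingly, and misleadingly, weak (or, with optimistic tuning, spuriously strong).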