Relaxation Penalties and Priors for Plausible Modeling of Nonidentified Bias Sources
In designed experiments and surveys, known laws or design features provide
checks on the most relevant aspects of a model and identify the target
parameters. In contrast, in most observational studies in the health and social
sciences, the primary study data do not identify and may not even bound target
parameters. Discrepancies between target and analogous identified parameters
(biases) are then of paramount concern, which forces a major shift in modeling
strategies. Conventional approaches are based on conditional testing of
equality constraints, which correspond to implausible point-mass priors. When
these constraints are not identified by available data, however, no such
testing is possible. In response, implausible constraints can be relaxed into
penalty functions derived from plausible prior distributions. The resulting
models can be fit within familiar full or partial likelihood frameworks. The
absence of identification renders all analyses part of a sensitivity analysis.
In this view, results from single models are merely examples of what might be
plausibly inferred. Nonetheless, just one plausible inference may suffice to
demonstrate inherent limitations of the data. Points are illustrated with
misclassified data from a study of sudden infant death syndrome. Extensions to
confounding, selection bias and more complex data structures are outlined.
Comment: Published at http://dx.doi.org/10.1214/09-STS291 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
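To make the relaxation idea concrete, here is a minimal Python sketch (not the paper's code; the misclassification setup, the normal priors, and all numeric values are assumptions for illustration): instead of fixing a nonidentified sensitivity/specificity pair by an equality constraint, each parameter is penalized toward a plausible prior, and the penalized log-likelihood is maximized with standard tools.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, logit
from scipy.stats import norm

# Observed (misclassified) exposure counts: x exposed out of n (invented data).
x, n = 320, 1000

# Assumed priors for sensitivity and specificity on the logit scale.
# The priors replace point-mass constraints on these nonidentified
# parameters with smooth relaxation penalties.
prior_mean_se, prior_mean_sp = logit(0.90), logit(0.95)
prior_sd = 0.5  # assumed prior standard deviation on the logit scale

def neg_penalized_loglik(theta):
    """Negative binomial log-likelihood plus prior-derived penalties."""
    logit_p, logit_se, logit_sp = theta
    p, se, sp = expit(logit_p), expit(logit_se), expit(logit_sp)
    # Apparent prevalence induced by true prevalence plus misclassification.
    p_star = se * p + (1.0 - sp) * (1.0 - p)
    loglik = x * np.log(p_star) + (n - x) * np.log(1.0 - p_star)
    # Relaxation penalties = log prior densities (normal priors, assumed).
    penalty = (norm.logpdf(logit_se, prior_mean_se, prior_sd)
               + norm.logpdf(logit_sp, prior_mean_sp, prior_sd))
    return -(loglik + penalty)

fit = minimize(neg_penalized_loglik, x0=[0.0, prior_mean_se, prior_mean_sp])
print("corrected prevalence estimate:", expit(fit.x[0]))
```

Because the data alone pin down only the apparent prevalence, the priors supply the two extra constraints needed for a unique maximizer; changing them and refitting is exactly the sensitivity analysis the abstract describes.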
Comment: The Need for Syncretism in Applied Statistics
Comment on "The Need for Syncretism in Applied Statistics" [arXiv:1012.1161]Comment: Published in at http://dx.doi.org/10.1214/10-STS308A the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
The causal foundations of applied probability and statistics
Statistical science (as opposed to mathematical statistics) involves far more
than probability theory, for it requires realistic causal models of data
generators - even for purely descriptive goals. Statistical decision theory
requires more causality: Rational decisions are actions taken to minimize costs
while maximizing benefits, and thus require explication of causes of loss and
gain. Competent statistical practice thus integrates logic, context, and
probability into scientific inference and decision using narratives filled with
causality. This reality was seen and accounted for intuitively by the founders
of modern statistics, but was not well recognized in the ensuing statistical
theory (which focused instead on the causally inert properties of probability
measures). Nonetheless, both statistical foundations and basic statistics can
and should be taught using formal causal models. The causal view of statistical
science fits within a broader information-processing framework which
illuminates and unifies frequentist, Bayesian, and related probability-based
foundations of statistics. Causality theory can thus be seen as a key component
connecting computation to contextual information, not extra-statistical but
instead essential for sound statistical training and applications.
Comment: 22 pages; in press for Dechter, R., Halpern, J., and Geffner, H., eds., Probabilistic and Causal Inference: The Works of Judea Pearl, ACM Books.
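The claim that basic statistics can be taught through formal causal models can be illustrated with a toy simulation (a sketch under invented assumptions; the structural equations, effect size, and sample size below are not from the paper): when random assignment is the known causal mechanism generating the data, the reference distribution of a test statistic is obtained by re-running that mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# An illustrative formal causal model of the data generator:
# randomized treatment Z, outcome Y (structural equations invented).
def generate(n, effect):
    z = rng.binomial(1, 0.5, n)                  # physical randomization
    y = 0.3 * rng.normal(size=n) + effect * z    # outcome mechanism
    return z, y

z, y = generate(200, effect=0.2)
observed = y[z == 1].mean() - y[z == 0].mean()

# Because assignment is the known causal mechanism, the "no effect"
# reference distribution is computed by re-running that mechanism
# (a randomization test).
null = np.array([
    y[perm == 1].mean() - y[perm == 0].mean()
    for perm in (rng.permutation(z) for _ in range(5000))
])
p_value = np.mean(np.abs(null) >= abs(observed))
print("randomization p-value:", p_value)
```

The point of the sketch is that the statistic's justification comes from the causal story (physical randomization), not from the probability calculus alone.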
Divergence vs. Decision P-values: A Distinction Worth Making in Theory and Keeping in Practice
There are two distinct definitions of 'P-value' for evaluating a proposed
hypothesis or model for the process generating an observed dataset. The
original definition starts with a measure of the divergence of the dataset from
what was expected under the model, such as a sum of squares or a deviance
statistic. A P-value is then the ordinal location of the measure in a reference
distribution computed from the model and the data, and is treated as a
unit-scaled index of compatibility between the data and the model. In the other
definition, a P-value is a random variable on the unit interval whose
realizations can be compared to a cutoff alpha to generate a decision rule with
known error rates under the model and specific alternatives. It is commonly
assumed that realizations of such decision P-values always correspond to
divergence P-values. But this need not be so: Decision P-values can violate
intuitive single-sample coherence criteria where divergence P-values do not. It
is thus argued that divergence and decision P-values should be carefully
distinguished in teaching, and that divergence P-values are the relevant choice
when the analysis goal is to summarize evidence rather than implement a
decision rule.
Comment: 49 pages. Scandinavian Journal of Statistics 2023, issue 1, with discussion and rejoinder.
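The two definitions can be placed side by side in code (a sketch with invented data; the choice of a binomial deviance statistic and an exact two-sided tail statistic is illustrative, not taken from the paper):

```python
import numpy as np
from scipy.stats import chi2, binom

# Invented data and model: x successes in n trials, hypothesized p0.
x, n, p0 = 61, 100, 0.5

# Divergence P-value: the ordinal location of a divergence measure
# (here a binomial deviance) in its reference distribution under the
# model, used as a unit-scaled compatibility index.
p_hat = x / n
deviance = 2 * (x * np.log(p_hat / p0)
                + (n - x) * np.log((1 - p_hat) / (1 - p0)))
p_div = chi2.sf(deviance, df=1)
print("divergence P-value:", p_div)

# Decision P-value: a random variable on [0, 1] compared to a cutoff
# alpha to yield a rule with known error rates under the model (here an
# exact two-sided tail probability, used purely as a decision input).
k = np.arange(n + 1)
pmf = binom.pmf(k, n, p0)
p_dec = pmf[pmf <= binom.pmf(x, n, p0)].sum()
alpha = 0.05
print("decision:", "reject" if p_dec <= alpha else "retain")
```

The two numbers need not agree, which is the abstract's point: a realization used inside a decision rule is not automatically a compatibility measure.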
Connecting Simple and Precise P-values to Complex and Ambiguous Realities
Mathematics is a limited component of solutions to real-world problems, as it
expresses only what is expected to be true if all our assumptions are correct,
including implicit assumptions that are omnipresent and often incorrect.
Statistical methods are rife with implicit assumptions whose violation can be
life-threatening when results from them are used to set policy. Among them are
that there is human equipoise or unbiasedness in data generation, management,
analysis, and reporting. These assumptions correspond to levels of cooperation,
competence, neutrality, and integrity that are absent more often than we would
like to believe.
Given this harsh reality, we should ask what meaning, if any, we can assign
to the P-values, 'statistical significance' declarations, 'confidence'
intervals, and posterior probabilities that are used to decide what and how to
present (or spin) discussions of analyzed data. By themselves, P-values and CIs
do not test any hypothesis, nor do they measure the significance of results or
the confidence we should have in them. The sense otherwise is an ongoing
cultural error perpetuated by large segments of the statistical and research
community via misleading terminology.
So-called 'inferential' statistics can only become contextually interpretable
when derived explicitly from causal stories about the real data generator (such
as randomization), and can only become reliable when those stories are based on
valid and public documentation of the physical mechanisms that generated the
data. Absent these assurances, traditional interpretations of statistical
results become pernicious fictions that need to be replaced by far more
circumspect descriptions of data and model relations.
Comment: 25 pages. Body of text to appear as a rejoinder in the Scandinavian Journal of Statistics.
Epidemiologic measures and policy formulation: lessons from potential outcomes
This paper provides a critique of the common practice in the health-policy literature of focusing on hypothetical outcome removal at the expense of intervention analysis. The paper begins with an introduction to measures of causal effects within the potential-outcomes framework, focusing on underlying conceptual models, definitions and drawbacks of special relevance to policy formulation based on epidemiologic data. It is argued that, for policy purposes, one should analyze intervention effects within a multivariate-outcome framework to capture the impact of major sources of morbidity and mortality. This framework can clarify what is captured and missed by summary measures of population health, and it shows that the concept of a summary measure can and should be extended to multidimensional indices.
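A toy potential-outcomes calculation shows why the multivariate-outcome framing matters (all risks, the intervention, and the sample are invented for this sketch): an intervention can reduce one cause-specific risk while raising another, which a single summary index would hide.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical potential outcomes for two causes of death under
# "no intervention" (a=0) and "intervention" (a=1); all risks invented.
y0 = rng.binomial(1, [0.06, 0.04], size=(n, 2))  # cause 1, cause 2
y1 = rng.binomial(1, [0.03, 0.05], size=(n, 2))  # intervention shifts both

# Per-cause causal risk differences: the intervention helps cause 1
# but harms cause 2, a trade-off a scalar summary would obscure.
rd = y1.mean(axis=0) - y0.mean(axis=0)
print("risk differences (cause 1, cause 2):", rd)

# Any scalar summary (e.g., total mortality) is a choice of weights
# over this vector, which is the case for multidimensional indices.
print("total-mortality summary:", rd.sum())
```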