Error-free milestones in error-prone measurements
A predictor variable or dose that is measured with substantial error may
possess an error-free milestone, such that it is known with negligible error
whether the value of the variable is to the left or right of the milestone.
Such a milestone provides a basis for estimating a linear relationship between
the true but unknown value of the error-free predictor and an outcome, because
the milestone creates a strong and valid instrumental variable. The inferences
are nonparametric and robust, and in the simplest cases, they are exact and
distribution free. We also consider multiple milestones for a single predictor
and milestones for several predictors whose partial slopes are estimated
simultaneously. Examples are drawn from the Wisconsin Longitudinal Study, in
which a BA degree acts as a milestone for sixteen years of education, and the
binary indicator of military service acts as a milestone for years of service.
(Published in the Annals of Applied Statistics, http://dx.doi.org/10.1214/08-AOAS233, by the Institute of Mathematical Statistics, http://www.imstat.org/aoas/.)
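The milestone-as-instrument idea can be stated compactly. Below is a minimal Python sketch of the textbook Wald ratio estimator that a milestone makes available; the variable names and the simulated education/earnings data are hypothetical, and the paper's own exact, distribution-free inferences go beyond this simple point estimate:

```python
import numpy as np

def milestone_iv_slope(x_measured, y, z):
    """Wald-type instrumental-variable slope for an error-prone predictor,
    using an error-free milestone indicator z (e.g., z = holds a BA degree,
    standing in for 16 years of education)."""
    x = np.asarray(x_measured, float)
    y = np.asarray(y, float)
    z = np.asarray(z, bool)
    # Ratio of the difference in mean outcome across the milestone to the
    # difference in mean (error-prone) predictor across the milestone.
    return (y[z].mean() - y[~z].mean()) / (x[z].mean() - x[~z].mean())

# Hypothetical illustration: years of education measured with substantial
# error, but BA attainment (education >= 16) recorded essentially without
# error. The true slope below is 0.08.
rng = np.random.default_rng(1)
educ = rng.integers(10, 21, 500).astype(float)
log_earn = 0.08 * educ + rng.normal(0, 0.5, 500)
educ_noisy = educ + rng.normal(0, 2.0, 500)   # error-prone measurement
print(milestone_iv_slope(educ_noisy, log_earn, z=educ >= 16))
```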
Isolation in the construction of natural experiments
A natural experiment is a type of observational study in which treatment
assignment, though not randomized by the investigator, is plausibly close to
random. A process that assigns treatments in a highly nonrandom, inequitable
manner may, in rare and brief moments, assign aspects of treatments at random
or nearly so. Isolating those moments and aspects may extract a natural
experiment from a setting in which treatment assignment is otherwise quite
biased, far from random. Isolation is a tool that focuses on those rare, brief
instances, extracting a small natural experiment from otherwise useless data.
We discuss the theory behind isolation and illustrate its use in a reanalysis
of a well-known study of the effects of fertility on workforce participation.
Whether a woman becomes pregnant at a certain moment in her life and whether
she brings that pregnancy to term may reflect her aspirations for family,
education and career, the degree of control she exerts over her fertility, and
the quality of her relationship with the father; moreover, these aspirations
and relationships are unlikely to be recorded with precision in surveys and
censuses, and they may confound studies of workforce participation. However,
given that a woman is pregnant and will bring the pregnancy to term, whether
she will have twins or a single child is, to a large extent, simply luck. Given
that a woman is pregnant at a certain moment, the differential comparison of
two types of pregnancies on workforce participation, twins or a single child,
may be close to randomized, not biased by unmeasured aspirations. In this
comparison, we find in our case study that mothers of twins had more children
but only slightly reduced workforce participation, approximately 5% less time
at work for an additional child.
(Published in the Annals of Applied Statistics, http://dx.doi.org/10.1214/14-AOAS770, by the Institute of Mathematical Statistics, http://www.imstat.org/aoas/.)
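A minimal sketch of the isolated comparison, under the assumption stated above that, within the isolated subpopulation (pregnant and carrying to term), twinning is close to random; the variable names are hypothetical, and this omits the matching and sensitivity analysis an actual reanalysis would use:

```python
import numpy as np

def isolated_twin_comparison(twins, outcome, n_perm=10_000, seed=0):
    """Within the isolated subpopulation, compare an outcome between twin
    and singleton births, treating twinning as close to random.

    twins   : boolean array, True for a twin birth (the 'luck' aspect)
    outcome : e.g., fraction of subsequent years spent in the workforce
    """
    twins = np.asarray(twins, bool)
    y = np.asarray(outcome, float)
    observed = y[twins].mean() - y[~twins].mean()

    # If twinning is effectively random in this subpopulation, relabeling
    # gives a Monte Carlo approximation to the randomization distribution.
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for b in range(n_perm):
        p = rng.permutation(twins)
        null[b] = y[p].mean() - y[~p].mean()
    p_value = (np.abs(null) >= abs(observed)).mean()
    return observed, p_value
```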
Cross-screening in observational studies that test many hypotheses
We discuss observational studies that test many causal hypotheses, either
hypotheses about many outcomes or many treatments. To be credible, an
observational study that tests many causal hypotheses must demonstrate that its
conclusions are neither artifacts of multiple testing nor of small biases from
nonrandom treatment assignment. In a sense that needs to be defined carefully,
hidden within a sensitivity analysis for nonrandom assignment is an enormous
correction for multiple testing: in the absence of bias, it is extremely
improbable that multiple testing alone would create an association insensitive
to moderate biases. We propose a new strategy called "cross-screening",
different from but motivated by recent work of Bogomolov and Heller on
replicability. Cross-screening splits the data in half at random, uses the
first half to plan a study carried out on the second half, then uses the second
half to plan a study carried out on the first half, and reports the more
favorable conclusions of the two studies, using the Bonferroni inequality to
correct for having done two studies. If the two studies happen to concur,
then they achieve Bogomolov-Heller replicability; however, importantly,
replicability is not required for strong control of the family-wise error rate,
and either study alone suffices for firm conclusions. In randomized studies
with a few hypotheses, cross-screening is not an attractive method when
compared with conventional methods of multiplicity control, but it can become
attractive when hundreds or thousands of hypotheses are subjected to
sensitivity analyses in an observational study. We illustrate the technique by
comparing 46 biomarkers in individuals who consume large quantities of fish
versus little or no fish.
(33 pages, 2 figures, 5 tables.)
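A simplified sketch of the cross-screening recipe described above: split the matched pairs at random, let each half nominate and direct the hypotheses to be tested on the other half, and spend alpha/2 on each of the two studies. The particular choices here (keeping the top n_keep hypotheses by |t|, Wilcoxon signed-rank tests, Bonferroni within each study) are illustrative assumptions; the paper's version plans and reports sensitivity analyses rather than plain randomization tests:

```python
import numpy as np
from scipy import stats

def cross_screen(pair_diffs, alpha=0.05, n_keep=5, seed=0):
    """Cross-screening, simplified: each random half nominates its n_keep
    most promising hypotheses and their one-sided directions; those are
    tested on the other half. Each of the two studies runs at level
    alpha/2 (Bonferroni for having done two studies), with Bonferroni
    again across its n_keep tests.

    pair_diffs : (n_pairs, K) treated-minus-control differences, one
                 column per causal hypothesis (outcome or treatment).
    """
    d = np.asarray(pair_diffs, float)
    n = d.shape[0]
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    halves = [idx[: n // 2], idx[n // 2:]]

    rejected = set()
    for plan, test in [(0, 1), (1, 0)]:
        dp = d[halves[plan]]
        # Screening half: rank hypotheses by |t| and fix each test's side.
        t = dp.mean(0) / (dp.std(0, ddof=1) / np.sqrt(len(dp)))
        for k in np.argsort(-np.abs(t))[:n_keep]:
            side = "greater" if t[k] > 0 else "less"
            p = stats.wilcoxon(d[halves[test], k], alternative=side).pvalue
            if p <= alpha / 2 / n_keep:
                rejected.add(int(k))
    return rejected
```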
Stronger instruments via integer programming in an observational study of late preterm birth outcomes
In an optimal nonbipartite match, a single population is divided into matched
pairs to minimize a total distance within matched pairs. Nonbipartite matching
has been used to strengthen instrumental variables in observational studies of
treatment effects, essentially by forming pairs that are similar in terms of
covariates but very different in the strength of encouragement to accept the
treatment. Optimal nonbipartite matching is typically done using network
optimization techniques that can be quick, running in polynomial time, but
these techniques limit the tools available for matching. Instead, we use
integer programming techniques, thereby obtaining a wealth of new tools not
previously available for nonbipartite matching, including fine and near-fine
balance for several nominal variables, forced near balance on means, and
optimal subsetting. We illustrate the methods in our ongoing study of outcomes
of late-preterm births in California, that is, births at 34 to 36 weeks of
gestation. Would lengthening the time in the hospital for such births reduce
the frequency of rapid readmissions? A straightforward comparison of babies who
stay for a shorter or longer time would be severely biased, because the
principal reason for a long stay is some serious health problem. We need an
instrument, something inconsequential and haphazard that encourages a shorter
or a longer stay in the hospital. It turns out that babies born at certain
times of day tend to stay overnight once with a shorter length of stay, whereas
babies born at other times of day tend to stay overnight twice with a longer
length of stay, and there is nothing particularly special about a baby who is
born at 11:00 pm.
(Published in the Annals of Applied Statistics, http://dx.doi.org/10.1214/12-AOAS582, by the Institute of Mathematical Statistics, http://www.imstat.org/aoas/.)
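A minimal sketch of optimal nonbipartite matching posed as a 0-1 integer program, here using scipy's generic MILP solver. The formulation (one binary variable per candidate pair, each unit covered by exactly one chosen pair) is the standard one; the point of the integer-programming route is that side constraints such as fine balance can be appended as further linear rows, a flexibility that pure network solvers lack:

```python
import numpy as np
from itertools import combinations
from scipy.optimize import milp, LinearConstraint, Bounds
from scipy.spatial.distance import pdist, squareform

def nonbipartite_match_ip(X):
    """Divide one population into pairs minimizing total within-pair
    covariate distance, as a 0-1 integer program.

    X : (n, p) covariate matrix with n even.
    Returns a list of matched index pairs (i, j).
    """
    n = len(X)
    D = squareform(pdist(np.asarray(X, float)))   # pairwise distances
    pairs = list(combinations(range(n), 2))       # one 0-1 variable per pair
    cost = np.array([D[i, j] for i, j in pairs])

    # Each unit must appear in exactly one chosen pair.
    A = np.zeros((n, len(pairs)))
    for v, (i, j) in enumerate(pairs):
        A[i, v] = A[j, v] = 1.0
    cons = LinearConstraint(A, lb=1.0, ub=1.0)
    # Side constraints (fine or near-fine balance, forced near balance on
    # means, optimal subsetting) would enter here as extra constraint rows.
    res = milp(cost, constraints=cons,
               integrality=np.ones(len(pairs)), bounds=Bounds(0.0, 1.0))
    return [pairs[v] for v in np.flatnonzero(res.x > 0.5)]
```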
Results of an aqueous source term model for a radiological risk assessment of the Drigg LLW Site, U.K.
A radionuclide source term model has been developed that simulates the biogeochemical evolution of the Drigg low-level waste (LLW) disposal site. The DRINK (DRIgg Near field Kinetic) model provides data regarding radionuclide concentrations in groundwater over a period of 100,000 years, which are used as input to assessment calculations for a groundwater pathway. The DRINK model also provides input to human intrusion and gaseous assessment calculations through simulation of the solid radionuclide inventory. These calculations are being used to support the Drigg post-closure safety case. The DRINK model considers the coupled interaction of the effects of fluid flow, microbiology, corrosion, chemical reaction, sorption and radioactive decay. It represents the first direct use of a mechanistic reaction-transport model in risk assessment calculations.
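The DRINK model itself is not reproduced here, but the following generic sketch illustrates the kind of coupled process such a reaction-transport model resolves: a single radionuclide advected in groundwater with linear sorption (retardation) and first-order radioactive decay, on an explicit upwind grid. Every parameter value below is a hypothetical placeholder, not a Drigg site value:

```python
import numpy as np

# Generic, greatly simplified illustration -- NOT the DRINK model itself.
L_m, T_yr = 100.0, 10_000.0   # domain length (m), simulated time (yr)
nx, nt = 200, 40_000
dx, dt = L_m / nx, T_yr / nt
v = 1.0                       # pore-water velocity (m/yr), hypothetical
R = 50.0                      # retardation factor from sorption, hypothetical
half_life = 5.7e3             # yr (order of C-14), hypothetical
lam = np.log(2.0) / half_life

c = np.zeros(nx)              # dissolved concentration along the flow path
for _ in range(nt):
    c[0] = 1.0                # fixed unit concentration at the waste source
    adv = -(v / R) * np.diff(c, prepend=c[0]) / dx   # retarded advection
    c = c + dt * (adv - lam * c)                     # transport minus decay
print(f"downstream concentration at the boundary: {c[-1]:.3e}")
```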
Using the Cross-Match Test to Appraise Covariate Balance in Matched Pairs
Having created a tentative matched design for an observational study, diagnostic checks are performed to see whether observed covariates exhibit reasonable balance, or alternatively whether further effort is required to improve the match. We illustrate the use of the cross-match test as an aid to appraising balance on high-dimensional covariates, and we discuss its close logical connections to the techniques used to construct matched samples. In particular, in addition to a significance level, the cross-match test provides an interpretable measure of high-dimensional covariate balance, specifically a measure defined in terms of the propensity score. An example from the economics of education is used to illustrate. In the example, imbalances in an initial match guide the construction of a better match. The better match uses a recently proposed technique, optimal tapered matching, that leaves certain possibly innocuous covariates imbalanced in one match but not in another, and yields a test of whether the imbalances are actually innocuous.
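A sketch of the cross-match statistic under stated assumptions: pool all subjects, form a minimum-distance nonbipartite matching on covariates alone, and count the pairs containing one treated and one control subject; too few cross pairs signals imbalance. The published test has an exact null distribution; the Monte Carlo relabeling used here is an illustrative stand-in:

```python
import numpy as np
import networkx as nx
from scipy.spatial.distance import pdist, squareform

def cross_match(X, treated, n_perm=2000, seed=0):
    """Cross-match count and an approximate one-sided p-value.

    X       : (n, p) covariate matrix, n even
    treated : boolean treatment indicator
    """
    X = np.asarray(X, float)
    z = np.asarray(treated, bool)
    n = len(X)
    D = squareform(pdist(X))

    # Minimum-distance perfect matching via maximum-weight matching on
    # negated distances; networkx handles general (nonbipartite) graphs.
    G = nx.Graph()
    for i in range(n):
        for j in range(i + 1, n):
            G.add_edge(i, j, weight=-D[i, j])
    pairs = nx.max_weight_matching(G, maxcardinality=True)

    def a1(labels):
        return sum(labels[i] != labels[j] for i, j in pairs)

    observed = a1(z)
    # The matching ignores treatment, so under balance the labels are
    # exchangeable: relabel at random to approximate the null distribution.
    rng = np.random.default_rng(seed)
    null = np.array([a1(rng.permutation(z)) for _ in range(n_perm)])
    p_value = (null <= observed).mean()   # small counts indicate imbalance
    return observed, p_value
```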
Constructed Second Control Groups and Attenuation of Unmeasured Biases
The informal folklore of observational studies claims that if an irrelevant observed covariate is left uncontrolled, say unmatched, then it will influence treatment assignment in haphazard ways, thereby diminishing the biases from unmeasured covariates. We prove a result along these lines: it is true, in a certain sense, to a limited degree, under certain conditions. Alas, the conditions are neither inconsequential nor easy to check in empirical work; indeed, they are often dubious, more often implausible. We suggest the result is most useful in the computerized construction of a second control group, where the investigator can see more in available data without necessarily believing the required conditions. One of the two control groups controls for the possibly irrelevant observed covariate; the other control group either leaves it uncontrolled or forces separation; therefore, the investigator views one situation from two angles under different assumptions. A pair of sensitivity analyses for the two control groups is coordinated by a weighted Holm or recycling procedure built around the possibility of slight attenuation of bias in one control group. Issues are illustrated using an observational study of the possible effects of cigarette smoking as a cause of increased homocysteine levels, a risk factor for cardiovascular disease. Supplementary materials for this article are available online.
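A minimal sketch of the weighted Holm with recycling step mentioned above, for the two coordinated sensitivity analyses; the weight w1 and level alpha are illustrative inputs, and p1, p2 would be upper-bound p-values from the sensitivity analyses in the two control groups:

```python
def weighted_holm_two(p1, p2, w1=0.6, alpha=0.05):
    """Weighted Holm with recycling for two hypotheses (weights w1, 1-w1).

    Each hypothesis is first tested at its weighted share of alpha; a
    rejected hypothesis passes its weight on, so the survivor is retested
    at the full level alpha. Familywise error is strongly controlled for
    any fixed w1. Here w1 might favor the control group whose bias is
    expected to be slightly attenuated -- a design choice made in advance.
    """
    reject1 = p1 <= w1 * alpha
    reject2 = p2 <= (1.0 - w1) * alpha
    if reject1 and not reject2:
        reject2 = p2 <= alpha       # recycle the first hypothesis's weight
    elif reject2 and not reject1:
        reject1 = p1 <= alpha       # recycle the second hypothesis's weight
    return reject1, reject2
```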
Clustered Treatment Assignments and Sensitivity to Unmeasured Biases in Observational Studies
Clustered treatment assignment occurs when individuals are grouped into clusters prior to treatment and whole clusters, not individuals, are assigned to treatment or control. In randomized trials, clustered assignments may be required because the treatment must be applied to all children in a classroom, or to all patients at a clinic, or to all radio listeners in the same media market. The most common cluster randomized design pairs 2S clusters into S pairs based on similar pretreatment covariates, then picks one cluster in each pair at random for treatment, the other cluster being assigned to control. Typically, group randomization increases sampling variability and so is less efficient, less powerful, than randomization at the individual level, but it may be unavoidable when it is impractical to treat just a few people within each cluster. Related issues arise in nonrandomized, observational studies of treatment effects, but in this case one must examine the sensitivity of conclusions to bias from nonrandom selection of clusters for treatment. Although clustered assignment increases sampling variability in observational studies, as it does in randomized experiments, it also tends to decrease sensitivity to unmeasured biases, and as the number of cluster pairs increases, the latter effect overtakes the former, dominating it when allowance is made for nontrivial biases in treatment assignment. Intuitively, a given magnitude of departure from random assignment can do more harm if it acts on individual students than if it is restricted to act on whole classes, because the bias is unable to pick the strongest individual students for treatment, and this is especially true if a serious effort is made to pair clusters that appeared similar prior to treatment. We examine this issue using an asymptotic measure, the design sensitivity, some inequalities that exploit convexity, simulation, and an application concerned with the flooding of villages in Bangladesh.
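For orientation, a sketch of the basic computation in such a sensitivity analysis, applied to cluster-pair differences: the one-sided sign-test p-value is bounded above by a binomial tail whose success probability reflects the permitted bias gamma in the within-pair odds of treatment, and reporting the largest gamma at which the bound stays below, say, 0.05 quantifies sensitivity. This is a standard Rosenbaum-style bound simplified to the sign test, not the paper's design-sensitivity calculations:

```python
from scipy import stats

def sign_test_upper_bound(pair_diffs, gamma=1.0):
    """Upper bound on the one-sided sign-test p-value for matched
    (cluster-)pair differences when the odds of treatment within a pair
    may be biased by at most a factor gamma; gamma = 1 recovers the usual
    randomization p-value.
    """
    d = [x for x in pair_diffs if x != 0]   # the sign test drops ties
    s = len(d)
    t_pos = sum(x > 0 for x in d)           # pairs with higher treated response
    p_max = gamma / (1.0 + gamma)           # worst-case per-pair probability
    return stats.binom.sf(t_pos - 1, s, p_max)   # P(Bin(s, p_max) >= t_pos)
```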
An Exact Test of Fit for the Gaussian Linear Model using Optimal Nonbipartite Matching
Fisher tested the fit of Gaussian linear models using replicated observations. We refine this method by (1) constructing near-replicates using an optimal nonbipartite matching and (2) defining a distance that focuses on predictors important to the model’s predictions. Near-replicates may not exist unless the predictor set is low-dimensional; the test addresses dimensionality by betting that model failures involve a subset of predictors important in the old fit. Despite using the old fit to pair observations, the test has exactly its stated level under the null hypothesis. Simulations show the test has reasonable power even when many spurious predictors are present.
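For orientation, a sketch of the classical Fisher lack-of-fit logic that the paper refines, treating matched pairs as if they were exact replicates; with near- rather than exact replicates the decomposition below is only approximate, which is precisely the gap the paper's exact test closes. The design matrix, response, and list of disjoint pairs are hypothetical inputs:

```python
import numpy as np
from scipy import stats

def lack_of_fit_f(X, y, pairs):
    """Replicate-based lack-of-fit F test with near-replicate pairs.

    Each disjoint pair (i, j) contributes (y_i - y_j)^2 / 2 with one
    degree of freedom to pure error; the remainder of the residual sum
    of squares from the least-squares fit measures lack of fit.
    """
    y = np.asarray(y, float)
    X1 = np.column_stack([np.ones(len(y)), np.asarray(X, float)])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    sse = float(np.sum((y - X1 @ beta) ** 2))
    df_res = len(y) - X1.shape[1]

    ss_pe = sum((y[i] - y[j]) ** 2 / 2.0 for i, j in pairs)  # pure error
    df_pe = len(pairs)
    ss_lof, df_lof = sse - ss_pe, df_res - df_pe             # lack of fit

    f_stat = (ss_lof / df_lof) / (ss_pe / df_pe)
    return f_stat, stats.f.sf(f_stat, df_lof, df_pe)
```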