Mechanistic Behavior of Single-Pass Instruction Sequences
Earlier work on program and thread algebra detailed the functional,
observable behavior of programs under execution. In this article we add the
modeling of unobservable, mechanistic processing, in particular processing due
to jump instructions. We model mechanistic processing preceding some further
behavior as a delay of that behavior; we borrow a unary delay operator from
discrete time process algebra. We define a mechanistic improvement ordering on
threads and observe that some threads do not have an optimal implementation.
Comment: 12 pages
Tuplix Calculus
We introduce a calculus for tuplices, which are expressions that generalize
matrices and vectors. Tuplices have an underlying data type for quantities that
are taken from a zero-totalized field. We start with the core tuplix calculus
CTC for entries and tests, which are combined using conjunctive composition. We
define a standard model and prove that CTC is relatively complete with respect
to it. The core calculus is extended with operators for choice, information
hiding, scalar multiplication, clearing and encapsulation. We provide two
examples of applications; one on incremental financial budgeting, and one on
modular financial budget design.
Comment: 22 pages
A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure
We often seek to estimate the impact of an exposure that occurs naturally or is
randomly assigned at the cluster level. For example, the literature on
neighborhood determinants of health continues to grow. Likewise, community
randomized trials are applied to learn about real-world implementation,
sustainability, and population effects of interventions with proven
individual-level efficacy. In these settings, individual-level outcomes are
correlated due to shared cluster-level factors, including the exposure, as well
as social or biological interactions between individuals. To flexibly and
efficiently estimate the effect of a cluster-level exposure, we present two
targeted maximum likelihood estimators (TMLEs). The first TMLE is developed
under a non-parametric causal model, which allows for arbitrary interactions
between individuals within a cluster. These interactions include direct
transmission of the outcome (i.e. contagion) and influence of one individual's
covariates on another's outcome (i.e. covariate interference). The second TMLE
is developed under a causal sub-model assuming the cluster-level and
individual-specific covariates are sufficient to control for confounding.
Simulations compare the alternative estimators and illustrate the potential
gains from pairing individual-level risk factors and outcomes during
estimation, while avoiding unwarranted assumptions. Our results suggest that
estimation under the sub-model can result in bias and misleading inference in
an observational setting. Incorporating working assumptions during estimation
is more robust than assuming they hold in the underlying causal model. We
illustrate our approach with an application to HIV prevention and treatment.
Covariate Adjustment for the Intention-to-Treat Parameter with Empirical Efficiency Maximization
In randomized experiments, the intention-to-treat parameter is defined as the difference in expected outcomes between groups assigned to treatment and control arms. There is a large literature focusing on how (possibly misspecified) working models can sometimes exploit baseline covariate measurements to gain precision, although covariate adjustment is not strictly necessary. In Rubin and van der Laan (2008), we proposed the technique of empirical efficiency maximization for improving estimation by forming nonstandard fits of such working models. Considering a more realistic randomization scheme than in our original article, we suggest a new class of working models for utilizing covariate information, show our method can be implemented by adding weights to standard regression algorithms, and demonstrate benefits over existing estimators through numerical asymptotic efficiency calculations and simulations.
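The basic idea of exploiting a (possibly misspecified) working model for covariate adjustment can be illustrated with a minimal sketch. This is a generic ANCOVA-style adjusted estimator on simulated data, not the empirical efficiency maximization fit itself; the data-generating process and coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
w = rng.normal(size=n)                  # baseline covariate
a = rng.integers(0, 2, size=n)          # randomized treatment assignment
y = 1.0 + 0.5 * a + 2.0 * w + rng.normal(size=n)   # true ITT effect = 0.5

# Unadjusted estimator: difference in arm means.
unadjusted = y[a == 1].mean() - y[a == 0].mean()

# Covariate-adjusted estimator via a linear working model (ANCOVA):
# regress y on [1, a, w] and read off the coefficient on a. Under
# randomization the estimator targets the ITT effect even if the
# working model is misspecified; a good fit mainly buys precision.
X = np.column_stack([np.ones(n), a, w])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
adjusted = beta[1]
```

Because the covariate explains much of the outcome variance here, the adjusted estimate is substantially less variable than the raw difference in means, which is the precision gain the abstract refers to.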
Estimating Effects on Rare Outcomes: Knowledge is Power
Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides stability and power to estimate the exposure effect. In finite sample simulations, the proposed estimator performed as well as, if not better than, alternative estimators, including the propensity score matching estimator, the inverse probability of treatment weighted (IPTW) estimator, the augmented IPTW estimator, and the standard TMLE algorithm. The new estimator remained unbiased if either the conditional mean outcome or the propensity score was consistently estimated. As a substitution estimator, TMLE guaranteed that the point estimates were within the parameter range. Our results highlight the potential for double robust, semiparametric efficient estimation with rare events.
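The key ingredient described above, constraining predicted conditional risks to respect a known bound so that the plug-in (substitution) estimate stays in range, can be sketched as follows. This is a deliberately crude working-model fit on simulated data, not the paper's TMLE algorithm; the bound u and the data-generating process are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
w = rng.normal(size=n)
a = rng.integers(0, 2, size=n)

# Model knowledge: the conditional risk is known to lie below u = 0.05.
u = 0.05
p_true = np.clip(0.01 + 0.02 * a + 0.005 * w, 0.0, u)
y = rng.binomial(1, p_true)

def fit_risk(a_val):
    """Crude linear working-model fit of P(Y=1 | A=a_val, W), then bounded."""
    mask = a == a_val
    X = np.column_stack([np.ones(mask.sum()), w[mask]])
    beta = np.linalg.lstsq(X, y[mask].astype(float), rcond=None)[0]
    raw = np.column_stack([np.ones(n), w]) @ beta
    return np.clip(raw, 0.0, u)   # constrain predictions to respect the bound

# Plug-in estimate of the risk difference: by construction it lies in [-u, u].
rd = (fit_risk(1) - fit_risk(0)).mean()
```

An unbounded working model could emit negative risks or risks above the known ceiling for extreme covariate values; clipping the predictions is the simplest way to make the substitution estimator honor the statistical model.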
Empirical Efficiency Maximization
It has long been recognized that covariate adjustment can increase precision, even when it is not strictly necessary. The phenomenon is particularly emphasized in clinical trials, whether using continuous, categorical, or censored time-to-event outcomes. Adjustment is often straightforward when a discrete covariate partitions the sample into a handful of strata, but becomes more involved when modern studies collect copious amounts of baseline information on each subject.
The dilemma helped motivate locally efficient estimation for coarsened data structures, as surveyed in the books of van der Laan and Robins (2003) and Tsiatis (2006). Here one fits a relatively small working model for the full data distribution, often with maximum likelihood, giving a nuisance parameter fit in an estimating equation for the parameter of interest. The usual advertisement is that the estimator is asymptotically efficient if the working model is correct, but otherwise is still consistent and asymptotically Normal.
However, the working model will almost always be misspecified in practice. With standard likelihood-based fits, the resulting estimate of the parameter of interest can be poor. We propose a new method, empirical efficiency maximization, to target the element of a working model that minimizes the asymptotic variance of the resulting parameter estimate, whether or not the working model is correctly specified.
Our procedure is illustrated in three examples. It is shown to be a potentially major improvement over existing covariate adjustment methods for estimating disease prevalence in two-phase epidemiological studies, treatment effects in two-arm randomized trials, and marginal survival curves. Numerical asymptotic efficiency calculations demonstrate gains relative to standard locally efficient estimators.
Doubly Robust Ecological Inference
The ecological inference problem is a famous longstanding puzzle that arises in many disciplines. The usual formulation in epidemiology is that we would like to quantify an exposure-disease association by obtaining disease rates among the exposed and unexposed, but only have access to exposure rates and disease rates for several regions. The problem is generally intractable, but can be attacked under the assumptions of King's (1997) extended technique if we can correctly specify a model for a certain conditional distribution. We introduce a procedure that is a valid approach if either this original model is correct or if we can pose a correct model for a different conditional distribution. The new method is illustrated on data concerning risk factors for diabetes.
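The double robustness property invoked above, validity if either of two nuisance models is correctly specified, can be sketched with the generic augmented-IPTW (AIPW) estimator on simulated data. This illustrates the general principle, not the ecological inference procedure itself; the confounded data-generating process is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
w = rng.normal(size=n)                  # confounder
ps = 1.0 / (1.0 + np.exp(-w))           # true propensity P(A=1 | W)
a = rng.binomial(1, ps)
y = 2.0 * a + w + rng.normal(size=n)    # true causal effect = 2.0

# Deliberately misspecified outcome models (arm means, ignoring w) ...
m1 = np.full(n, y[a == 1].mean())
m0 = np.full(n, y[a == 0].mean())

# ... combined with the correct propensity model in the AIPW estimator.
# AIPW is consistent if EITHER the outcome model OR the propensity
# model is correct; here the propensity model carries the estimator.
aipw = np.mean(m1 - m0
               + a * (y - m1) / ps
               - (1 - a) * (y - m0) / (1 - ps))

naive = y[a == 1].mean() - y[a == 0].mean()   # confounded comparison
```

The naive difference in means is biased upward because treated units tend to have larger w, while the AIPW estimate recovers the true effect despite the broken outcome model.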