
    Extending Marginal Structural Models through Local, Penalized, and Additive Learning

    Marginal structural models (MSMs) allow one to form causal inferences from data by specifying a relationship between a treatment and the marginal distribution of a corresponding counterfactual outcome. Following their introduction in Robins (1997), MSMs have typically been fit after assuming a semiparametric model, and then estimating a finite dimensional parameter. van der Laan and Dudoit (2003) proposed to instead view MSM fitting not as a task of semiparametric parameter estimation, but of nonparametric function approximation. They introduced a class of causal effect estimators based on mapping loss functions suitable for the unavailable counterfactual data to those suitable for the data actually observed, and then applying what has been known in nonparametric statistics as empirical risk minimization, or global learning. However, it has long been recognized in the statistical learning community that global learning is only one of several paradigms for estimator construction. Building upon van der Laan and Dudoit's work, we show how marginal structural models for causal effects can be extended through the alternative techniques of local, penalized, and additive learning. We discuss how these new methods can often be implemented by simply adding observation weights to existing algorithms, demonstrate the gains made possible by these extended MSMs through simulation results, and conclude that nonparametric function estimation methods can be fruitfully applied for making causal inferences.
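
    A minimal illustration of the "adding observation weights" point above (not the authors' code): with a binary treatment, inverse-probability-of-treatment weights estimated from a propensity model can be passed to an off-the-shelf learner through its sample_weight argument. The simulated data, the logistic propensity model, and the ridge fit standing in for the penalized learner are all assumptions of this sketch.

        import numpy as np
        from sklearn.linear_model import LogisticRegression, Ridge

        rng = np.random.default_rng(0)
        n = 2000
        W = rng.normal(size=(n, 3))                      # baseline covariates
        p_true = 1 / (1 + np.exp(-W[:, 0]))              # true propensity score
        A = rng.binomial(1, p_true)                      # binary treatment
        Y = 2 * A + W[:, 0] + rng.normal(size=n)         # outcome; true effect is 2

        # Estimate the propensity score and form stabilized IPTW observation weights.
        g = LogisticRegression().fit(W, A).predict_proba(W)[:, 1]
        wts = np.where(A == 1, A.mean() / g, (1 - A.mean()) / (1 - g))

        # Any learner that accepts observation weights can now play the role of the
        # MSM fit; a ridge regression of Y on A is the simplest "penalized" case.
        msm = Ridge(alpha=1.0).fit(A.reshape(-1, 1), Y, sample_weight=wts)
        print("weighted MSM estimate of the causal effect:", msm.coef_[0])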

    Doubly Robust Censoring Unbiased Transformations

    We consider random design nonparametric regression when the response variable is subject to right censoring. Following the work of Fan and Gijbels (1994), a common approach to this problem is to apply what has been termed a censoring unbiased transformation to the data to obtain surrogate responses, and then enter these surrogate responses with covariate data into standard smoothing algorithms. Existing censoring unbiased transformations generally depend on either the conditional survival function of the response of interest, or that of the censoring variable. We show that a mapping introduced in another statistical context is in fact a censoring unbiased transformation with a beneficial double robustness property, in that it can be used for nonparametric regression if either of these two conditional distributions is estimated accurately. Advantages of using this transformation for smoothing are illustrated in simulations and on the Stanford heart transplant data. Additionally, we discuss how doubly robust censoring unbiased transformations can be utilized for regression with missing data, in causal inference problems, or with current status data.
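
    For concreteness, here is a sketch of the simpler inverse-probability-of-censoring-weighted (IPCW) transformation, a special case that depends only on the censoring distribution and that the doubly robust transformation above extends. The simulated data and the exponential censoring model are assumptions of the sketch, not part of the paper.

        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(1)
        n = 5000
        X = rng.uniform(size=(n, 1))
        Y = 1.0 + 2.0 * X[:, 0] + rng.exponential(1.0, size=n)   # true response
        C = rng.exponential(8.0, size=n)                          # censoring time
        T = np.minimum(Y, C)                                      # observed follow-up
        delta = (Y <= C).astype(float)                            # 1 = uncensored

        # Estimate the censoring survival function G(t) = P(C > t) with a simple
        # exponential fit (censoring events are the "events" here).
        rate_c = (1 - delta).sum() / T.sum()
        G = np.exp(-rate_c * T)

        # IPCW surrogate response: its conditional mean matches E[Y|X] when G is correct.
        Y_star = delta * T / G

        # Any standard regression or smoothing algorithm can now be applied.
        fit = LinearRegression().fit(X, Y_star)
        print("estimated slope of E[Y|X] (truth is 2):", fit.coef_[0])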

    Estimating Function Based Cross-Validation and Learning

    Suppose that we observe a sample of independent and identically distributed realizations of a random variable. Given a model for the data generating distribution, assume that the parameter of interest can be characterized as the parameter value which makes the population mean of a possibly infinite dimensional estimating function equal to zero. Given a collection of candidate estimators of this parameter, and a specification of the vector estimating function, we propose a cross-validation criterion for selecting among these estimators. The criterion is defined as the Euclidean norm of the empirical mean, over the validation sample, of the estimating function evaluated at the candidate estimator based on the training sample. We establish a finite sample inequality for this method relative to an oracle selector, and illustrate it with some examples. This finite sample inequality provides us with asymptotic equivalence of the selector with the oracle selector under general conditions. We also study the performance of this method in the case that the parameter of interest itself is path-wise differentiable (and thus, in principle, root-n estimable), and show that the cross-validation selected estimator is typically efficient, and, at certain data generating distributions, superefficient (and thus non-regular). Finally, we combine 1) the selection of a sequence of subspaces of the parameter space (i.e., a sieve), 2) the estimating equation as the empirical criterion generating a candidate estimator for each subspace, and 3) the estimating function based cross-validation selector for choosing among the candidate estimators, in order to provide a new unified estimating function based methodology. In particular, we formally establish a finite sample inequality for this general estimator in the case that one uses epsilon-nets as the sieve, and point out that this finite sample inequality corresponds with minimax adaptive rates of convergence w.r.t. the norm implied by the estimating function.
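
    A minimal sketch of the selection criterion just described, under assumptions not in the paper (a linear-model estimating function X(Y - X beta) and ridge fits as the candidate estimators): for each candidate, accumulate over validation folds the Euclidean norm of the empirical mean of the estimating function evaluated at the training-sample fit, and select the candidate with the smallest score.

        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import KFold

        rng = np.random.default_rng(2)
        n, p = 200, 5
        X = rng.normal(size=(n, p))
        beta_true = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
        Y = X @ beta_true + rng.normal(size=n)

        alphas = [0.01, 0.1, 1.0, 10.0, 100.0]   # candidate estimators (ridge penalties)
        scores = np.zeros(len(alphas))

        for train, valid in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
            for j, alpha in enumerate(alphas):
                beta_hat = Ridge(alpha=alpha, fit_intercept=False).fit(X[train], Y[train]).coef_
                resid = Y[valid] - X[valid] @ beta_hat
                eef = (X[valid] * resid[:, None]).mean(axis=0)   # empirical mean of D(beta_hat)
                scores[j] += np.linalg.norm(eef)                 # Euclidean norm criterion

        print("selected ridge penalty:", alphas[int(np.argmin(scores))])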

    Targeted Maximum Likelihood Learning

    Suppose one observes a sample of independent and identically distributed observations from a particular data generating distribution. Suppose that one has available an estimate of the density of the data generating distribution such as a maximum likelihood estimator according to a given or data adaptively selected model. Suppose that one is concerned with estimation of a particular pathwise differentiable Euclidean parameter. A substitution estimator evaluating the parameter at the density estimator is typically too biased and might not even converge at the parametric rate: that is, the density estimator was targeted to be a good estimator of the density and might therefore result in a poor estimator of a particular smooth functional of the density. In this article we propose a one step (and, by iteration, k-th step) targeted maximum likelihood density estimator which involves 1) creating a hardest parametric submodel with parameter epsilon through the given density estimator, with score equal to the efficient influence curve of the pathwise differentiable parameter at the density estimator, 2) estimating this parameter epsilon with the maximum likelihood estimator, and 3) defining a new density estimator as the corresponding update of the original density estimator. We show that iteration of this algorithm results in a targeted maximum likelihood density estimator which solves the efficient influence curve estimating equation and thereby yields an efficient or locally efficient estimator of the parameter of interest under regularity conditions. We also show that, if the parameter is linear and the model is convex, then the targeted maximum likelihood estimator is often achieved in the first step, and it results in a locally efficient estimator at an arbitrary (e.g., heavily misspecified) starting density. This tool provides us with a new class of targeted likelihood based estimators of pathwise differentiable parameters. We also show that the targeted maximum likelihood estimators are now in full agreement with the locally efficient estimating function methodology as presented in Robins and Rotnitzky (1992) and van der Laan and Robins (2003), creating, in particular, an algebraic equivalence between the double robust locally efficient estimators that use targeted maximum likelihood estimators as estimates of their nuisance parameters and the targeted maximum likelihood estimators themselves. In addition, it is argued that the targeted MLE has various advantages relative to the current estimating function based approach. We proceed by providing data driven methodologies to select the initial density estimator for the targeted MLE, thereby providing data adaptive targeted maximum likelihood estimation methodology. Finally, we show that targeted maximum likelihood estimation can be generalized to estimate any kind of parameter, such as infinite dimensional non-pathwise differentiable parameters, by restricting the likelihood and cross-validated log-likelihood to targeted candidate density estimators only. We illustrate the method with various worked-out examples.
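
    A minimal sketch of the targeting step for one familiar special case, not the general algorithm of the article: the average treatment effect E[Y(1) - Y(0)], with a linear fluctuation submodel and squared-error loss standing in for the parametric submodel and log-likelihood. The simulated data, the initial linear outcome regression, and the logistic propensity model are all assumptions of this sketch.

        import numpy as np
        from sklearn.linear_model import LinearRegression, LogisticRegression

        rng = np.random.default_rng(3)
        n = 5000
        W = rng.normal(size=(n, 2))
        g_true = 1 / (1 + np.exp(-W[:, 0]))
        A = rng.binomial(1, g_true)
        Y = A + W[:, 0] + 0.5 * W[:, 1] + rng.normal(size=n)     # true effect = 1

        # 1) Initial estimators: outcome regression Qbar(A, W) and propensity g(W).
        XA = np.column_stack([A, W])
        Q_fit = LinearRegression().fit(XA, Y)
        g_hat = LogisticRegression().fit(W, A).predict_proba(W)[:, 1]
        Q_A = Q_fit.predict(XA)
        Q_1 = Q_fit.predict(np.column_stack([np.ones(n), W]))
        Q_0 = Q_fit.predict(np.column_stack([np.zeros(n), W]))

        # 2) Fluctuate along the "clever covariate" H, whose score at epsilon = 0 is
        #    the treatment-residual component of the efficient influence curve, and
        #    fit epsilon by least squares (the MLE under squared-error loss).
        H_A = A / g_hat - (1 - A) / (1 - g_hat)
        eps = np.sum(H_A * (Y - Q_A)) / np.sum(H_A ** 2)

        # 3) Update the initial fit and plug in (substitution estimator).
        Q_1_star = Q_1 + eps * (1 / g_hat)
        Q_0_star = Q_0 + eps * (-1 / (1 - g_hat))
        print("targeted substitution estimate of the effect:", np.mean(Q_1_star - Q_0_star))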

    A Note on Targeted Maximum Likelihood and Right Censored Data

    A popular way to estimate an unknown parameter is with substitution, or evaluating the parameter at a likelihood based fit of the data generating density. In many cases, such estimators have substantial bias and can fail to converge at the parametric rate. van der Laan and Rubin (2006) introduced targeted maximum likelihood learning, which removes these shackles from substitution estimators and brings them into full agreement with the locally efficient estimating equation procedures presented in Robins and Rotnitzky (1992) and van der Laan and Robins (2003). This note illustrates how targeted maximum likelihood can be applied in right censored data structures. In particular, we show that when an initial substitution estimator is based on a Cox proportional hazards model, the targeted likelihood algorithm can be implemented by iteratively adding an appropriate time-dependent covariate.

    Doubly Robust Ecological Inference

    The ecological inference problem is a famous longstanding puzzle that arises in many disciplines. The usual formulation in epidemiology is that we would like to quantify an exposure-disease association by obtaining disease rates among the exposed and unexposed, but only have access to exposure rates and disease rates for several regions. The problem is generally intractable, but can be attacked under the assumptions of King's (1997) extended technique if we can correctly specify a model for a certain conditional distribution. We introduce a procedure that is valid if either this original model is correct or if we can pose a correct model for a different conditional distribution. The new method is illustrated on data concerning risk factors for diabetes.

    Covariate Adjustment for the Intention-to-Treat Parameter with Empirical Efficiency Maximization

    In randomized experiments, the intention-to-treat parameter is defined as the difference in expected outcomes between groups assigned to treatment and control arms. There is a large literature focusing on how (possibly misspecified) working models can sometimes exploit baseline covariate measurements to gain precision, although covariate adjustment is not strictly necessary. In Rubin and van der Laan (2008), we proposed the technique of empirical efficiency maximization for improving estimation by forming nonstandard fits of such working models. Considering a more realistic randomization scheme than in our original article, we suggest a new class of working models for utilizing covariate information, show our method can be implemented by adding weights to standard regression algorithms, and demonstrate benefits over existing estimators through numerical asymptotic efficiency calculations and simulations.
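
    As a point of reference, the sketch below shows generic regression-based covariate adjustment of the intention-to-treat estimate by standardization (arm-specific least squares fits averaged over all subjects). It is not the empirical-efficiency-maximizing fit proposed above, and the simulated trial data are an assumption of the sketch.

        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(4)
        n = 4000
        W = rng.normal(size=(n, 3))                       # baseline covariates
        A = rng.binomial(1, 0.5, size=n)                  # randomized assignment
        Y = A + W @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

        # Fit a (possibly misspecified) working regression within each arm ...
        fit1 = LinearRegression().fit(W[A == 1], Y[A == 1])
        fit0 = LinearRegression().fit(W[A == 0], Y[A == 0])

        # ... then standardize: average both arms' predictions over ALL subjects.
        itt_adjusted = np.mean(fit1.predict(W) - fit0.predict(W))
        itt_unadjusted = Y[A == 1].mean() - Y[A == 0].mean()
        print("adjusted ITT estimate:  ", itt_adjusted)
        print("unadjusted ITT estimate:", itt_unadjusted)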

    Empirical Efficiency Maximization

    It has long been recognized that covariate adjustment can increase precision, even when it is not strictly necessary. The phenomenon is particularly emphasized in clinical trials, whether using continuous, categorical, or censored time-to-event outcomes. Adjustment is often straightforward when a discrete covariate partitions the sample into a handful of strata, but becomes more involved when modern studies collect copious amounts of baseline information on each subject. The dilemma helped motivate locally efficient estimation for coarsened data structures, as surveyed in the books of van der Laan and Robins (2003) and Tsiatis (2006). Here one fits a relatively small working model for the full data distribution, often with maximum likelihood, giving a nuisance parameter fit that enters an estimating equation for the parameter of interest. The usual advertisement is that the estimator is asymptotically efficient if the working model is correct, but otherwise is still consistent and asymptotically Normal. However, the working model will almost always be misspecified in practice, and standard likelihood based fits can then yield a poor estimate of the parameter of interest. We propose a new method, empirical efficiency maximization, to target the element of a working model minimizing asymptotic variance for the resulting parameter estimate, whether or not the working model is correctly specified. Our procedure is illustrated in three examples. It is shown to be a potentially major improvement over existing covariate adjustment methods for estimating disease prevalence in two-phase epidemiological studies, treatment effects in two-arm randomized trials, and marginal survival curves. Numerical asymptotic efficiency calculations demonstrate gains relative to standard locally efficient estimators.
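
    The idea can be illustrated with a toy two-arm trial with known randomization probability 1/2 (all details below are assumptions of this sketch, not the paper's estimators): within a linear working model Q(a, W), choose the coefficients that minimize the empirical variance of the resulting augmented estimator, rather than the coefficients that maximize a likelihood.

        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(5)
        n, p = 2000, 2
        W = rng.normal(size=(n, p))
        A = rng.binomial(1, 0.5, size=n)
        # Outcome is nonlinear in W, so any linear working model is misspecified.
        Y = A + np.exp(W[:, 0]) + rng.normal(size=n)

        def influence_values(params):
            """Per-subject contributions of the augmented estimator under Q(a, W)."""
            c1, c0 = params[0], params[1]
            d1, d0 = params[2:2 + p], params[2 + p:]
            Q1, Q0 = c1 + W @ d1, c0 + W @ d0
            return (A * (Y - Q1) / 0.5 + Q1) - ((1 - A) * (Y - Q0) / 0.5 + Q0)

        def empirical_variance(params):
            return np.var(influence_values(params))

        # Empirical efficiency maximization step: pick the working-model fit that
        # minimizes the variance of the resulting (always consistent) estimator.
        opt = minimize(empirical_variance, x0=np.zeros(2 + 2 * p), method="BFGS")
        phi = influence_values(opt.x)
        print("effect estimate:", phi.mean(), " estimated SE:", phi.std() / np.sqrt(n))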

    A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting

    Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis is associated with a test statistic, and large test statistics provide evidence against the null hypotheses. One proposal to provide probabilistic control of Type-I errors is the use of procedures ensuring that the expected number of false positives does not exceed a user-supplied threshold. Among such multiple testing procedures, we derive the "most powerful" method, meaning the test statistic cutoffs that maximize the expected number of true positives. Unfortunately, these optimal cutoffs depend on the true unknown data generating distribution, so could never be used in a practical setting. We instead consider splitting the sample so that the optimal cutoffs are estimated from a portion of the data, and then testing on the remaining data using these estimated cutoffs. When the null distributions for all test statistics are the same, the obvious way to control the expected number of false positives would be to use a common cutoff for all tests. In this work, we consider the common cutoff method as a benchmark multiple testing procedure. We show that in certain circumstances the use of estimated optimal cutoffs via sample splitting can dramatically outperform this benchmark method, resulting in increased true discoveries, while retaining Type-I error control. This paper is an updated version of the work presented in Rubin et al. (2005), later expanded upon by Wasserman and Roeder (2006).
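
    A toy sketch of the sample-splitting idea under strong added assumptions that are not specified above (independent N(theta_j, 1) test statistics, known N(0, 1) nulls, and Neyman-Pearson-style per-test cutoffs): one independent z-statistic per hypothesis plays the role of the training split and estimates the effects, the cutoffs are calibrated so that the expected number of false positives is at most k, and testing uses the held-out statistics.

        import numpy as np
        from scipy.stats import norm
        from scipy.optimize import brentq

        rng = np.random.default_rng(6)
        m, k = 1000, 5.0
        theta = np.where(rng.uniform(size=m) < 0.1, 3.0, 0.0)   # 10% true signals
        z_train = theta + rng.normal(size=m)                     # "training" split
        z_test = theta + rng.normal(size=m)                      # "testing" split

        def cutoffs(lam, mu):
            """Per-test cutoffs maximizing expected true positives at multiplier lam."""
            mu = np.clip(mu, 1e-3, None)           # effectively drop non-positive estimates
            return np.log(lam) / mu + mu / 2       # likelihood-ratio threshold for N(mu,1) vs N(0,1)

        def expected_false_positives(lam, mu):
            return norm.sf(cutoffs(lam, mu)).sum() # conservative: sums over all tests

        mu_hat = z_train
        lam = brentq(lambda l: expected_false_positives(l, mu_hat) - k, 1e-6, 1e12)
        c = cutoffs(lam, mu_hat)

        # Benchmark: one common cutoff with the same expected-false-positive budget.
        c_common = norm.isf(k / m)
        print("true discoveries, split-estimated cutoffs:", np.sum((z_test > c) & (theta > 0)))
        print("true discoveries, common cutoff:          ", np.sum((z_test > c_common) & (theta > 0)))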

    Low-ionization Line Emission from Starburst Galaxies: A New Probe of Galactic-Scale Outflows

    We study the kinematically narrow, low-ionization line emission from a bright, starburst galaxy at z = 0.69 using slit spectroscopy obtained with Keck/LRIS. The spectrum reveals strong absorption in MgII and FeII resonance transitions with Doppler shifts of -200 to -300 km/s, indicating a cool gas outflow. Emission in MgII near and redward of the systemic velocity, in concert with the observed absorption, yields a P Cygni-like line profile similar to those observed in the Ly alpha transition in Lyman Break Galaxies. Further, the MgII emission is spatially resolved, and extends significantly beyond the emission from stars and HII regions within the galaxy. Assuming the emission has a simple, symmetric surface brightness profile, we find that the gas extends to distances > ~7 kpc. We also detect several narrow FeII* fine-structure lines in emission near the systemic velocity, arising from energy levels which are radiatively excited directly from the ground state. We suggest that the MgII and FeII* emission is generated by photon scattering in the observed outflow, and emphasize that this emission is a generic prediction of outflows. These observations provide the first direct constraints on the minimum spatial extent and morphology of the wind from a distant galaxy. Estimates of these parameters are crucial for understanding the impact of outflows in driving galaxy evolution. Comment: Submitted to ApJL. 6 pages, 4 figures. Uses emulateapj format.