158,555 research outputs found

    Reweighted Least Trimmed Squares: An Alternative to One-Step Estimators

    A new class of robust regression estimators is proposed that forms an alternative to traditional robust one-step estimators and that achieves the √n rate of convergence irrespective of the initial estimator under a wide range of distributional assumptions. The proposed reweighted least trimmed squares (RLTS) estimator employs data-dependent weights determined from an initial robust fit. Just like many existing one- and two-step robust methods, the RLTS estimator preserves the robust properties of the initial robust estimate. However, contrary to existing methods, the first-order asymptotic behavior of RLTS is independent of the initial estimate even if the errors exhibit heteroscedasticity, asymmetry, or serial correlation. Moreover, we derive the asymptotic distribution of RLTS and show that it is asymptotically efficient for normally distributed errors. A simulation study documents the benefits of these theoretical properties in finite samples.
    Keywords: asymptotic efficiency; breakdown point; least trimmed squares
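
    A minimal numerical sketch of the reweighting step described above, assuming hard-rejection 0/1 weights with a fixed cutoff and an initial least trimmed squares fit supplied by the caller; the cutoff value, the MAD-based scale estimate and all names are illustrative rather than taken from the paper:

        import numpy as np

        def rlts_step(X, y, beta_init, cutoff=2.5):
            """One reweighted least squares step from an initial robust fit.

            beta_init -- coefficients of an initial robust estimator
                         (e.g. least trimmed squares); illustrative interface.
            cutoff    -- hard-rejection threshold on standardized residuals.
            """
            r = y - X @ beta_init                  # residuals of the initial fit
            sigma = 1.4826 * np.median(np.abs(r))  # robust MAD-based scale estimate
            keep = np.abs(r / sigma) <= cutoff     # data-dependent 0/1 weights
            # ordinary least squares on the retained observations only
            beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
            return beta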

    Quasi maximum likelihood estimation and prediction in the compound Poisson ECOGARCH(1,1) model

    This paper deals with the problem of estimation and prediction in a compound Poisson ECOGARCH(1,1) model. For this, we construct a quasi maximum likelihood estimator under the assumption that all jumps of the log-price process are observable. Since these jumps occur at unequally spaced time points, the estimator has to be computed for irregularly spaced data. Assuming normally distributed jumps and using a recursion to estimate the volatility makes it possible to define and compute a quasi-likelihood function, which is maximised numerically. The small-sample behaviour of the estimator is analysed in a simulation study. Based on the recursion for the volatility process, a one-step-ahead prediction of the volatility is defined, as well as a prediction interval for the log-price process. Finally, the model is fitted to tick-by-tick data from the New York Stock Exchange.
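
    The quasi-likelihood construction sketched in the abstract can be pictured as follows, assuming normally distributed jumps and a user-supplied volatility recursion evaluated at irregular time increments; the recursion interface and the starting volatility are illustrative:

        import numpy as np
        from scipy.optimize import minimize

        def quasi_neg_loglik(params, jumps, times, update_vol):
            """Negative Gaussian quasi-log-likelihood over irregularly spaced jumps.

            update_vol -- recursion giving the volatility at the next jump time
                          (illustrative interface).
            """
            nll, vol = 0.0, 1.0                    # illustrative initial volatility
            for i in range(1, len(times)):
                dt = times[i] - times[i - 1]       # unequally spaced time increment
                vol = update_vol(vol, jumps[i - 1], dt, params)
                # Gaussian quasi-likelihood contribution of the i-th jump
                nll += 0.5 * (np.log(2 * np.pi * vol) + jumps[i] ** 2 / vol)
            return nll

        # the quasi-likelihood is then maximised numerically, e.g.
        # minimize(quasi_neg_loglik, theta0, args=(jumps, times, update_vol))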

    χ²-confidence sets in high-dimensional regression

    We study a high-dimensional regression model. The aim is to construct a confidence set for a given group of regression coefficients, treating all other regression coefficients as nuisance parameters. We apply a one-step procedure with the square-root Lasso as the initial estimator and a multivariate square-root Lasso for constructing a surrogate Fisher information matrix. The multivariate square-root Lasso is based on nuclear norm loss with an ℓ1-penalty. We show that this procedure leads to an asymptotically χ²-distributed pivot, with a remainder term depending only on the ℓ1-error of the initial estimator. We show that under ℓ1-sparsity conditions on the regression coefficients β⁰ the square-root Lasso produces a consistent estimator of the noise variance, and we establish sharp oracle inequalities which show that the remainder term is small under further sparsity conditions on β⁰ and compatibility conditions on the design.
    Comment: 22 pages
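
    The one-step construction behind the pivot can be sketched roughly as follows, with an ordinary Lasso standing in for the square-root Lasso and nodewise regressions standing in for the multivariate square-root Lasso; all tuning parameters and names here are illustrative:

        import numpy as np
        from sklearn.linear_model import Lasso

        def debiased_group_estimate(X, y, group, alpha=0.1):
            """One-step (debiased) estimate of the coefficients indexed by `group`."""
            n, p = X.shape
            beta0 = Lasso(alpha=alpha).fit(X, y).coef_     # initial estimator
            # surrogate inverse Fisher information, one row per group member
            Theta = np.zeros((len(group), p))
            for k, j in enumerate(group):
                others = [i for i in range(p) if i != j]
                gamma = Lasso(alpha=alpha).fit(X[:, others], X[:, j]).coef_
                resid = X[:, j] - X[:, others] @ gamma     # nodewise residuals
                row = np.zeros(p)
                row[j], row[others] = 1.0, -gamma
                Theta[k] = row / (resid @ X[:, j] / n)
            # initial estimate plus one-step correction; the paper's pivot
            # rescales this quantity to be asymptotically chi-square distributed
            return beta0[group] + Theta @ X.T @ (y - X @ beta0) / n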

    Online Targeted Learning

    We consider the case where the data come in sequentially and can be viewed as a sample of independent and identically distributed observations from a fixed data-generating distribution. The goal is to estimate a particular pathwise target parameter of this data-generating distribution that is known to be an element of a particular semiparametric statistical model. We want our estimator to be asymptotically efficient, but we also want it to be computable by updating the current estimate with each new block of data without having to revisit the past data, so that it is computationally much faster than recomputing a fixed estimator each time new data come in. We refer to such an estimator as an online estimator. These online estimators can also be applied to a large fixed database by dividing the data set into many subsets and enforcing an ordering of these subsets. The current literature provides such online estimators for parametric models, where the online estimators are based on variations of the stochastic gradient descent algorithm. We propose a new online one-step estimator, which is proven to be asymptotically efficient under regularity conditions. This estimator takes as input online estimators of the relevant part of the data-generating distribution and of the nuisance parameter that are required for efficient estimation of the target parameter. These estimators could be online stochastic gradient descent estimators based on large parametric models as developed in the current literature, but we also propose other online data-adaptive estimators that do not rely on the specification of a particular parametric model. We also present a targeted version of this online one-step estimator that presumably minimizes the one-step correction and thereby might be more robust in finite samples. These online one-step estimators are not substitution estimators and might therefore be unstable in finite samples if the target parameter is borderline identifiable. Therefore we also develop an online targeted minimum loss-based estimator (TMLE), which updates the initial estimator of the relevant part of the data-generating distribution with each new block of data and estimates the target parameter with the corresponding plug-in estimator. The online substitution estimator is also proven to be asymptotically efficient under the same regularity conditions required for asymptotic normality of the online one-step estimator. The online one-step estimator, the targeted online one-step estimator, and the online TMLE are demonstrated for estimation of a causal effect of a binary treatment on an outcome based on a dynamic database that gets regularly updated, a common scenario for the analysis of electronic medical record databases. Finally, we extend these online estimators to a group sequential adaptive design in which certain components of the data-generating experiment are continuously fine-tuned based on past data, and the new data-generating distribution is then used to generate the next block of data.
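
    In stylized form, the online one-step estimator described above corrects a running plug-in estimate with the incrementally updated empirical mean of an estimated efficient influence curve, never revisiting past blocks; every interface below is illustrative:

        def online_one_step(blocks, nuisance, update_nuisance, eif, psi):
            """Online one-step estimator over sequentially arriving data blocks.

            nuisance        -- initial estimate of the relevant parts of the
                               data-generating distribution (illustrative)
            update_nuisance -- online update of that estimate from a new block
            eif             -- estimated efficient influence curve, eif(obs, nuisance)
            psi             -- plug-in mapping to the target parameter
            """
            n_seen, correction = 0, 0.0
            for block in blocks:
                # incremental mean of the influence curve (one-step correction)
                for obs in block:
                    n_seen += 1
                    correction += (eif(obs, nuisance) - correction) / n_seen
                # update the nuisance estimates without revisiting past data
                nuisance = update_nuisance(nuisance, block)
            return psi(nuisance) + correction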

    Targeted Maximum Likelihood Learning

    Suppose one observes a sample of independent and identically distributed observations from a particular data generating distribution, and suppose that one has available an estimate of the density of that distribution, such as a maximum likelihood estimator according to a given or data-adaptively selected model. Suppose that one is concerned with estimation of a particular pathwise differentiable Euclidean parameter. A substitution estimator evaluating the parameter at the density estimator is typically too biased and might not even converge at the parametric rate: the density estimator was targeted to be a good estimator of the density and might therefore result in a poor estimator of a particular smooth functional of the density. In this article we propose a one-step (and, by iteration, k-th step) targeted maximum likelihood density estimator which involves 1) creating a hardest parametric submodel with parameter epsilon through the given density estimator, with score equal to the efficient influence curve of the pathwise differentiable parameter at the density estimator, 2) estimating epsilon with the maximum likelihood estimator, and 3) defining a new density estimator as the corresponding update of the original density estimator. We show that iteration of this algorithm results in a targeted maximum likelihood density estimator which solves the efficient influence curve estimating equation and thereby yields an efficient or locally efficient estimator of the parameter of interest under regularity conditions. We also show that, if the parameter is linear and the model is convex, then the targeted maximum likelihood estimator is often achieved in the first step, and it results in a locally efficient estimator at an arbitrary (e.g., heavily misspecified) starting density. This tool provides us with a new class of targeted likelihood-based estimators of pathwise differentiable parameters. We also show that the targeted maximum likelihood estimators are in full agreement with the locally efficient estimating function methodology as presented in Robins and Rotnitzky (1992) and van der Laan and Robins (2003), creating, in particular, algebraic equivalence between the double robust locally efficient estimators that use the targeted maximum likelihood estimators as estimates of their nuisance parameters, and the targeted maximum likelihood estimators themselves. In addition, it is argued that the targeted MLE has various advantages relative to the current estimating function based approach. We proceed by providing data-driven methodologies to select the initial density estimator for the targeted MLE, thereby providing a data-adaptive targeted maximum likelihood estimation methodology. Finally, we show that targeted maximum likelihood estimation can be generalized to estimate any kind of parameter, such as infinite-dimensional non-pathwise-differentiable parameters, by restricting the likelihood and cross-validated log-likelihood to targeted candidate density estimators only. We illustrate the method with various worked-out examples.
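
    The three numbered steps can be compressed into the following iteration, in which the submodel log-likelihood and the density object's update method are placeholders standing in for the abstract's general construction:

        from scipy.optimize import minimize_scalar

        def targeted_mle(density, loglik, data, tol=1e-8, max_iter=100):
            """Iterated targeted maximum likelihood update of a density estimate.

            density -- current density estimator; density.fluctuate(eps) is an
                       illustrative stand-in for moving along the hardest
                       parametric submodel through `density`
            loglik  -- log-likelihood of that submodel, loglik(eps, density, data)
            """
            for _ in range(max_iter):
                # steps 1) and 2): fit epsilon in the hardest submodel (whose
                # score is the efficient influence curve) by maximum likelihood
                eps = minimize_scalar(lambda e: -loglik(e, density, data)).x
                if abs(eps) < tol:   # influence curve equation (approximately) solved
                    return density
                # step 3): update the density estimator along the submodel
                density = density.fluctuate(eps)
            return density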

    Efficient Estimation of Copula-based Semiparametric Markov Models

    This paper considers efficient estimation of copula-based semiparametric strictly stationary Markov models. These models are characterized by nonparametric invariant (one-dimensional marginal) distributions and parametric bivariate copula functions, where the copulas capture the temporal dependence and tail dependence of the processes. Markov processes generated via tail-dependent copulas may look highly persistent and are useful for financial and economic applications. We first show that Markov processes generated via Clayton, Gumbel and Student's t copulas and their survival copulas are all geometrically ergodic. We then propose a sieve maximum likelihood estimation (MLE) procedure for the copula parameter, the invariant distribution and the conditional quantiles. We show that the sieve MLEs of any smooth functionals are root-n consistent, asymptotically normal and efficient, and that their sieve likelihood ratio statistics are asymptotically chi-square distributed. We present Monte Carlo studies to compare the finite-sample performance of the sieve MLE, the two-step estimator of Chen and Fan (2006), the correctly specified parametric MLE and the incorrectly specified parametric MLE. The simulation results indicate that our sieve MLEs perform very well, having much smaller biases and smaller variances than the two-step estimator for Markov models generated via Clayton, Gumbel and other tail-dependent copulas.
    Keywords: Copula, Tail dependence, Nonlinear Markov models, Geometric ergodicity, Sieve MLE, Semiparametric efficiency, Sieve likelihood ratio statistics, Value-at-Risk
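
    As a condensed illustration of the likelihood being maximised, the log-likelihood of a copula-based stationary Markov model splits into a copula term over consecutive pairs and a marginal term; here a Clayton copula and a generic sieve approximation of the invariant distribution stand in for the paper's general construction (all interfaces illustrative):

        import numpy as np

        def clayton_log_density(u, v, theta):
            """Log density of the bivariate Clayton copula (theta > 0)."""
            return (np.log(theta + 1)
                    - (theta + 1) * (np.log(u) + np.log(v))
                    - (2 + 1 / theta) * np.log(u ** -theta + v ** -theta - 1))

        def markov_loglik(x, theta, cdf, pdf):
            """Log-likelihood of a copula-based stationary Markov model.

            cdf, pdf -- sieve approximations of the invariant distribution and
                        its density (illustrative interface).
            """
            u = cdf(x)                            # probability integral transforms
            copula_part = clayton_log_density(u[:-1], u[1:], theta).sum()
            marginal_part = np.log(pdf(x)).sum()  # invariant marginal contribution
            return copula_part + marginal_part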