Reweighted Least Trimmed Squares: An Alternative to One-Step Estimators
A new class of robust regression estimators is proposed that forms an alternative to traditional robust one-step estimators and that achieves the √n rate of convergence irrespective of the initial estimator under a wide range of distributional assumptions. The proposed reweighted least trimmed squares (RLTS) estimator employs data-dependent weights determined from an initial robust fit. Like many existing one- and two-step robust methods, the RLTS estimator preserves the robustness properties of the initial robust estimate. However, contrary to existing methods, the first-order asymptotic behavior of RLTS is independent of the initial estimate even if the errors exhibit heteroscedasticity, asymmetry, or serial correlation. Moreover, we derive the asymptotic distribution of RLTS and show that it is asymptotically efficient for normally distributed errors. A simulation study documents the benefits of these theoretical properties in finite samples.
Keywords: asymptotic efficiency; breakdown point; least trimmed squares
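The reweighting idea described above can be illustrated with a minimal sketch. This is not the paper's exact estimator: the cutoff value, the MAD scale estimate, and the stand-in initial fit are illustrative assumptions.

```python
import numpy as np

def reweight_and_refit(X, y, beta_init, cutoff=2.5):
    """One reweighting step in the spirit of RLTS (illustrative sketch):
    observations whose residuals from the initial robust fit are large
    relative to a robust scale estimate get weight zero, and the
    remaining observations are refit by least squares."""
    resid = y - X @ beta_init
    # Robust scale via the consistency-corrected median absolute deviation.
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    keep = np.abs(resid) <= cutoff * scale
    beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    return beta, keep

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=200)
y[:10] += 10.0                    # gross outliers
beta_init = np.array([1.0, 2.0])  # stand-in for an initial robust fit
beta, keep = reweight_and_refit(X, y, beta_init)
```

Because the outlying observations receive weight zero, the refit is unaffected by them, which is the mechanism behind the breakdown-point preservation mentioned in the abstract.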
Quasi maximum likelihood estimation and prediction in the compound Poisson ECOGARCH(1,1) model
This paper deals with the problem of estimation and prediction in a compound Poisson ECOGARCH(1,1) model. For this we construct a quasi maximum likelihood estimator under the assumption that all jumps of the log-price process are observable. Since these jumps occur at unequally spaced time points, the estimator has to be computed from irregularly spaced data. Assuming normally distributed jumps and using a recursion to estimate the volatility allows us to define and compute a quasi-likelihood function, which is maximised numerically. The small-sample behaviour of the estimator is analysed in a simulation study. Based on the recursion for the volatility process, a one-step-ahead prediction of the volatility is defined, as well as a prediction interval for the log-price process. Finally, the model is fitted to tick-by-tick data from the New York Stock Exchange.
χ²-confidence sets in high-dimensional regression
We study a high-dimensional regression model. The aim is to construct a confidence set for a given group of regression coefficients, treating all other regression coefficients as nuisance parameters. We apply a one-step procedure with the square-root Lasso as initial estimator and a multivariate square-root Lasso for constructing a surrogate Fisher information matrix. The multivariate square-root Lasso is based on nuclear norm loss with an ℓ₁-penalty. We show that this procedure leads to an asymptotically χ²-distributed pivot, with a remainder term depending only on the ℓ₁-error of the initial estimator. We show that under sparsity conditions on the regression coefficients the square-root Lasso produces a consistent estimator of the noise variance, and we establish sharp oracle inequalities which show that the remainder term is small under further sparsity conditions on the coefficients and compatibility conditions on the design.
Comment: 22 pages
Learning-Based Distributed Detection-Estimation in Sensor Networks with Unknown Sensor Defects
We consider the problem of distributed estimation of an unknown deterministic
scalar parameter (the target signal) in a wireless sensor network (WSN), where
each sensor receives a single snapshot of the field. We assume that the
observation at each node randomly falls into one of two modes: a valid or an
invalid observation mode. Specifically, mode one corresponds to the desired
signal plus noise observation mode (\emph{valid}), and mode two corresponds to
the pure noise mode (\emph{invalid}) due to node defect or damage. With no
prior information on such local sensing modes, we introduce a learning-based
distributed procedure, called the mixed detection-estimation (MDE) algorithm,
based on iterative closed-loop interactions between mode learning (detection)
and target estimation. The online learning step re-assesses the validity of the
local observations at each iteration, thus refining the ongoing estimation
update process. The convergence of the MDE algorithm is established
analytically. Asymptotic analysis shows that, in the high signal-to-noise ratio
(SNR) regime, the MDE estimation error converges to that of an ideal
(centralized) estimator with perfect information about the node sensing modes.
This is in contrast to the estimation performance of a naive average-consensus-based distributed estimator (without mode learning), whose estimation error blows up with increasing SNR.
Comment: 15 pages, 2 figures, submitted to TS
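A centralized, highly simplified sketch of the detect-then-estimate loop may clarify the alternation described above. The Gaussian noise model, scalar target, and likelihood-ratio detection rule are illustrative assumptions; the actual MDE algorithm is distributed and interacts with network neighbors.

```python
import numpy as np

def detect_estimate(obs, iters=20):
    """Illustrative alternation between mode detection and estimation:
    a node is deemed valid if its observation is more likely under
    N(theta, sigma^2) than under pure noise N(0, sigma^2), i.e. closer
    to the current estimate theta than to zero; theta is then
    re-estimated from the nodes currently deemed valid."""
    theta = obs.mean()  # naive initial estimate over all nodes
    valid = np.ones_like(obs, dtype=bool)
    for _ in range(iters):
        valid = (obs - theta) ** 2 < obs ** 2  # detection step
        if valid.any():
            theta = obs[valid].mean()          # estimation step
    return theta, valid

rng = np.random.default_rng(1)
theta_true = 5.0
obs = np.concatenate([theta_true + 0.3 * rng.normal(size=80),  # valid nodes
                      0.3 * rng.normal(size=20)])              # defective nodes
theta, valid = detect_estimate(obs)
```

At high SNR the two modes separate cleanly, so the loop recovers the valid-node set and the estimate approaches the ideal estimator that knows the sensing modes, mirroring the asymptotic claim in the abstract.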
Online Targeted Learning
We consider the case that the data come in sequentially and can be viewed as a sample of independent and identically distributed observations from a fixed data generating distribution. The goal is to estimate a particular pathwise target parameter of this data generating distribution that is known to be an element of a particular semi-parametric statistical model. We want our estimator to be asymptotically efficient, but we also want it to be computable by updating the current estimate with each new block of data, without revisiting past data, so that it is much cheaper to compute than refitting a fixed estimator every time new data arrive. We refer to such an estimator as an online estimator. Online estimators can also be applied to a large fixed database by dividing the data set into many subsets and enforcing an ordering of these subsets. The current literature provides such online estimators for parametric models, where the online estimators are based on variations of the stochastic gradient descent algorithm.
For that purpose we propose a new online one-step estimator, which is proven to be asymptotically efficient under regularity conditions. This estimator takes as input online estimators of the relevant part of the data generating distribution and of the nuisance parameter required for efficient estimation of the target parameter. These input estimators could be online stochastic gradient descent estimators based on large parametric models, as developed in the current literature, but we also propose other online data-adaptive estimators that do not rely on the specification of a particular parametric model.
We also present a targeted version of this online one-step estimator that presumably minimizes the one-step correction and may therefore be more robust in finite samples. These online one-step estimators are not substitution estimators and might therefore be unstable in finite samples if the target parameter is borderline identifiable.
Therefore we also develop an online targeted minimum loss-based estimator (TMLE), which updates the initial estimator of the relevant part of the data generating distribution with each new block of data and estimates the target parameter with the corresponding plug-in estimator. This online substitution estimator is also proven to be asymptotically efficient under the same regularity conditions required for asymptotic normality of the online one-step estimator.
The online one-step estimator, the targeted online one-step estimator, and the online TMLE are demonstrated for estimation of a causal effect of a binary treatment on an outcome, based on a dynamic database that is regularly updated, a common scenario in the analysis of electronic medical record databases.
Finally, we extend these online estimators to a group sequential adaptive design in which certain components of the data generating experiment are continuously fine-tuned based on past data, and the new data generating distribution is then used to generate the next block of data.
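The online principle itself, shown for the trivial target parameter of a mean, can be sketched as follows. This illustrates only the update-without-revisiting pattern, not the paper's semiparametric one-step or TMLE machinery.

```python
import numpy as np

class OnlineMean:
    """Streaming estimator of a mean: each new block of data updates
    sufficient statistics in O(block size), and past blocks are never
    revisited. Richer online estimators follow the same pattern with
    more elaborate update steps."""
    def __init__(self):
        self.n = 0
        self.total = 0.0

    def update(self, block):
        self.n += len(block)
        self.total += float(np.sum(block))

    @property
    def estimate(self):
        return self.total / self.n

rng = np.random.default_rng(2)
blocks = [rng.normal(loc=3.0, size=50) for _ in range(10)]
est = OnlineMean()
for block in blocks:  # data arrive block by block
    est.update(block)
```

After processing all blocks, the streaming estimate coincides with the full-sample estimate, even though no past block was ever re-read, which is exactly the computational property the abstract asks of an online estimator.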
Distributed Estimation and Inference for Spatial Autoregression Model with Large Scale Networks
The rapid growth of online network platforms generates large-scale network
data and it poses great challenges for statistical analysis using the spatial
autoregression (SAR) model. In this work, we develop a novel distributed
estimation and statistical inference framework for the SAR model on a
distributed system. We first propose a distributed network least squares
approximation (DNLSA) method. This enables us to obtain a one-step estimator by
taking a weighted average of local estimators on each worker. Afterwards, a
refined two-step estimation is designed to further reduce the estimation bias.
For statistical inference, we utilize a random projection method to reduce the
expensive communication cost. Theoretically, we show the consistency and
asymptotic normality of both the one-step and two-step estimators. In addition,
we provide a theoretical guarantee for the distributed statistical inference
procedure. The theoretical findings and computational advantages are validated
by several numerical simulations implemented on the Spark system. Lastly, an
experiment on the Yelp dataset further illustrates the usefulness of the
proposed methodology.
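The one-step weighted-average idea can be sketched for the simpler ordinary least squares case. This is an illustrative stand-in, not the SAR likelihood; DNLSA's actual weighting and the two-step bias refinement are as described in the paper.

```python
import numpy as np

def combine_local(local_stats):
    """Combine per-worker summaries (X_k'X_k, X_k'y_k). Solving with the
    pooled summaries equals a weighted average of the local least squares
    estimators with matrix weights X_k'X_k, and for OLS it reproduces the
    full-sample fit exactly."""
    XtX = sum(s[0] for s in local_stats)
    Xty = sum(s[1] for s in local_stats)
    return np.linalg.solve(XtX, Xty)

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=400)
# Each "worker" holds one shard and reports only its local summaries,
# so communication is O(p^2) per worker rather than O(n).
shards = np.array_split(np.arange(400), 4)
stats = [(X[i].T @ X[i], X[i].T @ y[i]) for i in shards]
beta = combine_local(stats)
```

The design choice is that workers ship small sufficient statistics instead of raw data; for the SAR model the local pieces are no longer exactly sufficient, which is why the abstract's two-step refinement is needed to reduce the remaining bias.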