Reweighted Least Trimmed Squares: An Alternative to One-Step Estimators
A new class of robust regression estimators is proposed that forms an alternative to traditional robust one-step estimators and that achieves the √n rate of convergence irrespective of the initial estimator under a wide range of distributional assumptions. The proposed reweighted least trimmed squares (RLTS) estimator employs data-dependent weights determined from an initial robust fit. Like many existing one- and two-step robust methods, the RLTS estimator preserves the robustness properties of the initial robust estimate. However, contrary to existing methods, the first-order asymptotic behavior of RLTS is independent of the initial estimate even if the errors exhibit heteroscedasticity, asymmetry, or serial correlation. Moreover, we derive the asymptotic distribution of RLTS and show that it is asymptotically efficient for normally distributed errors. A simulation study documents the benefits of these theoretical properties in finite samples.
Keywords: asymptotic efficiency; breakdown point; least trimmed squares
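The reweighting idea described above can be illustrated with a minimal sketch. This is not the paper's exact estimator: the cutoff value, the MAD scale estimate, and the stand-in initial fit are illustrative assumptions.

```python
import numpy as np

def reweight_and_refit(X, y, beta_init, cutoff=2.5):
    """One reweighting step in the spirit of RLTS (illustrative sketch):
    observations whose residuals from the initial robust fit are large
    relative to a robust scale estimate get weight zero, and the
    remaining observations are refit by least squares."""
    resid = y - X @ beta_init
    # Robust scale via the consistency-corrected median absolute deviation.
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    keep = np.abs(resid) <= cutoff * scale
    beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    return beta, keep

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=200)
y[:10] += 10.0                    # gross outliers
beta_init = np.array([1.0, 2.0])  # stand-in for an initial robust fit
beta, keep = reweight_and_refit(X, y, beta_init)
```

Because the outlying observations receive weight zero, the refit is unaffected by them, which is the mechanism behind the breakdown-point preservation mentioned in the abstract.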
Quasi maximum likelihood estimation and prediction in the compound Poisson ECOGARCH(1,1) model
This paper deals with the problem of estimation and prediction in a compound Poisson ECOGARCH(1,1) model. For this we construct a quasi maximum likelihood estimator under the assumption that all jumps of the log-price process are observable. Since these jumps occur at unequally spaced time points, the estimator has to be computed from irregularly spaced data. Assuming normally distributed jumps and using a recursion to estimate the volatility allows us to define and compute a quasi-likelihood function, which is maximised numerically. The small-sample behaviour of the estimator is analysed in a simulation study. Based on the recursion for the volatility process, a one-step-ahead prediction of the volatility is defined, as well as a prediction interval for the log-price process. Finally, the model is fitted to tick-by-tick data from the New York Stock Exchange.
χ²-confidence sets in high-dimensional regression
We study a high-dimensional regression model. The aim is to construct a confidence set for a given group of regression coefficients, treating all other regression coefficients as nuisance parameters. We apply a one-step procedure with the square-root Lasso as initial estimator and a multivariate square-root Lasso for constructing a surrogate Fisher information matrix. The multivariate square-root Lasso is based on nuclear norm loss with an ℓ₁-penalty. We show that this procedure leads to an asymptotically χ²-distributed pivot, with a remainder term depending only on the ℓ₁-error of the initial estimator. We show that under sparsity conditions on the regression coefficients the square-root Lasso produces a consistent estimator of the noise variance, and we establish sharp oracle inequalities which show that the remainder term is small under further sparsity conditions on the coefficients and compatibility conditions on the design.
Comment: 22 pages
Learning-Based Distributed Detection-Estimation in Sensor Networks with Unknown Sensor Defects
We consider the problem of distributed estimation of an unknown deterministic
scalar parameter (the target signal) in a wireless sensor network (WSN), where
each sensor receives a single snapshot of the field. We assume that the
observation at each node randomly falls into one of two modes: a valid or an
invalid observation mode. Specifically, mode one corresponds to the desired
signal plus noise observation mode (\emph{valid}), and mode two corresponds to
the pure noise mode (\emph{invalid}) due to node defect or damage. With no
prior information on such local sensing modes, we introduce a learning-based
distributed procedure, called the mixed detection-estimation (MDE) algorithm,
based on iterative closed-loop interactions between mode learning (detection)
and target estimation. The online learning step re-assesses the validity of the
local observations at each iteration, thus refining the ongoing estimation
update process. The convergence of the MDE algorithm is established
analytically. Asymptotic analysis shows that, in the high signal-to-noise ratio
(SNR) regime, the MDE estimation error converges to that of an ideal
(centralized) estimator with perfect information about the node sensing modes.
This is in contrast to the estimation performance of a naive average-consensus-based distributed estimator (without mode learning), whose estimation error blows up with increasing SNR.
Comment: 15 pages, 2 figures, submitted to TS
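A centralized, highly simplified sketch of the detect-then-estimate loop may clarify the alternation described above. The Gaussian noise model, scalar target, and likelihood-ratio detection rule are illustrative assumptions; the actual MDE algorithm is distributed and interacts with network neighbors.

```python
import numpy as np

def detect_estimate(obs, iters=20):
    """Illustrative alternation between mode detection and estimation:
    a node is deemed valid if its observation is more likely under
    N(theta, sigma^2) than under pure noise N(0, sigma^2), i.e. closer
    to the current estimate theta than to zero; theta is then
    re-estimated from the nodes currently deemed valid."""
    theta = obs.mean()  # naive initial estimate over all nodes
    valid = np.ones_like(obs, dtype=bool)
    for _ in range(iters):
        valid = (obs - theta) ** 2 < obs ** 2  # detection step
        if valid.any():
            theta = obs[valid].mean()          # estimation step
    return theta, valid

rng = np.random.default_rng(1)
theta_true = 5.0
obs = np.concatenate([theta_true + 0.3 * rng.normal(size=80),  # valid nodes
                      0.3 * rng.normal(size=20)])              # defective nodes
theta, valid = detect_estimate(obs)
```

At high SNR the two modes separate cleanly, so the loop recovers the valid-node set and the estimate approaches the ideal estimator that knows the sensing modes, mirroring the asymptotic claim in the abstract.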
Online Targeted Learning
We consider the case that the data come in sequentially and can be viewed as a sample of independent and identically distributed observations from a fixed data generating distribution. The goal is to estimate a particular pathwise target parameter of this data generating distribution that is known to be an element of a particular semi-parametric statistical model. We want our estimator to be asymptotically efficient, but we also want it to be computable by updating the current estimate with each new block of data, without revisiting past data, so that it is much cheaper to compute than refitting a fixed estimator every time new data arrive. We refer to such an estimator as an online estimator. Online estimators can also be applied to a large fixed database by dividing the data set into many subsets and enforcing an ordering of these subsets. The current literature provides such online estimators for parametric models, where the online estimators are based on variations of the stochastic gradient descent algorithm.
For that purpose we propose a new online one-step estimator, which is proven to be asymptotically efficient under regularity conditions. This estimator takes as input online estimators of the relevant part of the data generating distribution and of the nuisance parameter required for efficient estimation of the target parameter. These input estimators could be online stochastic gradient descent estimators based on large parametric models, as developed in the current literature, but we also propose other online data-adaptive estimators that do not rely on the specification of a particular parametric model.
We also present a targeted version of this online one-step estimator that presumably minimizes the one-step correction and may therefore be more robust in finite samples. These online one-step estimators are not substitution estimators and might therefore be unstable in finite samples if the target parameter is borderline identifiable.
Therefore we also develop an online targeted minimum loss-based estimator (TMLE), which updates the initial estimator of the relevant part of the data generating distribution with each new block of data and estimates the target parameter with the corresponding plug-in estimator. This online substitution estimator is also proven to be asymptotically efficient under the same regularity conditions required for asymptotic normality of the online one-step estimator.
The online one-step estimator, the targeted online one-step estimator, and the online TMLE are demonstrated for estimation of a causal effect of a binary treatment on an outcome, based on a dynamic database that is regularly updated, a common scenario in the analysis of electronic medical record databases.
Finally, we extend these online estimators to a group sequential adaptive design in which certain components of the data generating experiment are continuously fine-tuned based on past data, and the new data generating distribution is then used to generate the next block of data.
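The online principle itself, shown for the trivial target parameter of a mean, can be sketched as follows. This illustrates only the update-without-revisiting pattern, not the paper's semiparametric one-step or TMLE machinery.

```python
import numpy as np

class OnlineMean:
    """Streaming estimator of a mean: each new block of data updates
    sufficient statistics in O(block size), and past blocks are never
    revisited. Richer online estimators follow the same pattern with
    more elaborate update steps."""
    def __init__(self):
        self.n = 0
        self.total = 0.0

    def update(self, block):
        self.n += len(block)
        self.total += float(np.sum(block))

    @property
    def estimate(self):
        return self.total / self.n

rng = np.random.default_rng(2)
blocks = [rng.normal(loc=3.0, size=50) for _ in range(10)]
est = OnlineMean()
for block in blocks:  # data arrive block by block
    est.update(block)
```

After processing all blocks, the streaming estimate coincides with the full-sample estimate, even though no past block was ever re-read, which is exactly the computational property the abstract asks of an online estimator.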
Distributed Estimation and Inference for Spatial Autoregression Model with Large Scale Networks
The rapid growth of online network platforms generates large-scale network
data and it poses great challenges for statistical analysis using the spatial
autoregression (SAR) model. In this work, we develop a novel distributed
estimation and statistical inference framework for the SAR model on a
distributed system. We first propose a distributed network least squares
approximation (DNLSA) method. This enables us to obtain a one-step estimator by
taking a weighted average of local estimators on each worker. Afterwards, a
refined two-step estimation is designed to further reduce the estimation bias.
For statistical inference, we utilize a random projection method to reduce the
expensive communication cost. Theoretically, we show the consistency and
asymptotic normality of both the one-step and two-step estimators. In addition,
we provide a theoretical guarantee for the distributed statistical inference
procedure. The theoretical findings and computational advantages are validated
by several numerical simulations implemented on the Spark system. Lastly, an
experiment on the Yelp dataset further illustrates the usefulness of the
proposed methodology.
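The one-step weighted-average idea can be sketched for the simpler ordinary least squares case. This is an illustrative stand-in, not the SAR likelihood; DNLSA's actual weighting and the two-step bias refinement are as described in the paper.

```python
import numpy as np

def combine_local(local_stats):
    """Combine per-worker summaries (X_k'X_k, X_k'y_k). Solving with the
    pooled summaries equals a weighted average of the local least squares
    estimators with matrix weights X_k'X_k, and for OLS it reproduces the
    full-sample fit exactly."""
    XtX = sum(s[0] for s in local_stats)
    Xty = sum(s[1] for s in local_stats)
    return np.linalg.solve(XtX, Xty)

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=400)
# Each "worker" holds one shard and reports only its local summaries,
# so communication is O(p^2) per worker rather than O(n).
shards = np.array_split(np.arange(400), 4)
stats = [(X[i].T @ X[i], X[i].T @ y[i]) for i in shards]
beta = combine_local(stats)
```

The design choice is that workers ship small sufficient statistics instead of raw data; for the SAR model the local pieces are no longer exactly sufficient, which is why the abstract's two-step refinement is needed to reduce the remaining bias.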