292 research outputs found
Discussion: The Dantzig selector: Statistical estimation when p is much larger than n
Discussion of "The Dantzig selector: Statistical estimation when p is much
larger than n" [math/0506081]
Comment: Published at http://dx.doi.org/10.1214/009053607000000424 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Discussion of: Treelets--An adaptive multi-scale basis for sparse unordered data
Discussion of "Treelets--An adaptive multi-scale basis for sparse unordered
data" [arXiv:0707.0481]
Comment: Published at http://dx.doi.org/10.1214/08-AOAS137B in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Covariance regularization by thresholding
This paper considers regularizing a covariance matrix of p variables
estimated from n observations, by hard thresholding. We show that the
thresholded estimate is consistent in the operator norm as long as the true
covariance matrix is sparse in a suitable sense, the variables are Gaussian or
sub-Gaussian, and (log p)/n tends to 0, and obtain explicit rates. The results are
uniform over families of covariance matrices which satisfy a fairly natural
notion of sparsity. We discuss an intuitive resampling scheme for threshold
selection and prove a general cross-validation result that justifies this
approach. We also compare thresholding to other covariance estimators in
simulations and on an example from climate data.
Comment: Published at http://dx.doi.org/10.1214/08-AOS600 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
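As a rough illustration of the hard-thresholding estimator described in this abstract, a minimal sketch assuming numpy; the threshold level here is a fixed hypothetical value, whereas the paper selects it by a resampling scheme:

```python
import numpy as np

def hard_threshold_cov(X, t):
    """Hard-threshold the sample covariance of X (n observations x p variables).

    Off-diagonal entries with absolute value at most t are zeroed; the
    diagonal (the variances) is kept intact.
    """
    S = np.cov(X, rowvar=False)
    T = np.where(np.abs(S) > t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))  # never threshold the variances
    return T

# Illustration on independent Gaussian data (true covariance = identity),
# so most off-diagonal sample covariances fall below the threshold.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
T = hard_threshold_cov(X, t=0.3)
```

The thresholded estimate stays symmetric and keeps the sample variances, while small off-diagonal noise is set exactly to zero.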
Large Vector Auto Regressions
One popular approach for nonstructural economic and financial forecasting is
to include a large number of economic and financial variables, which has been
shown to lead to significant forecasting improvements, for example by
dynamic factor models. A challenging issue is to determine which variables and
which of their lags are relevant, especially when there is a mixture of serial
correlation (temporal dynamics), high dimensional (spatial) dependence
structure and moderate sample size (relative to dimensionality and lags). To
this end, an \textit{integrated} solution that addresses these three challenges
simultaneously is appealing. We study large vector autoregressions with
three types of estimates. We treat each variable's own lags differently from
other variables' lags, distinguish lags over time, and are able to
select the variables and lags simultaneously. We first show the consequences of
using a Lasso-type estimate directly for time series without accounting for the
temporal dependence. In contrast, our proposed method can still produce an
estimate as efficient as an \textit{oracle} under such scenarios. The tuning
parameters are chosen via a data-driven "rolling scheme" to optimize the
forecasting performance. A macroeconomic and financial forecasting problem is
considered to illustrate the method's superiority over existing estimators.
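A minimal sketch of selecting variables and lags jointly via a plain Lasso on a stacked lag design, assuming numpy; this baseline ignores the paper's differential treatment of own versus cross lags and its rolling tuning scheme, and the penalty level is a hypothetical fixed value:

```python
import numpy as np

def lagged_design(Y, L):
    """Stack L lags of a multivariate series Y (T x k) into a design matrix.

    Column (l - 1) * k + j of the design holds lag l of variable j, so a
    sparse fit can select individual variable/lag pairs.
    """
    T, k = Y.shape
    X = np.hstack([Y[L - l : T - l] for l in range(1, L + 1)])
    return X, Y[L:]

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate-descent Lasso (no intercept, no standardization)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual for coord j
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

# Simulate a 3-variable VAR(1) in which only variable 0's own first lag matters.
rng = np.random.default_rng(1)
n_obs, k, L = 300, 3, 2
A = np.zeros((k, k))
A[0, 0] = 0.8
Y = np.zeros((n_obs, k))
for t in range(1, n_obs):
    Y[t] = A @ Y[t - 1] + rng.standard_normal(k)
X, Z = lagged_design(Y, L)
beta = lasso_cd(X, Z[:, 0], lam=60.0)   # fit the equation for variable 0
```

With this design the fitted coefficient vector has one entry per variable/lag pair, so the nonzero pattern of `beta` is the selected set; here the own first lag (column 0) should dominate.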
Efficient independent component analysis
Independent component analysis (ICA) has been widely used for blind source
separation in many fields such as brain imaging analysis, signal processing and
telecommunication. Many statistical techniques based on M-estimates have been
proposed for estimating the mixing matrix. Recently, several nonparametric
methods have been developed, but in-depth analysis of asymptotic efficiency has
not been available. We analyze ICA using semiparametric theories and propose a
straightforward estimate based on the efficient score function by using
B-spline approximations. The estimate is asymptotically efficient under
moderate conditions and exhibits better performance than standard ICA methods
in a variety of simulations.
Comment: Published at http://dx.doi.org/10.1214/009053606000000939 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
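For context on the standard ICA methods the abstract compares against, a minimal FastICA sketch (tanh contrast, deflation) for a two-source noiseless mixture, assuming numpy; this is a baseline, not the B-spline efficient-score estimator the paper proposes:

```python
import numpy as np

def fastica_two(X, n_iter=200, seed=0):
    """Minimal FastICA (tanh contrast, deflation) for a two-source mixture.

    X: n samples x 2 mixed observations. Returns estimated sources,
    identifiable only up to sign and permutation.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    # Whiten via an eigendecomposition of the sample covariance.
    d, E = np.linalg.eigh(np.cov(Xc, rowvar=False))
    Z = Xc @ E @ np.diag(1.0 / np.sqrt(d))
    W = np.zeros((2, 2))
    for i in range(2):
        w = rng.standard_normal(2)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            wx = Z @ w
            g, gp = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
            # One-unit fixed-point update: E[Z g(w'Z)] - E[g'(w'Z)] w
            w_new = (Z * g[:, None]).mean(axis=0) - gp.mean() * w
            w_new -= W[:i].T @ (W[:i] @ w_new)   # deflation: stay orthogonal
            w_new /= np.linalg.norm(w_new)
            w = w_new
        W[i] = w
    return Z @ W.T

# Two independent uniform sources, linearly mixed.
rng = np.random.default_rng(1)
S = rng.uniform(-1.0, 1.0, size=(2000, 2))
A = np.array([[1.0, 0.5], [0.3, 1.0]])
X = S @ A.T
S_hat = fastica_two(X)
```

Because of the sign/permutation ambiguity, recovery is checked by matching each estimated component to the true source it correlates with most strongly.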
- …