3,187 research outputs found
Gaussian Graphical Model Estimation with False Discovery Rate Control
This paper studies the estimation of high-dimensional Gaussian graphical models (GGM). Existing methods typically rely on regularization techniques and therefore require the choice of a regularization parameter. However, the precise relationship between the regularization parameter and the number of false edges in the estimated GGM is unclear, which makes it impossible to evaluate their error control rigorously. In this paper, we propose an alternative approach based on multiple testing. Using new test statistics for conditional dependence, we develop a procedure that tests all conditional dependencies in the GGM simultaneously and controls the false discovery rate (FDR) asymptotically. Numerical results show that the proposed method performs well.
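For intuition, here is a minimal sketch of casting GGM edge selection as simultaneous testing with FDR control, assuming a Gaussian sample with n > p: estimate partial correlations, convert them to p-values via the Fisher transform, and apply a Benjamini-Hochberg step-up rule. This is not the paper's test statistic or its FDR calibration (which is designed for the high-dimensional setting), and the function name is a placeholder.

    # Generic sketch: partial-correlation tests + Benjamini-Hochberg FDR control.
    # Assumes n > p so the sample precision matrix exists; not the paper's method.
    import numpy as np
    from scipy import stats

    def ggm_edges_fdr(X, alpha=0.1):
        """Adjacency matrix of edges kept after BH correction at level alpha."""
        n, p = X.shape
        prec = np.linalg.inv(np.cov(X, rowvar=False))      # sample precision matrix
        d = np.sqrt(np.diag(prec))
        pcorr = -prec / np.outer(d, d)                     # partial correlations
        iu = np.triu_indices(p, k=1)
        r = pcorr[iu]
        # Fisher z-transform; the scaling accounts for the p-2 conditioning variables
        z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - p - 1)
        pvals = 2 * stats.norm.sf(np.abs(z))
        # Benjamini-Hochberg step-up procedure
        m = len(pvals)
        order = np.argsort(pvals)
        passed = pvals[order] <= alpha * np.arange(1, m + 1) / m
        k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
        keep = np.zeros(m, dtype=bool)
        keep[order[:k]] = True
        adj = np.zeros((p, p), dtype=bool)
        adj[iu] = keep
        return adj | adj.T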
A Direct Estimation Approach to Sparse Linear Discriminant Analysis
This paper considers sparse linear discriminant analysis of high-dimensional
data. In contrast to existing methods, which are based on separate estimation of the precision matrix $\Omega$ and the difference $\delta$ of the mean vectors, we introduce a simple and effective classifier that estimates the product $\Omega\delta$ directly through constrained minimization. The estimator can be implemented efficiently using linear programming, and the resulting classifier is called the linear programming discriminant (LPD) rule. The LPD rule is shown to have desirable theoretical and numerical properties. It exploits the approximate sparsity of $\Omega\delta$ and, as a consequence, can still perform well even when $\Omega$ and/or $\delta$ cannot be estimated consistently. Asymptotic properties of the LPD rule are investigated, and consistency and rate-of-convergence results are given. The LPD classifier has superior finite-sample performance and significant computational advantages over existing methods that require separate estimation of $\Omega$ and $\delta$. The LPD rule is also applied to analyze real datasets from lung cancer and leukemia studies, where it performs favorably in comparison to existing methods. Comment: 39 pages. To appear in the Journal of the American Statistical Association.
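The constrained-minimization idea can be sketched as a linear program: minimize $\|\beta\|_1$ subject to $\|\hat\Sigma\beta-\hat\delta\|_\infty\le\lambda$. The sketch below, assuming estimates sigma_hat and delta_hat and a placeholder tuning parameter lam, is an illustration of this formulation rather than the paper's exact recipe.

    # Sketch of the L1-minimization with sup-norm constraint behind the LPD idea.
    import numpy as np
    from scipy.optimize import linprog

    def lpd_direction(sigma_hat, delta_hat, lam):
        """Solve min ||beta||_1 s.t. ||sigma_hat @ beta - delta_hat||_inf <= lam."""
        p = len(delta_hat)
        # Write beta = u - v with u, v >= 0 and minimize 1'u + 1'v.
        c = np.ones(2 * p)
        A = np.hstack([sigma_hat, -sigma_hat])          # A @ [u; v] = sigma_hat @ beta
        A_ub = np.vstack([A, -A])
        b_ub = np.concatenate([lam + delta_hat, lam - delta_hat])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * (2 * p), method="highs")
        uv = res.x
        return uv[:p] - uv[p:]

A new observation x would then be assigned to the first class when (x - 0.5 * (mu1_hat + mu2_hat)) @ beta_hat >= 0 and to the second class otherwise, with mu1_hat and mu2_hat denoting the estimated class means.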
Simultaneous nonparametric inference of time series
We consider kernel estimation of marginal densities and regression functions
of stationary processes. It is shown that for a wide class of time series, with
proper centering and scaling, the maximum deviations of kernel density and
regression estimates are asymptotically Gumbel. Our results substantially
generalize earlier ones which were obtained under independence or beta mixing
assumptions. The asymptotic results can be applied to assess patterns of
marginal densities or regression functions via the construction of simultaneous
confidence bands for which one can perform goodness-of-fit tests. As an
application, we construct simultaneous confidence bands for drift and
volatility functions in a dynamic short-term rate model for the U.S. Treasury
yield curve rates data. Comment: Published at http://dx.doi.org/10.1214/09-AOS789 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
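As a rough illustration of a simultaneous confidence band for a kernel regression function, the sketch below uses a wild-bootstrap approximation of the maximum deviation over a grid as a stand-in for the paper's Gumbel-limit critical value; it ignores serial dependence, and all names and constants are assumptions.

    # Simultaneous band for a Nadaraya-Watson estimate via a wild bootstrap
    # of the maximum deviation over a grid (a stand-in, not the paper's method).
    import numpy as np

    def nw_estimate(x0, x, y, h):
        """Nadaraya-Watson estimate of E[Y | X = x0] with a Gaussian kernel."""
        w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
        return (w @ y) / w.sum(axis=1)

    def simultaneous_band(x, y, grid, h, level=0.95, B=500, seed=None):
        rng = np.random.default_rng(seed)
        m_hat = nw_estimate(grid, x, y, h)
        fitted = nw_estimate(x, x, y, h)
        resid = y - fitted
        max_dev = np.empty(B)
        for b in range(B):
            # Rademacher multipliers on residuals, refit, track the sup deviation
            y_star = fitted + resid * rng.choice([-1.0, 1.0], size=len(y))
            max_dev[b] = np.max(np.abs(nw_estimate(grid, x, y_star, h) - m_hat))
        c = np.quantile(max_dev, level)
        return m_hat - c, m_hat + c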
Self-normalized Cram\'{e}r type moderate deviations for the maximum of sums
Let $X_1, X_2, \ldots$ be independent random variables with zero means and finite variances, and let $S_n=\sum_{i=1}^n X_i$ and $V_n^2=\sum_{i=1}^n X_i^2$. A Cram\'{e}r type moderate deviation for the maximum of the self-normalized sums $\max_{1\le k\le n} S_k/V_n$ is obtained. In particular, for identically distributed $X_1,\ldots,X_n$ it is proved that $P\bigl(\max_{1\le k\le n} S_k \ge x V_n\bigr)/\bigl(1-\Phi(x)\bigr)\to 2$ uniformly for $x\in[0,\,o(n^{1/6}))$ under the optimal finite third moment of $X_1$. Comment: Published at http://dx.doi.org/10.3150/12-BEJ415 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
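A small Monte Carlo check of the statement (an illustration under assumed simulation settings, not part of the paper): for i.i.d. centered variables with a finite third moment, the tail probability of the maximum of the self-normalized sums should be roughly twice the standard normal tail at moderate x.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n, reps, x = 500, 20000, 2.0
    X = rng.standard_exponential((reps, n)) - 1.0   # centered, finite third moment
    S = np.cumsum(X, axis=1)                        # partial sums S_k per replication
    V = np.sqrt(np.sum(X ** 2, axis=1))             # self-normalizer V_n per replication
    prob = np.mean(S.max(axis=1) >= x * V)          # empirical P(max_k S_k >= x V_n)
    print(prob, 2 * norm.sf(x))                     # compare with 2 * (1 - Phi(x))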
Adaptive Thresholding for Sparse Covariance Matrix Estimation
In this paper we consider estimation of sparse covariance matrices and
propose a thresholding procedure which is adaptive to the variability of
individual entries. The estimators are fully data driven and enjoy excellent
performance both theoretically and numerically. It is shown that the estimators
adaptively achieve the optimal rate of convergence over a large class of sparse
covariance matrices under the spectral norm. In contrast, the commonly used
universal thresholding estimators are shown to be sub-optimal over the same
parameter spaces. Support recovery is also discussed. The adaptive thresholding
estimators are easy to implement. Numerical performance of the estimators is
studied using both simulated and real data. Simulation results show that the
adaptive thresholding estimators uniformly outperform the universal
thresholding estimators. The method is also illustrated in an analysis of a dataset from a small round blue-cell tumor microarray experiment. A supplement to this paper, which contains additional technical proofs, is available online. Comment: To appear in the Journal of the American Statistical Association.
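A minimal sketch of the entry-adaptive idea, assuming centered data and a placeholder tuning constant delta: each entry of the sample covariance gets its own threshold, scaled by an estimate of that entry's variability. The soft-thresholding choice below is illustrative rather than a prescription from the paper.

    import numpy as np

    def adaptive_threshold_cov(X, delta=2.0):
        """Entry-wise adaptively thresholded covariance estimator (sketch)."""
        n, p = X.shape
        Xc = X - X.mean(axis=0)
        S = Xc.T @ Xc / n                                    # sample covariance
        # theta_ij: estimated variance of each sample-covariance entry
        theta = np.einsum("ki,kj->ij", Xc ** 2, Xc ** 2) / n - S ** 2
        lam = delta * np.sqrt(theta * np.log(p) / n)         # entry-wise thresholds
        out = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)  # soft-threshold entries
        np.fill_diagonal(out, np.diag(S))                    # keep the diagonal intact
        return out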
…