82 research outputs found
Optimal variance estimation without estimating the mean function
We study the least squares estimator in the residual variance estimation
context. We show that the mean squared differences of paired observations are
asymptotically normally distributed. We further establish that, by regressing
the mean squared differences of these paired observations on the squared
distances between paired covariates via a simple least squares procedure, the
resulting variance estimator is not only asymptotically normal and root-n
consistent, but also reaches the optimal bound in terms of estimation variance.
We also demonstrate the advantage of the least squares estimator in comparison
with existing methods in terms of the second-order asymptotic properties.
Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) at
http://dx.doi.org/10.3150/12-BEJ432 by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
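The regression idea in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a sorted, equally spaced design, uses an unweighted simple regression, and takes the fitted intercept as the variance estimate. The function names, the choice `max_lag=8`, and the simulated sine data are all assumptions made for the example.

```python
import math
import random


def lag_half_mean_squared_differences(y, max_lag):
    """Return s_k = sum_i (y[i+k] - y[i])^2 / (2 (n - k)) for k = 1..max_lag."""
    n = len(y)
    out = []
    for k in range(1, max_lag + 1):
        total = sum((y[i + k] - y[i]) ** 2 for i in range(n - k))
        out.append(total / (2.0 * (n - k)))
    return out


def least_squares_variance(x, y, max_lag):
    """Intercept of the simple regression of s_k on squared lag distances d_k.

    Assumption: x is sorted and equally spaced, so the distance at lag k
    is k * (x[1] - x[0]).
    """
    spacing = x[1] - x[0]
    d = [(k * spacing) ** 2 for k in range(1, max_lag + 1)]
    s = lag_half_mean_squared_differences(y, max_lag)
    d_bar = sum(d) / len(d)
    s_bar = sum(s) / len(s)
    sxy = sum((dk - d_bar) * (sk - s_bar) for dk, sk in zip(d, s))
    sxx = sum((dk - d_bar) ** 2 for dk in d)
    slope = sxy / sxx
    # The intercept extrapolates s_k to distance zero, where only noise remains.
    return s_bar - slope * d_bar


# Hypothetical data: smooth mean function plus Gaussian noise with variance 0.25.
rng = random.Random(0)
n = 500
x = [i / n for i in range(n)]
y = [math.sin(2 * math.pi * xi) + rng.gauss(0.0, 0.5) for xi in x]
sigma2_hat = least_squares_variance(x, y, max_lag=8)
print(sigma2_hat)
```

The reasoning behind the sketch: half the expected squared difference at lag k is the noise variance plus a term that grows roughly linearly in the squared distance, so the intercept isolates the variance without fitting the mean function.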
NBLDA: Negative Binomial Linear Discriminant Analysis for RNA-Seq Data
RNA-sequencing (RNA-Seq) has become a powerful technology to characterize
gene expression profiles because it is more accurate and comprehensive than
microarrays. Although statistical methods that have been developed for
microarray data can be applied to RNA-Seq data, they are not ideal due to the
discrete nature of RNA-Seq data. The Poisson distribution and negative binomial
distribution are commonly used to model count data. Recently, Witten (2011)
proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson
assumption may not be as appropriate as the negative binomial distribution when
biological replicates are available and in the presence of overdispersion
(i.e., when the variance is larger than the mean). However, it is more
complicated to model negative binomial variables because they involve a
dispersion parameter that needs to be estimated. In this paper, we propose a
negative binomial linear discriminant analysis for RNA-Seq data. By Bayes'
rule, we construct the classifier by fitting a negative binomial model, and
propose some plug-in rules to estimate the unknown parameters in the
classifier. The relationship between the negative binomial classifier and the
Poisson classifier is explored, with a numerical investigation of the impact of
dispersion on the discriminant score. Simulation results show the superiority
of our proposed method. We also analyze four real RNA-Seq data sets to
demonstrate the advantage of our method in real-world applications.
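To make the Bayes-rule construction concrete, here is a hedged sketch of a discriminant score derived from a generic negative binomial likelihood with mean mu and dispersion phi (variance mu + phi * mu^2). It is not taken from the paper: the parameters are assumed to be already estimated, the class means are assumed to include any library-size normalization, and the function names and toy numbers are hypothetical. Terms that do not depend on the class are dropped because they do not affect the argmax.

```python
import math


def nb_discriminant_score(x, mu_k, phi, prior_k):
    """Class-dependent part of the negative binomial log posterior.

    x: counts per gene; mu_k: class-k means per gene (all > 0);
    phi: per-gene dispersions (all > 0); prior_k: class prior probability.
    """
    score = math.log(prior_k)
    for xg, mg, pg in zip(x, mu_k, phi):
        log1p_term = math.log1p(pg * mg)
        score += xg * (math.log(mg) - log1p_term) - log1p_term / pg
    return score


def poisson_discriminant_score(x, mu_k, prior_k):
    """Class-dependent part of the Poisson log posterior, for comparison."""
    score = math.log(prior_k)
    for xg, mg in zip(x, mu_k):
        score += xg * math.log(mg) - mg
    return score


def classify(x, class_means, phi, priors):
    """Return the label with the largest negative binomial score."""
    scores = {
        label: nb_discriminant_score(x, class_means[label], phi, priors[label])
        for label in class_means
    }
    return max(scores, key=scores.get)


# Hypothetical two-gene, two-class toy example.
class_means = {"A": [50.0, 10.0], "B": [10.0, 50.0]}
priors = {"A": 0.5, "B": 0.5}
label = classify([60, 5], class_means, phi=[0.2, 0.2], priors=priors)
print(label)
```

As the dispersion tends to zero, `log1p(phi * mu) / phi` tends to `mu` and the count-weighted correction vanishes, so the negative binomial score approaches the Poisson score. This is one way to see the relationship between the two classifiers that the abstract mentions.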
Optimal-k difference sequence in nonparametric regression
Difference-based methods have been attracting increasing attention in
nonparametric regression, in particular for estimating the residual variance. To
implement the estimation, one needs to choose an appropriate difference
sequence, mainly between the optimal difference sequence and the
ordinary difference sequence. Difference sequence selection is a
fundamental problem in nonparametric regression, and it has remained a
controversial issue for over three decades. In this paper, we propose to tackle
this challenging issue from a unique perspective, namely by introducing a new
difference sequence called the optimal-k difference sequence. The new
difference sequence not only provides a better bias-variance trade-off, but
also dramatically enlarges the existing family of difference sequences, with
the optimal and ordinary difference sequences as two important special cases.
We further demonstrate, by both theoretical and numerical studies, that the
optimal-k difference sequence pushes the boundaries of our knowledge in
difference-based methods in nonparametric regression, and that it always
performs the best in practical situations.
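For readers unfamiliar with difference-based variance estimation, the following is a minimal sketch of the general recipe, assuming observations ordered by the covariate. It builds only the ordinary difference sequence (normalized binomial coefficients with alternating signs); the optimal-k sequence is not defined in this abstract, so it is not reproduced here. Function names and the simulated data are assumptions for illustration.

```python
import math
import random


def ordinary_difference_sequence(order):
    """Ordinary difference sequence: entries sum to 0, squares sum to 1."""
    norm = math.sqrt(math.comb(2 * order, order))
    return [(-1) ** j * math.comb(order, j) / norm for j in range(order + 1)]


def difference_based_variance(y, seq):
    """Average of squared weighted differences of consecutive observations.

    Any sequence whose entries sum to 0 and whose squares sum to 1 can be
    passed as `seq`.
    """
    r = len(seq) - 1
    n = len(y)
    total = 0.0
    for i in range(n - r):
        diff = sum(dj * y[i + j] for j, dj in enumerate(seq))
        total += diff * diff
    return total / (n - r)


# Hypothetical data: smooth mean function plus Gaussian noise with variance 0.25.
rng = random.Random(1)
n = 500
y = [math.sin(2 * math.pi * i / n) + rng.gauss(0.0, 0.5) for i in range(n)]
estimate = difference_based_variance(y, ordinary_difference_sequence(2))
print(estimate)
```

The zero-sum constraint removes a locally constant mean, and the unit sum of squares keeps the noise variance on the right scale. Higher orders of the ordinary sequence remove more of the mean function at the cost of a larger estimation variance, which is the trade-off the abstract refers to.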
- …