NBLDA: Negative Binomial Linear Discriminant Analysis for RNA-Seq Data
RNA-sequencing (RNA-Seq) has become a powerful technology to characterize
gene expression profiles because it is more accurate and comprehensive than
microarrays. Although statistical methods that have been developed for
microarray data can be applied to RNA-Seq data, they are not ideal due to the
discrete nature of RNA-Seq data. The Poisson distribution and negative binomial
distribution are commonly used to model count data. Recently, Witten (2011)
proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson
assumption may not be as appropriate as the negative binomial distribution when
biological replicates are available and in the presence of overdispersion
(i.e., when the variance is larger than the mean). However, it is more
complicated to model negative binomial variables because they involve a
dispersion parameter that needs to be estimated. In this paper, we propose a
negative binomial linear discriminant analysis for RNA-Seq data. By Bayes'
rule, we construct the classifier by fitting a negative binomial model, and
propose some plug-in rules to estimate the unknown parameters in the
classifier. The relationship between the negative binomial classifier and the
Poisson classifier is explored, with a numerical investigation of the impact of
dispersion on the discriminant score. Simulation results show the superiority
of our proposed method. We also analyze four real RNA-Seq data sets to
demonstrate the advantage of our method in real-world applications.
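To make the Bayes'-rule construction concrete, the following is a minimal sketch of a classifier that scores a sample by its negative binomial log-likelihood plus the log prior of each class. It is not the paper's estimator: the class means, dispersions, and priors are assumed to be known here, whereas the abstract describes plug-in rules for estimating them. The function names (`nb_log_pmf`, `poisson_log_pmf`, `discriminant_scores`, `classify`) and the parameterization (variance = mu + phi * mu^2) are illustrative assumptions.

```python
import math


def nb_log_pmf(x, mu, phi):
    """Log-probability of count x under a negative binomial with mean mu
    and dispersion phi, parameterized so that variance = mu + phi * mu**2."""
    r = 1.0 / phi  # "size" parameter of the negative binomial
    return (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
            + x * math.log(phi * mu / (1.0 + phi * mu))
            - r * math.log(1.0 + phi * mu))


def poisson_log_pmf(x, mu):
    """Log-probability of count x under a Poisson with mean mu."""
    return x * math.log(mu) - mu - math.lgamma(x + 1)


def discriminant_scores(x, class_means, dispersions, priors):
    """One score per class: sum of per-gene NB log-likelihoods plus log prior.

    x            -- list of counts, one per gene
    class_means  -- class_means[k][g] is the assumed mean of gene g in class k
    dispersions  -- dispersions[g] is the assumed dispersion of gene g
    priors       -- priors[k] is the prior probability of class k
    """
    scores = []
    for mus, prior in zip(class_means, priors):
        log_lik = sum(nb_log_pmf(xg, mu, phi)
                      for xg, mu, phi in zip(x, mus, dispersions))
        scores.append(log_lik + math.log(prior))
    return scores


def classify(x, class_means, dispersions, priors):
    """Index of the class with the largest discriminant score."""
    scores = discriminant_scores(x, class_means, dispersions, priors)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical two-class, three-gene example (all values made up).
class_means = [[10.0, 50.0, 5.0], [40.0, 12.0, 5.0]]
dispersions = [0.2, 0.2, 0.2]
priors = [0.5, 0.5]
label = classify([12, 45, 4], class_means, dispersions, priors)
```

As the dispersion approaches zero, `nb_log_pmf` approaches `poisson_log_pmf`, which is one way to see the relationship between the negative binomial and Poisson classifiers that the abstract mentions.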
Multi-level Feature Fusion-based CNN for Local Climate Zone Classification from Sentinel-2 Images: Benchmark Results on the So2Sat LCZ42 Dataset
As a unique classification scheme for urban forms and functions, the local
climate zone (LCZ) system provides essential general information for any
studies related to urban environments, especially on a large scale. Remote
sensing data-based classification approaches are the key to large-scale mapping
and monitoring of LCZs. The potential of deep learning-based approaches is not
yet fully explored, even though advanced convolutional neural networks (CNNs)
continue to push the frontiers for various computer vision tasks. One reason is
that published studies are based on different datasets, usually at a regional
scale, which makes it impossible to fairly and consistently compare the
potential of different CNNs for real-world scenarios. This study is based on
the big So2Sat LCZ42 benchmark dataset dedicated to LCZ classification. Using
this dataset, we studied a range of CNNs of varying sizes. In addition, we
proposed a CNN, Sen2LCZ-Net, to classify LCZs from Sentinel-2 images. Building
on this base network, we propose fusing multi-level features in the extended
Sen2LCZ-Net-MF. With this proposed simple network architecture and the highly
competitive benchmark dataset, we obtain results that are better than those
obtained by the state-of-the-art CNNs, while requiring less computation with
fewer layers and parameters. Large-scale LCZ classification examples of
completely unseen areas are presented, demonstrating the potential of our
proposed Sen2LCZ-Net-MF as well as the So2Sat LCZ42 dataset. We also
intensively investigated the influence of network depth and width and the
effectiveness of the design choices made for Sen2LCZ-Net-MF. Our work will
provide important baselines for future CNN-based algorithm developments for
both LCZ classification and other urban land cover and land use classification
- …