Partial Least Squares: A Versatile Tool for the Analysis of High-Dimensional Genomic Data
Partial Least Squares (PLS) is a highly efficient statistical regression technique that is well suited for the analysis of high-dimensional genomic data. In this paper we review the theory and applications of PLS from both methodological and biological points of view. Focusing on microarray expression data, we provide a systematic comparison of the PLS approaches currently employed and discuss problems as diverse as tumor classification, identification of relevant genes, survival analysis, and the modeling of gene networks.
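To make the regime the abstract describes concrete (far more genes than samples), here is a minimal sketch using scikit-learn's PLSRegression on synthetic data; the dimensions, component count, and "relevant gene" setup are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: PLS regression on simulated high-dimensional "expression" data.
# All sizes and the signal structure are assumptions for illustration only.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n_samples, n_genes = 50, 2000          # typical microarray regime: p >> n
X = rng.standard_normal((n_samples, n_genes))
beta = np.zeros(n_genes)
beta[:10] = 1.0                        # assume only 10 "relevant" genes carry signal
y = X @ beta + 0.5 * rng.standard_normal(n_samples)

pls = PLSRegression(n_components=3)    # a few latent components suffice despite p >> n
pls.fit(X, y)
print("train R^2:", pls.score(X, y))

# Genes with large first-component weights are candidates for relevance:
top = np.argsort(np.abs(pls.x_weights_[:, 0]))[::-1][:10]
print("top genes by |weight|:", np.sort(top))
```

The classification, survival-analysis, and gene-network settings surveyed in the paper would swap in a different response and downstream model, but the latent-component machinery is the same.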
Sequential Logistic Principal Component Analysis (SLPCA): Dimensional Reduction in Streaming Multivariate Binary-State System
Sequential, or online, dimensional reduction is of interest due to the explosion of streaming-data applications and the need for adaptive statistical modeling in many emerging fields, such as the modeling of energy end-use profiles. Principal Component Analysis (PCA) is the classical approach to dimensional reduction. However, traditional PCA based on the Singular Value Decomposition (SVD) fails to model data that deviates substantially from the Gaussian distribution. The Bregman divergence was recently introduced to obtain a generalized PCA framework. If the random variable under dimensional reduction follows a Bernoulli distribution, which occurs in many emerging fields, the generalized PCA is called Logistic PCA (LPCA). In this paper, we extend batch LPCA to a sequential version (SLPCA) based on sequential convex optimization theory. The convergence of this algorithm is discussed in comparison to the batch version (BLPCA), as is its performance in reducing the dimension of multivariate binary-state systems. Its application to building energy end-use profile modeling is also investigated.
Comment: 6 pages, 4 figures, conference submission
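To make the streaming setting concrete, below is a generic stochastic-gradient sketch of logistic PCA on a simulated Bernoulli stream. This is not the paper's SLPCA algorithm (which rests on sequential convex optimization); the update rule, step sizes, and dimensions are illustrative assumptions.

```python
# Sketch of an online logistic-PCA-style update on streaming binary data.
# Generic stochastic gradient on the Bernoulli log-likelihood with a logit
# link; NOT the SLPCA algorithm of the paper. Constants are ad hoc.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d, k = 20, 3                                  # binary-state dimension, reduced dimension
U = rng.standard_normal((d, k)) / np.sqrt(d)  # loading matrix, updated online
A_true = rng.standard_normal((d, k))          # used only to simulate the stream

def streaming_lpca_step(U, x, lr=0.1, inner=20):
    """One online update: infer a latent code for binary sample x by gradient
    ascent on the Bernoulli log-likelihood, then take one gradient step in U."""
    a = np.zeros(k)
    for _ in range(inner):
        a += lr * U.T @ (x - sigmoid(U @ a))       # d/da of sum(x*th - log(1+e^th))
    U = U + lr * np.outer(x - sigmoid(U @ a), a)   # d/dU of the same log-likelihood
    return U

for t in range(2000):                         # one binary sample arrives at a time
    x = (rng.random(d) < sigmoid(A_true @ rng.standard_normal(k))).astype(float)
    U = streaming_lpca_step(U, x)

# Sanity check: singular values near 1 indicate the learned loadings span a
# subspace close to that of A_true.
print("subspace overlap:",
      np.linalg.svd(np.linalg.qr(U)[0].T @ np.linalg.qr(A_true)[0])[1])
```

Unlike SVD-based PCA, nothing here assumes Gaussian residuals: the reconstruction is scored by the Bernoulli likelihood, which is the point of the logistic generalization.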
Bayesian Inference on Matrix Manifolds for Linear Dimensionality Reduction
We reframe linear dimensionality reduction as a problem of Bayesian inference
on matrix manifolds. This natural paradigm extends the Bayesian framework to
dimensionality reduction tasks in higher dimensions with simpler models at
greater speeds. Here an orthogonal basis is treated as a single point on a
manifold and is associated with a linear subspace on which observations vary
maximally. Throughout this paper, we employ the Grassmann and Stiefel manifolds
for various dimensionality reduction problems, explore the connection between
the two manifolds, and use Hybrid Monte Carlo for posterior sampling on the
Grassmannian for the first time. We delineate in which situations either
manifold should be considered. Further, matrix manifold models are used to
yield scientific insight in the context of cognitive neuroscience, and we
conclude that our methods are suitable for basic inference as well as accurate
prediction.
Comment: All datasets and computer programs are publicly available at http://www.ics.uci.edu/~babaks/Site/Codes.htm
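For readers unfamiliar with the manifold view, the sketch below shows its non-Bayesian core: an orthonormal basis is a single point on the Stiefel manifold St(d, k), and the subspace on which observations vary maximally can be found by gradient ascent with a QR retraction. The optimizer and all constants are illustrative assumptions; the paper's actual contribution, posterior sampling with Hybrid Monte Carlo on the Grassmannian, is considerably more involved.

```python
# Minimal sketch: an orthonormal basis as a point on the Stiefel manifold,
# optimized by gradient ascent + QR retraction to maximize captured variance.
# Illustrative only; not the paper's Bayesian (HMC) sampler.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 2, 500
X = rng.standard_normal((n, d)) @ np.diag(np.linspace(2.0, 0.1, d))  # anisotropic data
C = np.cov(X, rowvar=False)

W = np.linalg.qr(rng.standard_normal((d, k)))[0]   # random point on St(d, k)
for _ in range(200):
    G = 2 * C @ W                          # Euclidean gradient of trace(W.T C W)
    W, _ = np.linalg.qr(W + 0.05 * G)      # step, then retract back onto the manifold

# Columns of W now approximately span the top-k principal subspace; singular
# values near 1 confirm alignment with the leading eigenvectors of C.
evals, evecs = np.linalg.eigh(C)
print("subspace overlap:", np.linalg.svd(W.T @ evecs[:, -k:])[1])
```

Note that the objective depends on W only through its column span, which is why the Grassmannian (subspaces) rather than the Stiefel manifold (ordered orthonormal frames) is often the natural parameter space, as the abstract's comparison of the two manifolds suggests.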