Randomized Dimension Reduction on Massive Data
Scalability of statistical estimators is of increasing importance in modern
applications and dimension reduction is often used to extract relevant
information from data. A variety of popular dimension reduction approaches can
be framed as symmetric generalized eigendecomposition problems. In this paper
we outline how taking into account the low rank structure assumption implicit
in these dimension reduction approaches provides both computational and
statistical advantages. We adapt recent randomized low-rank approximation
algorithms to provide efficient solutions to three dimension reduction methods:
Principal Component Analysis (PCA), Sliced Inverse Regression (SIR), and
Localized Sliced Inverse Regression (LSIR). A key observation in this paper is
that randomization serves a dual role, improving both computational and
statistical performance. This point is highlighted in our experiments on real
and simulated data.
Comment: 31 pages, 6 figures. Key words: dimension reduction, generalized
eigendecomposition, low-rank, supervised, inverse regression, random
projections, randomized algorithms, Krylov subspace method
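As a rough illustration of the randomized low-rank idea mentioned in this abstract, the following is a minimal sketch of a randomized range finder applied to PCA. It is a generic textbook-style variant, not the authors' algorithm. The function name `randomized_pca` and the parameters `oversample` and `n_power_iter` are assumptions introduced here for illustration.

```python
import numpy as np

def randomized_pca(X, k, oversample=10, n_power_iter=2, seed=0):
    """Approximate the top-k principal components of X (n samples x p features).

    Hypothetical helper: a generic randomized range finder, not the
    procedure from the paper.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                    # center each feature
    n, p = Xc.shape
    ell = min(k + oversample, n, p)            # sketch size with oversampling
    omega = rng.standard_normal((p, ell))      # random Gaussian test matrix
    Q, _ = np.linalg.qr(Xc @ omega)            # orthonormal basis for the sketched range
    for _ in range(n_power_iter):              # power iterations sharpen the basis
        Z, _ = np.linalg.qr(Xc.T @ Q)
        Q, _ = np.linalg.qr(Xc @ Z)
    B = Q.T @ Xc                               # small (ell x p) projected matrix
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    components = Vt[:k]                        # rows are approximate principal directions
    explained_variance = s[:k] ** 2 / (n - 1)
    return components, explained_variance
```

The expensive decomposition is performed only on the small matrix `B`, which is where the computational saving comes from when the data are close to low rank.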
Penalized Orthogonal Iteration for Sparse Estimation of Generalized Eigenvalue Problem
We propose a new algorithm for sparse estimation of eigenvectors in
generalized eigenvalue problems (GEP). The GEP arises in a number of modern
data-analytic situations and statistical methods, including principal component
analysis (PCA), multiclass linear discriminant analysis (LDA), canonical
correlation analysis (CCA), sufficient dimension reduction (SDR) and invariant
co-ordinate selection. We propose to modify the standard generalized orthogonal
iteration with a sparsity-inducing penalty for the eigenvectors. To achieve
this goal, we generalize the equation-solving step of orthogonal iteration to a
penalized convex optimization problem. The resulting algorithm, called
penalized orthogonal iteration, provides accurate estimation of the true
eigenspace, when it is sparse. Also proposed is a computationally more
efficient alternative, which works well for PCA and LDA problems. Numerical
studies reveal that the proposed algorithms are competitive, and that our
tuning procedure works well. We demonstrate applications of the proposed
algorithm to obtain sparse estimates for PCA, multiclass LDA, CCA and SDR.
Supplementary materials are available online.
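For orientation, here is a minimal sketch of the standard (unpenalized) generalized orthogonal iteration that the abstract says it modifies, for the problem A v = λ B v with A symmetric and B symmetric positive definite. The function name and parameters are assumptions for illustration. The sparsity-inducing penalty is not implemented; according to the abstract, the proposed method replaces the linear solve below with a penalized convex optimization problem.

```python
import numpy as np

def generalized_orthogonal_iteration(A, B, k, n_iter=200, seed=0):
    """Estimate the leading k-dimensional eigenspace of A v = lambda B v.

    Hypothetical baseline sketch: plain orthogonal iteration without
    any penalty. Assumes B is symmetric positive definite.
    """
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    V, _ = np.linalg.qr(rng.standard_normal((p, k)))   # random orthonormal start
    for _ in range(n_iter):
        Z = np.linalg.solve(B, A @ V)   # equation-solving step: B Z = A V
        V, _ = np.linalg.qr(Z)          # re-orthonormalize the columns
    return V                            # orthonormal basis of the estimated eigenspace
```

The returned columns span the estimated eigenspace but are orthonormal in the Euclidean sense, not B-orthonormal; individual eigenvectors would need a further small Rayleigh-Ritz step.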
Online learning algorithms for principal component analysis applied on single-lead ECGs
This publication is freely accessible with the permission of the rights owner under an Alliance licence or a national licence (funded by the DFG, German Research Foundation). This article evaluates several adaptive approaches to solving the principal component analysis (PCA) problem for single-lead ECGs. Recent studies have shown that the principal components can indicate morphologically or environmentally induced changes in the ECG signal and can be used to extract other vital information such as respiratory activity. Special attention is given to the convergence behavior of the selected gradient algorithms, which is a major criterion for the usability of the results. Because the right choice of learning rates is highly data-dependent and sensitive to movement artifacts, a new measurement system was designed that uses acceleration data to improve the performance of the online algorithms. As the PCA results seem very promising, we propose applying a single-channel independent component analysis (SCICA) as a second step, which is also investigated in this paper.
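The abstract does not name the gradient algorithms it evaluates. As one well-known example of an online gradient rule for PCA, the sketch below uses Oja's rule to track the first principal component one sample at a time. The function name, the fixed `learning_rate`, and the use of generic zero-mean samples in place of ECG beats are all assumptions for illustration.

```python
import numpy as np

def oja_first_component(samples, learning_rate=0.01, n_epochs=20, seed=0):
    """Estimate the first principal direction with Oja's online rule.

    Hypothetical sketch: `samples` is an (n, d) array of zero-mean
    observations processed one row at a time.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(samples.shape[1])
    w /= np.linalg.norm(w)                        # start from a random unit vector
    for _ in range(n_epochs):
        for x in samples:
            y = w @ x                             # projection onto current estimate
            w += learning_rate * y * (x - y * w)  # Oja's gradient update
            w /= np.linalg.norm(w)                # renormalize for numerical stability
    return w
```

With a fixed step size the estimate keeps fluctuating around the true direction, and the size of that fluctuation depends on the data scale, which is consistent with the abstract's remark that the choice of learning rate is data-dependent.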
A local Gaussian filter and adaptive morphology as tools for completing partially discontinuous curves
This paper presents a method for extraction and analysis of curve-type
structures which consist of disconnected components. Such structures are found
in electron-microscopy (EM) images of metal nanograins, which are widely used
in the field of nanosensor technology.
The topography of metal nanograins in compound nanomaterials is crucial to
nanosensor characteristics. The method of completing such templates consists of
three steps. In the first step, a local Gaussian filter is used with different
weights for each neighborhood. In the second step, an adaptive morphology
operation is applied to detect the endpoints of curve segments and connect
them. In the last step, pruning is employed to extract a curve which optimally
fits the template
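To make the endpoint-detection part of the second step more concrete, here is a minimal sketch that finds endpoints on a one-pixel-wide binary curve by counting 8-connected neighbors. This is a simple stand-in, not the adaptive morphology operation described in the abstract, and the function name `find_endpoints` is an assumption.

```python
import numpy as np

def find_endpoints(binary_curve):
    """Return (row, col) pixels of a thin curve that have exactly one 8-neighbor.

    Hypothetical sketch: assumes the curve has already been thinned
    to one-pixel width.
    """
    img = np.asarray(binary_curve, dtype=bool)
    rows, cols = img.shape
    padded = np.pad(img, 1).astype(int)           # zero border avoids edge checks
    # Count the 8 neighbors of every pixel by summing shifted copies of the image.
    neighbors = sum(
        padded[1 + dr : 1 + dr + rows, 1 + dc : 1 + dc + cols]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    endpoints = np.argwhere(img & (neighbors == 1))
    return [(int(r), int(c)) for r, c in endpoints]
```

Pairs of endpoints that face each other across a gap are the natural candidates for the connection step; how those pairs are chosen and joined is specific to the paper and is not sketched here.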
- …