Fast, Exact Bootstrap Principal Component Analysis for p>1 million
Many have suggested a bootstrap procedure for estimating the sampling
variability of principal component analysis (PCA) results. However, when the
number of measurements per subject (p) is much larger than the number of
subjects (n), calculating and storing the leading principal components from
each bootstrap sample can be computationally infeasible. To
address this, we outline methods for fast, exact calculation of bootstrap
principal components, eigenvalues, and scores. Our methods leverage the fact
that all bootstrap samples occupy the same n-dimensional subspace as the
original sample. As a result, all bootstrap principal components are limited to
the same n-dimensional subspace and can be efficiently represented by their
low-dimensional coordinates in that subspace. Several uncertainty metrics can
be computed solely from the bootstrap distribution of these low-dimensional
coordinates, without calculating or storing the p-dimensional bootstrap
components. Fast bootstrap PCA is applied to a dataset of sleep
electroencephalogram (EEG) recordings (, ), and to a dataset of
brain magnetic resonance images (MRIs) ( 3 million, ). For the
brain MRI dataset, our method allows for standard errors for the first 3
principal components based on 1000 bootstrap samples to be calculated on a
standard laptop in 47 minutes, as opposed to approximately 4 days with standard
methods.
Comment: 25 pages, including 9 figures and link to R package. 2014-05-14
update: final formatting edits for journal submission, condensed figure
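The subspace argument in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' R package; the function names (`low_dim_basis`, `bootstrap_pc_coords`, `pc_standard_errors`) are hypothetical, and the data matrix is assumed to be p x n with one column per subject. Each bootstrap SVD is taken on an n x n matrix of coordinates, so the p-dimensional components never need to be formed.

```python
import numpy as np

def low_dim_basis(Y):
    """Y is p x n (rows: measurements, columns: subjects); assumed layout.

    Returns V (p x n, orthonormal columns) and the n x n matrix of
    subject coordinates in that basis, so that Y_centered = V @ coords.
    """
    Yc = Y - Y.mean(axis=1, keepdims=True)           # center each measurement
    V, d, Ut = np.linalg.svd(Yc, full_matrices=False)
    return V, d[:, None] * Ut

def bootstrap_pc_coords(coords, idx, k):
    """Coordinates (n x k) of the first k bootstrap PCs in the basis V.

    idx holds the resampled subject indices. The p-dimensional bootstrap
    PCs would be V @ A, but they are not computed here.
    """
    S = coords[:, idx]
    S = S - S.mean(axis=1, keepdims=True)            # re-center the resample
    A = np.linalg.svd(S, full_matrices=False)[0][:, :k]
    # Align signs with the original PCs: <V[:, j], V @ A[:, j]> = A[j, j].
    signs = np.sign(np.diag(A[:k, :k]))
    signs[signs == 0] = 1.0
    return A * signs

def pc_standard_errors(V, pc_coords):
    """Elementwise bootstrap standard errors of one p-dimensional PC.

    pc_coords is (n_boot, n): the low-dimensional coordinates of that PC
    across bootstrap samples. Only their n x n covariance is needed.
    """
    cov = np.cov(pc_coords, rowvar=False)
    var = np.einsum("ik,kl,il->i", V, cov, V)        # diag(V @ cov @ V.T)
    return np.sqrt(np.maximum(var, 0.0))
```

In this sketch the only operations that touch the p dimension are the initial SVD and the final quadratic form in `pc_standard_errors`; everything inside the bootstrap loop works on n x n matrices.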
Identifying common dynamic features in stock returns
This paper proposes volatility and spectral based methods for cluster analysis of stock returns. Using the information about both the estimated parameters in the threshold GARCH (or TGARCH) equation and the periodogram of the squared returns, we compute a distance matrix for the stock returns. Clusters are formed by examining the hierarchical structure tree (or dendrogram) and the computed principal coordinates. We employ these techniques to investigate the similarities and dissimilarities between the "blue-chip" stocks used to compute the Dow Jones Industrial Average (DJIA) index.
Keywords: Asymmetric effects, Cluster analysis, DJIA stock returns, Periodogram, Threshold GARCH model, Volatility
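The spectral half of this distance can be sketched as follows. This is an illustration only: it leaves out the TGARCH parameters entirely, and the normalization (dividing each periodogram by its sum) and the Euclidean metric are assumptions rather than the paper's stated definition. The function names are hypothetical, and all series are assumed to have the same length and non-constant squared returns.

```python
import numpy as np

def periodogram_sq(returns):
    """Periodogram of the squared returns at the nonzero Fourier frequencies."""
    x = np.asarray(returns, dtype=float) ** 2
    x = x - x.mean()
    n = x.size
    m = (n - 1) // 2                        # frequencies j = 1, ..., m
    spec = np.abs(np.fft.rfft(x)) ** 2 / n
    return spec[1:m + 1]

def periodogram_distance_matrix(series):
    """Pairwise Euclidean distances between normalized periodograms.

    `series` is a sequence of equal-length return series (assumed).
    """
    P = np.array([periodogram_sq(s) for s in series])
    P = P / P.sum(axis=1, keepdims=True)    # assumed normalization: unit total power
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

The resulting matrix could then be passed to any hierarchical clustering or principal coordinates routine to produce the dendrogram and coordinates the abstract mentions.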
ANISAP: A three-dimensional finite element program for laminated composites subjected to mechanical loading
ANISAP is a 3-D finite element FORTRAN 77 computer code for linear elastic, small-strain analysis of laminated composites with arbitrary geometry, including free edges and holes. Individual layers may be isotropic or transversely isotropic in material principal coordinates, and may be rotated off-axis about a global z-axis. The laminate may be a hybrid. Program capabilities include three different isoparametric elements, variable order of Gaussian integration, calculation of stresses at element boundaries, and loading by either nodal displacements or forces. Post-processing capability includes failure analysis using the tensor polynomial failure criterion.
Reconstructing the free-energy landscape of Met-enkephalin using dihedral Principal Component Analysis and Well-tempered Metadynamics
Well-Tempered Metadynamics (WTmetaD) is an efficient method to enhance the
reconstruction of the free-energy surface of proteins. WTmetaD guarantees
faster convergence in the long-time limit than standard metadynamics.
However, it still suffers from the same limitation, i.e. the nontrivial
choice of pertinent collective variables (CVs). To circumvent this
problem, we couple WTmetaD with a set of CVs generated from a dihedral
Principal Component Analysis (dPCA) on the Ramachandran dihedral angles
describing the backbone structure of the protein. The dPCA provides a generic
method to extract relevant CVs built from internal coordinates. We illustrate
the robustness of this method in the case of the small and very diffusive
Met-enkephalin pentapeptide, and highlight a criterion to limit the number of
CVs necessary to bias the metadynamics simulation. The free-energy landscape
(FEL) of Met-enkephalin built on CVs generated from dPCA is found to be rugged
compared with the FEL built on CVs extracted from PCA of the Cartesian
coordinates of the atoms.
Comment: 17 pages, 9 figures (4 in color
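As a rough illustration of how dPCA builds CVs from internal coordinates, the sketch below maps each dihedral angle to its cosine and sine (which removes the 2π periodicity) and then runs an ordinary PCA on those features. It assumes the angles are supplied in radians as an array of shape (n_frames, n_dihedrals); the function name `dihedral_pca` is hypothetical, and the coupling to metadynamics is not shown.

```python
import numpy as np

def dihedral_pca(angles, n_components=2):
    """PCA on (cos, sin)-transformed dihedral angles.

    Returns the projections of each frame onto the leading components
    (candidate CVs) and the corresponding eigenvalues, largest first.
    """
    angles = np.asarray(angles, dtype=float)
    # Each angle becomes two features, so periodic images coincide.
    X = np.concatenate([np.cos(angles), np.sin(angles)], axis=1)
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    return Xc @ evecs[:, order], evals[order]
```

The eigenvalue spectrum returned here is one natural place to look when deciding how many CVs to keep, although the specific criterion used in the paper is not reproduced.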
Robust Principal Component Analysis for Compositional Tables
A data table which is arranged according to two factors can often be
considered as a compositional table. An example is the number of unemployed
people, split according to gender and age classes. Analyzed as compositions,
the relevant information would consist of ratios between different cells of
such a table. This is particularly useful when analyzing several compositional
tables jointly, where the absolute numbers are in very different ranges, e.g.
if unemployment data are considered from different countries. Within the
framework of the logratio methodology, compositional tables can be decomposed
into independent and interactive parts, and orthonormal coordinates can be
assigned to these parts. However, these coordinates usually require some prior
knowledge about the data, and they are not easy to handle for exploring the
relationships between the given factors.
Here we propose a special choice of coordinates with a direct relation to
centered logratio (clr) coefficients, which are particularly useful for an
interpretation in terms of the original cells of the tables. With these
coordinates, robust principal component analysis (PCA) is performed for
dimension reduction, making it possible to investigate the relationships
between the factors. The link between orthonormal coordinates and clr
coefficients makes it possible to apply robust PCA, which would otherwise
suffer from the singularity of clr coefficients.
Comment: 20 pages, 2 figure
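The singularity mentioned above can be seen in a small sketch of clr coefficients for a two-factor table. It assumes the clr coefficients are the logarithms of the cells divided by the geometric mean of all cells, and that every cell is strictly positive; the function name `clr_table` and the example counts are hypothetical. The orthonormal coordinates proposed in the paper are not reproduced here.

```python
import numpy as np

def clr_table(table):
    """Centered logratio coefficients of a table with positive cells.

    Subtracting the mean log is the same as dividing every cell by the
    geometric mean of the whole table before taking logs.
    """
    logs = np.log(np.asarray(table, dtype=float))
    return logs - logs.mean()

# Hypothetical counts: rows could be gender, columns age classes.
example = np.array([[120.0, 340.0, 210.0],
                    [150.0, 300.0, 260.0]])
coeffs = clr_table(example)
# The coefficients always sum to zero, so their covariance matrix across
# many tables is singular; rescaling the table leaves them unchanged.
```

Because only ratios between cells matter, tables whose absolute numbers sit in very different ranges give comparable coefficients, which is the property the abstract relies on when analyzing several countries jointly.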
