31,986 research outputs found
Cauchy robust principal component analysis with applications to high-dimensional data sets
Principal component analysis (PCA) is a standard dimensionality reduction
technique used in various research and applied fields. From an algorithmic
point of view, classical PCA can be formulated in terms of operations on a
multivariate Gaussian likelihood. As a consequence of the implied Gaussian
formulation, the principal components are not robust to outliers. In this
paper, we propose a modified formulation, based on the use of a multivariate
Cauchy likelihood instead of the Gaussian likelihood, which has the effect of
robustifying the principal components. We present an algorithm to compute these
robustified principal components. We additionally derive the relevant influence
function of the first component and examine its theoretical properties.
Simulation experiments on high-dimensional datasets demonstrate that the
estimated principal components based on the Cauchy likelihood outperform or are
on par with existing robust PCA techniques.
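The sensitivity of classical PCA to outliers, which motivates the abstract above, can be seen in a small experiment. The sketch below is not the Cauchy-likelihood algorithm from the paper; it only computes the ordinary first principal component via SVD on hypothetical toy data, with and without one gross outlier. The helper name `first_pc` and all data values are assumptions made for illustration.

```python
import numpy as np

def first_pc(X):
    """Leading principal direction of X (rows are observations) via SVD."""
    Xc = X - X.mean(axis=0)                      # centre each column
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]                                 # unit vector, sign is arbitrary

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 points stretched along the x-axis.
X = rng.normal(size=(200, 2)) * np.array([5.0, 0.5])
# The same data plus a single gross outlier far along the y-axis.
X_out = np.vstack([X, [0.0, 500.0]])

clean = first_pc(X)
contaminated = first_pc(X_out)
# Absolute cosine between each estimate and the true direction (1, 0).
print(abs(clean[0]), abs(contaminated[0]))
```

On the clean data the estimated direction stays close to the x-axis, while the single outlier is enough to rotate it almost onto the y-axis.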
Toroidal PCA via density ridges
Principal Component Analysis (PCA) is a well-known linear dimension-reduction
technique designed for Euclidean data. In a wide spectrum of applied fields,
however, it is common to observe multivariate circular data (also known as
toroidal data), rendering spurious the use of PCA on it due to the periodicity
of its support. This paper introduces Toroidal Ridge PCA (TR-PCA), a novel
construction of PCA for bivariate circular data that leverages the concept of
density ridges as a flexible first principal component analog. Two reference
bivariate circular distributions, the bivariate sine von Mises and the
bivariate wrapped Cauchy, are employed as the parametric distributional basis
of TR-PCA. Efficient algorithms are presented to compute density ridges for
these two distribution models. A complete PCA methodology adapted to toroidal
data (including scores, variance decomposition, and resolution of edge cases)
is introduced and implemented in the companion R package ridgetorus. The
usefulness of TR-PCA is showcased with a novel case study involving the
analysis of ocean currents on the coast of Santa Barbara.
Comment: 20 pages, 8 figures, 1 table
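The periodicity problem mentioned in the abstract above already shows up for a plain average of angles. The following minimal sketch is unrelated to the TR-PCA algorithm or the ridgetorus package; it only contrasts the arithmetic mean with the standard circular mean (the direction of the averaged unit vectors). The function name `circular_mean` and the sample angles are assumptions chosen for illustration.

```python
import math

def circular_mean(angles):
    """Mean direction of angles given in radians, returned in [-pi, pi]."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)

# Two hypothetical directions on either side of the cut at +/- pi.
angles = [math.pi - 0.1, -math.pi + 0.1]

naive = sum(angles) / len(angles)   # arithmetic mean, lands near 0
circ = circular_mean(angles)        # lands near +/- pi, between the two angles
print(naive, circ)
```

Both observations point almost due "west", yet the arithmetic mean points "east"; the circular mean respects the wrap-around of the support.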
Detecting and handling outlying trajectories in irregularly sampled functional datasets
Outlying curves often occur in functional or longitudinal datasets, and can
be very influential on parameter estimators and very hard to detect visually.
In this article we introduce estimators of the mean and the principal
components that are resistant to, and then can be used for detection of,
outlying sample trajectories. The estimators are based on reduced-rank t-models
and are specifically aimed at sparse and irregularly sampled functional data.
The outlier-resistance properties of the estimators and their relative
efficiency for noncontaminated data are studied theoretically and by
simulation. Applications to the analysis of Internet traffic data and glycated
hemoglobin levels in diabetic children are presented.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS257 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
The Invalidity of the Laplace Law for Biological Vessels and of Estimating Elastic Modulus from Total Stress vs. Strain: a New Practical Method
The quantification of the stiffness of tubular biological structures is often
obtained, both in vivo and in vitro, as the slope of total transmural hoop
stress plotted against hoop strain. Total hoop stress is typically estimated
using the "Laplace law." We show that this procedure is fundamentally flawed
for two reasons: Firstly, the Laplace law predicts total stress incorrectly for
biological vessels. Furthermore, because muscle and other biological tissue are
closely volume-preserving, quantifications of elastic modulus require the
removal of the contribution to total stress from incompressibility. We show
that this hydrostatic contribution to total stress has a strong
material-dependent nonlinear response to deformation that is difficult to
predict or measure. To address this difficulty, we propose a new practical
method to estimate a mechanically viable modulus of elasticity that can be
applied both in vivo and in vitro using the same measurements as current
methods, with care taken to record the reference state. To be insensitive to
incompressibility, our method is based on shear stress rather than hoop stress,
and provides a true measure of the elastic response without application of the
Laplace law. We demonstrate the accuracy of our method using a mathematical
model of tube inflation with multiple constitutive models. We also re-analyze
an in vivo study from the gastro-intestinal literature that applied the
standard approach and concluded that a drug-induced change in elastic modulus
depended on the protocol used to distend the esophageal lumen. Our new method
removes this protocol-dependent inconsistency in the previous result.
Comment: 34 pages, 13 figures
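For orientation, the conventional calculation that the abstract above criticizes can be sketched as follows. This is the standard thin-walled form of the Laplace law (hoop stress = pressure × radius / wall thickness) together with a simple hoop strain relative to a reference radius; it is not the shear-based method the authors propose. The function names, the choice of radius convention, and the numerical values are all assumptions for illustration.

```python
def laplace_hoop_stress(pressure, radius, wall_thickness):
    """Thin-walled Laplace-law estimate of total hoop stress (same units as pressure)."""
    return pressure * radius / wall_thickness

def hoop_strain(radius, reference_radius):
    """Engineering hoop strain relative to the reference (unloaded) radius."""
    return (radius - reference_radius) / reference_radius

# Hypothetical measurement: 2 kPa transmural pressure, radius 11 mm,
# wall thickness 2 mm, reference radius 10 mm (SI units throughout).
stress = laplace_hoop_stress(2000.0, 0.011, 0.002)   # about 11 kPa
strain = hoop_strain(0.011, 0.010)                   # about 0.1
print(stress, strain)
```

In the conventional approach the slope of such stress values against strain is reported as the stiffness; the abstract argues that this slope is not a reliable elastic modulus for incompressible biological tissue.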
Fast and accurate con-eigenvalue algorithm for optimal rational approximations
The need to compute small con-eigenvalues and the associated con-eigenvectors
of positive-definite Cauchy matrices naturally arises when constructing
rational approximations with a (near) optimally small error.
Specifically, given a rational function with $n$ poles in the unit disk, a
rational approximation with $m \ll n$ poles in the unit disk may be obtained
from the $m$th con-eigenvector of an $n \times n$ Cauchy matrix, where the
associated con-eigenvalue $\lambda_m > 0$ gives the approximation error in the
$L^{\infty}$ norm. Unfortunately, standard algorithms do not accurately compute
small con-eigenvalues (and the associated con-eigenvectors) and, in particular,
yield few or no correct digits for con-eigenvalues smaller than the machine
roundoff. We develop a fast and accurate algorithm for computing
con-eigenvalues and con-eigenvectors of positive-definite Cauchy matrices,
yielding even the tiniest con-eigenvalues with high relative accuracy. The
algorithm computes the $m$th con-eigenvalue in $\mathcal{O}(m^{2} n)$ operations
and, since the con-eigenvalues of positive-definite Cauchy matrices decay
exponentially fast, we obtain (near) optimal rational approximations in
$\mathcal{O}(n (\log \delta^{-1})^{2})$ operations, where $\delta$ is the
approximation error in the $L^{\infty}$ norm. We derive error bounds
demonstrating high relative accuracy of the computed con-eigenvalues and the
high accuracy of the unit con-eigenvectors. We also provide examples of using
the algorithm to compute (near) optimal rational approximations of functions
with singularities and sharp transitions, where approximation errors close to
machine precision are obtained. Finally, we present numerical tests on random
(complex-valued) Cauchy matrices to show that the algorithm computes all the
con-eigenvalues and con-eigenvectors with nearly full precision.
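For readers unfamiliar with the term, the usual definition of a con-eigenpair is sketched below; the abstract above does not state it, so the notation ($C$ for the matrix, $u$ for the con-eigenvector, an overline for complex conjugation) is an assumption. Conjugating the defining relation with real $\lambda$ gives $\overline{C}\,u = \lambda\,\overline{u}$, which links con-eigenvalues to ordinary eigenvalues of $C\,\overline{C}$:

```latex
C\,\overline{u} = \lambda\,u, \qquad \lambda \ge 0,\; u \neq 0
\quad\Longrightarrow\quad
C\,\overline{C}\,u = \lambda\,C\,\overline{u} = \lambda^{2}\,u .
```

Computing $\lambda$ as the square root of an eigenvalue of $C\,\overline{C}$ is one generic route, and it illustrates why small con-eigenvalues are delicate: squaring pushes them further below the roundoff level of standard eigenvalue solvers.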