Data compression and regression based on local principal curves.
Frequently the predictor space of a multivariate regression problem of the type y = m(x_1, …, x_p ) + ε is intrinsically one-dimensional, or at least of far lower dimension than p. Usual modeling attempts such as the additive model y = m_1(x_1) + … + m_p (x_p ) + ε, which try to reduce the complexity of the regression problem by making additional structural assumptions, are then inefficient as they ignore the inherent structure of the predictor space and involve complicated model and variable selection stages. In a fundamentally different approach, one may consider first approximating the predictor space by a (usually nonlinear) curve passing through it, and then regressing the response only against the one-dimensional projections onto this curve. This entails the reduction from a p- to a one-dimensional regression problem.
As a tool for the compression of the predictor space we apply local principal curves. Building on the results presented in Einbeck et al. (Classification – The Ubiquitous Challenge. Springer, Heidelberg, 2005, pp. 256–263), we show how local principal curves can be parametrized and how the projections are obtained. The regression step can then be carried out using any nonparametric smoother. We illustrate the technique using data from the physical sciences.
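The two-step idea described above (compress the predictors to one dimension, then smooth the response against the projections) can be sketched as follows. This is only an illustration under loud assumptions: the first global principal component stands in for a local principal curve, which is not the method of the paper, and a Gaussian Nadaraya-Watson smoother is used as one possible nonparametric smoother. The function names `first_pc_scores` and `kernel_smooth`, the simulated data, and the bandwidth are all hypothetical.

```python
import math
import random


def first_pc_scores(X, iters=200):
    """Scores of the rows of X on the first principal direction.

    ASSUMPTION: a straight line (first principal component, found by power
    iteration) replaces the local principal curve used in the paper.
    """
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - mean[j] for j in range(p)] for row in X]
    # Sample covariance matrix (p x p).
    C = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(p)]
         for a in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # One-dimensional projection of every centred observation.
    return [sum(r[j] * v[j] for j in range(p)) for r in Xc]


def kernel_smooth(t_train, y_train, t0, bandwidth):
    """Nadaraya-Watson estimate of the response at t0 (Gaussian kernel)."""
    weights = [math.exp(-0.5 * ((t0 - t) / bandwidth) ** 2) for t in t_train]
    return sum(w * y for w, y in zip(weights, y_train)) / sum(weights)


# Simulated data: three predictors that all depend on one latent parameter s.
random.seed(0)
n = 200
s = [random.uniform(-1.0, 1.0) for _ in range(n)]
X = [[si + random.gauss(0, 0.01),
      2 * si + random.gauss(0, 0.01),
      -si + random.gauss(0, 0.01)] for si in s]
y = [math.sin(3 * si) + random.gauss(0, 0.05) for si in s]

# Step 1: reduce the p-dimensional predictor to a one-dimensional projection.
t = first_pc_scores(X)
# Step 2: regress the response on the projection only.
y_hat = [kernel_smooth(t, y, ti, bandwidth=0.1) for ti in t]

rmse = math.sqrt(
    sum((yh - math.sin(3 * si)) ** 2 for yh, si in zip(y_hat, s)) / n
)
print(f"RMSE against the noise-free response: {rmse:.3f}")
```

With a genuinely curved predictor space the straight-line projection would be replaced by the position along the fitted curve. The regression step would stay the same.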
Principal Boundary on Riemannian Manifolds
We consider the classification problem and focus on nonlinear methods for
classification on manifolds. For multivariate datasets lying on an embedded
nonlinear Riemannian manifold within the higher-dimensional ambient space, we
aim to acquire a classification boundary for the classes with labels, using the
intrinsic metric on the manifolds. Motivated by finding an optimal boundary
between the two classes, we invent a novel approach -- the principal boundary.
From the perspective of classification, the principal boundary is defined as an
optimal curve that moves in between the principal flows traced out from two
classes of data, and at any point on the boundary, it maximizes the margin
between the two classes. We estimate the boundary in quality with its
direction, supervised by the two principal flows. We show that the principal
boundary yields the usual decision boundary found by the support vector machine
in the sense that locally, the two boundaries coincide. Some optimality and
convergence properties of the random principal boundary and its population
counterpart are also shown. We illustrate how to find, use and interpret the
principal boundary with an application to real data.
Comment: 31 pages, 10 figures
Representing complex data using localized principal components with application to astronomical data
Often the relation between the variables constituting a multivariate data
space might be characterized by one or more of the terms "nonlinear",
"branched", "disconnected", "bended", "curved", "heterogeneous", or,
more generally, "complex". In these cases, simple principal component analysis
(PCA) as a tool for dimension reduction can fail badly. Of the many alternative
approaches proposed so far, local approximations of PCA are among the most
promising. This paper will give a short review of localized versions of PCA,
focusing on local principal curves and local partitioning algorithms.
Furthermore we discuss projections other than the local principal components.
When performing local dimension reduction for regression or classification
problems it is important to focus not only on the manifold structure of the
covariates, but also on the response variable(s). Local principal components
only achieve the former, whereas localized regression approaches concentrate on
the latter. Local projection directions derived from the partial least squares
(PLS) algorithm offer an interesting trade-off between these two objectives. We
apply these methods to several real data sets. In particular, we consider
simulated astrophysical data from the future Galactic survey mission Gaia.
Comment: 25 pages. In "Principal Manifolds for Data Visualization and
Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds),
Lecture Notes in Computational Science and Engineering, Springer, 2007, pp.
180--204,
http://www.springer.com/dal/home/generic/search/results?SGWID=1-40109-22-173750210-
Force dipoles and stable local defects on fluid vesicles
An exact description is provided of an almost spherical fluid vesicle with a
fixed area and a fixed enclosed volume locally deformed by external normal
forces bringing two nearby points on the surface together symmetrically. The
conformal invariance of the two-dimensional bending energy is used to identify
the distribution of energy as well as the stress established in the vesicle.
While these states are local minima of the energy, this energy is degenerate;
there is a zero mode in the energy fluctuation spectrum, associated with area
and volume preserving conformal transformations, which breaks the symmetry
between the two points. The volume constraint fixes the distance, measured
along the surface, between the two points; if it is relaxed, a second zero mode
appears, reflecting the independence of the energy on this distance; in the absence of
this constraint a pathway opens for the membrane to slip out of the defect.
Logarithmic curvature singularities in the surface geometry at the points of
contact signal the presence of external forces. The magnitude of these forces
varies inversely with the distance between the points and so diverges as the points merge; the
corresponding torques vanish in these defects. The geometry behaves near each
of the singularities as a biharmonic monopole, in the region between them as a
surface of constant mean curvature, and in distant regions as a biharmonic
quadrupole. Comparison of the distribution of stress with the quadratic
approximation in the height functions points to shortcomings of the latter
representation. Radial tension is accompanied by lateral compression, both near
the singularities and far away, with a crossover from tension to compression
occurring in the region between them.
Comment: 26 pages, 10 figures