Data compression and regression based on local principal curves.

Author: Einbeck J
Evers L
Fink A
Hinchliff K
Lausen B
Seidel W
Ultsch A
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Frequently the predictor space of a multivariate regression problem of the type y = m(x_1, …, x_p) + ε is intrinsically one-dimensional, or at least of far lower dimension than p. Usual modeling attempts such as the additive model y = m_1(x_1) + … + m_p(x_p) + ε, which try to reduce the complexity of the regression problem by making additional structural assumptions, are then inefficient as they ignore the inherent structure of the predictor space and involve complicated model and variable selection stages. In a fundamentally different approach, one may consider first approximating the predictor space by a (usually nonlinear) curve passing through it, and then regressing the response only against the one-dimensional projections onto this curve. This entails the reduction from a p- to a one-dimensional regression problem. As a tool for the compression of the predictor space we apply local principal curves. Taking things on from the results presented in Einbeck et al. (Classification – The Ubiquitous Challenge. Springer, Heidelberg, 2005, pp. 256–263), we show how local principal curves can be parametrized and how the projections are obtained. The regression step can then be carried out using any nonparametric smoother. We illustrate the technique using data from the physical sciences.
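The compress-then-regress idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian kernel, the bandwidth `h`, the step length `step`, the stopping tolerance, and the synthetic half-circle data are all assumptions chosen for illustration. The curve is traced by stepping along the first local principal direction and recording kernel-weighted local means, the data are projected onto the resulting polyline and parametrized by arc length, and a simple kernel smoother stands in for "any nonparametric smoother".

```python
import numpy as np

def local_mean_and_direction(X, x, h):
    """Gaussian-weighted local mean and first local principal direction at x."""
    w = np.exp(-0.5 * np.sum((X - x) ** 2, axis=1) / h**2)
    mu = w @ X / w.sum()
    cov = (X - mu).T @ (w[:, None] * (X - mu)) / w.sum()
    return mu, np.linalg.eigh(cov)[1][:, -1]  # eigh sorts eigenvalues ascending

def trace_branch(X, x0, h, step, sign, max_iter=200, tol=1e-3):
    """Follow local means from x0 in one direction until they stop moving."""
    x, prev, pts = x0.copy(), None, []
    for _ in range(max_iter):
        mu, d = local_mean_and_direction(X, x, h)
        if prev is None:
            d = sign * d
        elif d @ prev < 0:
            d = -d  # keep a consistent orientation along the curve
        if pts and np.linalg.norm(mu - pts[-1]) < tol:
            break
        pts.append(mu)
        x, prev = mu + step * d, d
    return np.array(pts)

def local_principal_curve(X, x0, h, step):
    """Join the backward and forward branches into one polyline."""
    fwd = trace_branch(X, x0, h, step, +1.0)
    bwd = trace_branch(X, x0, h, step, -1.0)
    return np.vstack([bwd[::-1], fwd[1:]])

def project(X, curve):
    """Arc-length position of each point's nearest foot on the polyline."""
    seg = np.diff(curve, axis=0)
    seg_len = np.linalg.norm(seg, axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])
    rel = X[:, None, :] - curve[None, :-1, :]
    frac = np.einsum("nmp,mp->nm", rel, seg) / np.maximum(seg_len**2, 1e-12)
    frac = np.clip(frac, 0.0, 1.0)
    foot = curve[None, :-1, :] + frac[..., None] * seg[None]
    k = np.linalg.norm(X[:, None, :] - foot, axis=2).argmin(axis=1)
    return cum[k] + frac[np.arange(len(X)), k] * seg_len[k]

def kernel_smooth(t_train, y_train, t_new, bw):
    """Nadaraya-Watson smoother with a Gaussian kernel."""
    w = np.exp(-0.5 * ((t_new[:, None] - t_train[None, :]) / bw) ** 2)
    return (w @ y_train) / w.sum(axis=1)

# Synthetic example (assumed data): predictors near a half circle in 2-D,
# response depending only on the position along that curve.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, np.pi, 300)
X = np.column_stack([np.cos(theta), np.sin(theta)]) + rng.normal(0, 0.03, (300, 2))
y = np.sin(2 * theta) + rng.normal(0, 0.05, 300)

x0 = X[np.argmin(np.linalg.norm(X - X.mean(axis=0), axis=1))]  # starting point
curve = local_principal_curve(X, x0, h=0.15, step=0.1)
t = project(X, curve)                      # one-dimensional compressed predictor
y_hat = kernel_smooth(t, y, t, bw=0.1)     # regression on the projections
rmse = np.sqrt(np.mean((y - y_hat) ** 2))
print(f"|corr(t, theta)| = {abs(np.corrcoef(t, theta)[0, 1]):.3f}, RMSE = {rmse:.3f}")
```

In this sketch the orientation of the curve is arbitrary, so the recovered parameter `t` may run in the opposite direction to `theta`; only its absolute correlation with `theta` is meaningful.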

Durham Research Online

Enlighten

Principal Boundary on Riemannian Manifolds

Author: Yao Zhigang
Zhang Zhenyue
Publication date: 30/03/2019

We consider the classification problem and focus on nonlinear methods for classification on manifolds. For multivariate datasets lying on an embedded nonlinear Riemannian manifold within the higher-dimensional ambient space, we aim to acquire a classification boundary for the classes with labels, using the intrinsic metric on the manifolds. Motivated by finding an optimal boundary between the two classes, we invent a novel approach -- the principal boundary. From the perspective of classification, the principal boundary is defined as an optimal curve that moves in between the principal flows traced out from two classes of data, and at any point on the boundary, it maximizes the margin between the two classes. We estimate the boundary in quality with its direction, supervised by the two principal flows. We show that the principal boundary yields the usual decision boundary found by the support vector machine in the sense that locally, the two boundaries coincide. Some optimality and convergence properties of the random principal boundary and its population counterpart are also shown. We illustrate how to find, use and interpret the principal boundary with an application in real data. Comment: 31 pages, 10 figures

arXiv.org e-Print Archive

ScholarBank@NUS

FigShare

Representing complex data using localized principal components with application to astronomical data

Author: A Gersho
A Gorban
AH Monaghan
AR Webb
B Chalmond
B Kégl
C Allende Prieto
CAL Bailer-Jones
DJ Marchette
E Diday
E Oja
EC Malthouse
EM Braverman
FL Hall
H Hotelling
H Späth
H Wold
IT Jolliffe
J Einbeck
JH Friedman
JJ Verbeek
JM Chambers
K Fukunaga
K Hornik
L Breiman
MAC Perryman
MG Kendall
N Kambhatla
P Delicado
PG Willemsen
R Tibshirani
RJ Bolton
S de Jong
T Aluja-Banet
T Duchamps
T Hastie
WS Cleveland
Z-Y Liu
Publication date: 01/01/2007

Often the relation between the variables constituting a multivariate data space might be characterized by one or more of the terms: "nonlinear", "branched", "disconnected", "bended", "curved", "heterogeneous", or, more generally, "complex". In these cases, simple principal component analysis (PCA) as a tool for dimension reduction can fail badly. Of the many alternative approaches proposed so far, local approximations of PCA are among the most promising. This paper will give a short review of localized versions of PCA, focusing on local principal curves and local partitioning algorithms. Furthermore we discuss projections other than the local principal components. When performing local dimension reduction for regression or classification problems it is important to focus not only on the manifold structure of the covariates, but also on the response variable(s). Local principal components only achieve the former, whereas localized regression approaches concentrate on the latter. Local projection directions derived from the partial least squares (PLS) algorithm offer an interesting trade-off between these two objectives. We apply these methods to several real data sets. In particular, we consider simulated astrophysical data from the future Galactic survey mission Gaia. Comment: 25 pages. In "Principal Manifolds for Data Visualization and Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds), Lecture Notes in Computational Science and Engineering, Springer, 2007, pp. 180–204, http://www.springer.com/dal/home/generic/search/results?SGWID=1-40109-22-173750210-
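The claim that global PCA can fail on curved data while localized PCA copes better can be illustrated with a small sketch of one local partitioning scheme. It is an assumption-laden illustration, not the method reviewed in the paper: the partition comes from a plain k-means loop written here for self-containment, the number of clusters `k=6` and the noisy helix data are arbitrary choices, and all function names are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iterations; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave an empty cluster's centre alone
                centres[j] = X[labels == j].mean(axis=0)
    return labels

def pca_reconstruct(X, q):
    """Project X onto its first q principal components and map back."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:q].T
    return mu + (X - mu) @ V @ V.T

def local_pca_reconstruct(X, labels, q):
    """Run a separate q-component PCA inside each cluster of the partition."""
    X_hat = np.empty_like(X)
    for j in np.unique(labels):
        mask = labels == j
        X_hat[mask] = pca_reconstruct(X[mask], q)
    return X_hat

# Assumed synthetic data: a noisy helix segment in three dimensions.
rng = np.random.default_rng(1)
s = rng.uniform(0.0, 2 * np.pi, 400)
X = np.column_stack([s, np.sin(s), np.cos(s)]) + rng.normal(0, 0.05, (400, 3))

labels = kmeans(X, k=6)
err_global = np.mean(np.sum((X - pca_reconstruct(X, 1)) ** 2, axis=1))
err_local = np.mean(np.sum((X - local_pca_reconstruct(X, labels, 1)) ** 2, axis=1))
print(f"mean squared reconstruction error: global {err_global:.4f}, local {err_local:.4f}")
```

For a fixed partition, the per-cluster fit can never reconstruct worse than the single global line, because each cluster's PCA minimizes squared error over all lines for its own points; how much it improves depends on the curvature of the data and the choice of `k`.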

arXiv.org e-Print Archive

Durham Research Online

Crossref

Enlighten

Explore Bristol Research

Force dipoles and stable local defects on fluid vesicles

Author: A. Gray
A. M. Polyakov
A. T. Fomenko
Jemal Guven
Pablo Vázquez-Montejo
T. J. Willmore
W. Helfrich
Publication venue: 'American Physical Society (APS)'
Publication date: 16/04/2013

An exact description is provided of an almost spherical fluid vesicle with a fixed area and a fixed enclosed volume locally deformed by external normal forces bringing two nearby points on the surface together symmetrically. The conformal invariance of the two-dimensional bending energy is used to identify the distribution of energy as well as the stress established in the vesicle. While these states are local minima of the energy, this energy is degenerate; there is a zero mode in the energy fluctuation spectrum, associated with area and volume preserving conformal transformations, which breaks the symmetry between the two points. The volume constraint fixes the distance S, measured along the surface, between the two points; if it is relaxed, a second zero mode appears, reflecting the independence of the energy on S; in the absence of this constraint a pathway opens for the membrane to slip out of the defect. Logarithmic curvature singularities in the surface geometry at the points of contact signal the presence of external forces. The magnitude of these forces varies inversely with S and so diverges as the points merge; the corresponding torques vanish in these defects. The geometry behaves near each of the singularities as a biharmonic monopole, in the region between them as a surface of constant mean curvature, and in distant regions as a biharmonic quadrupole. Comparison of the distribution of stress with the quadratic approximation in the height functions points to shortcomings of the latter representation. Radial tension is accompanied by lateral compression, both near the singularities and far away, with a crossover from tension to compression occurring in the region between them. Comment: 26 pages, 10 figures

arXiv.org e-Print Archive

Crossref