59 research outputs found

    Local Principal Curves

    Principal components are a well-established tool in dimension reduction. The extension to principal curves allows for general smooth curves which pass through the middle of a p-dimensional data cloud. In this paper local principal curves are introduced, which are based on the localization of principal component analysis. The proposed algorithm is able to identify closed curves as well as multiple curves which may or may not be connected. To evaluate the performance of the data reduction obtained by principal curves, a measure of coverage is suggested. The selection of tuning parameters is considered explicitly, yielding an algorithm which is easy to apply. Using simulated and real data sets, the approach is compared to various alternative concepts of principal curves.
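
    The abstract does not spell the coverage measure out; a minimal sketch follows, under the assumption that coverage means the proportion of data points lying within a tube of radius tau around the fitted curve (represented as a dense sequence of points). Function and argument names are illustrative, not the paper's.

        import numpy as np

        def coverage(X, curve, tau):
            """Proportion of data points within distance tau of the fitted curve.

            X     : (n, p) data cloud
            curve : (m, p) dense sequence of points representing the curve
            tau   : tube radius
            """
            # distance from every data point to its nearest curve point
            d = np.min(np.linalg.norm(X[:, None, :] - curve[None, :, :], axis=2), axis=1)
            return np.mean(d <= tau)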

    Exploring multivariate data structures with local principal curves.

    A new approach to find the underlying structure of a multidimensional data cloud is proposed, which is based on a localized version of principal component analysis. More specifically, we calculate a series of local centers of mass and move through the data in directions given by the first local principal axis. One obtains a smooth "local principal curve" passing through the "middle" of a multivariate data cloud. The concept adapts to branched curves by considering the second local principal axis. Since the algorithm is based on a simple eigendecomposition, computation is fast and easy.
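
    The core iteration the abstract describes (kernel-weighted local mean, eigendecomposition of a local covariance matrix, step along the first eigenvector) can be sketched in a few lines of NumPy. The bandwidth h, step length t, and starting point x0 are illustrative defaults, not the authors' exact settings.

        import numpy as np

        def local_principal_curve(X, x0, h=0.5, t=0.1, steps=200):
            """Trace a local principal curve through the (n, p) data cloud X."""
            x = np.asarray(x0, dtype=float)
            prev_dir = None
            path = [x.copy()]
            for _ in range(steps):
                # Gaussian kernel weights centred at the current point
                w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * h ** 2))
                w /= w.sum()
                mu = w @ X                                   # local centre of mass
                C = (w[:, None] * (X - mu)).T @ (X - mu)     # local covariance matrix
                gamma = np.linalg.eigh(C)[1][:, -1]          # first local principal axis
                if prev_dir is not None and gamma @ prev_dir < 0:
                    gamma = -gamma                           # keep a consistent travel direction
                x = mu + t * gamma                           # step along the axis
                prev_dir = gamma
                path.append(x.copy())
            return np.array(path)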

    Data compression and regression based on local principal curves.

    Frequently the predictor space of a multivariate regression problem of the type y = m(x_1, …, x_p) + ε is intrinsically one-dimensional, or at least of far lower dimension than p. Usual modeling attempts such as the additive model y = m_1(x_1) + … + m_p(x_p) + ε, which try to reduce the complexity of the regression problem by making additional structural assumptions, are then inefficient as they ignore the inherent structure of the predictor space and involve complicated model and variable selection stages. In a fundamentally different approach, one may consider first approximating the predictor space by a (usually nonlinear) curve passing through it, and then regressing the response only against the one-dimensional projections onto this curve. This entails the reduction from a p- to a one-dimensional regression problem. As a tool for the compression of the predictor space we apply local principal curves. Building on the results presented in Einbeck et al. (Classification – The Ubiquitous Challenge. Springer, Heidelberg, 2005, pp. 256–263), we show how local principal curves can be parametrized and how the projections are obtained. The regression step can then be carried out using any nonparametric smoother. We illustrate the technique using data from the physical sciences.
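
    A minimal sketch of the two-stage idea, assuming the fitted curve is available as a dense sequence of points carrying a cumulative arc-length parametrisation; a Nadaraya-Watson smoother stands in here for "any nonparametric smoother", and all names are illustrative.

        import numpy as np

        def project_to_curve(X, curve):
            """Map each row of X to the arc-length parameter of its nearest curve point."""
            arc = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(curve, axis=0), axis=1))]
            nearest = np.argmin(np.linalg.norm(X[:, None, :] - curve[None, :, :], axis=2), axis=1)
            return arc[nearest]

        def nw_smoother(t_train, y_train, t_new, h=0.3):
            """Nadaraya-Watson kernel regression of y on the 1-d projections t."""
            K = np.exp(-((t_new[:, None] - t_train[None, :]) ** 2) / (2 * h ** 2))
            return (K @ y_train) / K.sum(axis=1)

    The response y is then regressed against t = project_to_curve(X, curve) instead of against X itself, turning the p-dimensional problem into a one-dimensional one.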

    Implementation of a local principal curves algorithm for neutrino interaction reconstruction in a liquid argon volume

    A local principal curve algorithm has been implemented in three dimensions for automated track and shower reconstruction of neutrino interactions in a liquid argon time projection chamber. We present details of the algorithm and characterise its performance on simulated data sets.

    Data compression and regression through local principal curves and surfaces

    We consider principal curves and surfaces in the context of multivariate regression modelling. For predictor spaces featuring complex dependency patterns between the involved variables, the intrinsic dimensionality of the data tends to be very small due to the high redundancy induced by the dependencies. In situations of this type, it is useful to approximate the high-dimensional predictor space through a low-dimensional manifold (i.e., a curve or a surface), and use the projections onto the manifold as compressed predictors in the regression problem. In the case that the intrinsic dimensionality of the predictor space equals one, we use the local principal curve algorithm for the compression step. We provide a novel algorithm which extends this idea to local principal surfaces, thus covering cases of an intrinsic dimensionality equal to two, which is in principle extendible to manifolds of arbitrary dimension. We motivate and apply the novel techniques using astrophysical and oceanographic data examples.
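
    One hedged way to picture the surface extension (the paper's actual mesh-growing scheme may differ): at each point, the first two eigenvectors of the kernel-weighted local covariance matrix span a local tangent plane, from which neighbouring mesh points can be generated by stepping along both axes.

        import numpy as np

        def local_tangent_plane(X, x, h=0.5):
            """Kernel-weighted local PCA at x; the two leading eigenvectors
            span the local tangent plane of a principal surface."""
            w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * h ** 2))
            w /= w.sum()
            mu = w @ X
            C = (w[:, None] * (X - mu)).T @ (X - mu)
            vecs = np.linalg.eigh(C)[1]
            return mu, vecs[:, -1], vecs[:, -2]   # local mean and the two leading axes

    A surface mesh could then be grown by stepping from mu along ±t·axis1 and ±t·axis2, analogously to the one-dimensional curve algorithm.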

    Curve Estimation Based on Localised Principal Components - Theory and Applications

    In this work, basic theory and some proposed developments to localised principal components and curves are introduced, and some areas of application for local principal curves are explored. Only relatively recently have localised principal components utilising kernel-type weights found their way into the statistical literature. In this study, the asymptotic behaviour of the method is investigated and extended to the context of local principal curves, where the characteristics of the points at which the curve stops at the edges are identified. This is used to develop a method that lets the curve 'delay' convergence if desired, gaining more access to boundary regions of the data. A method for automatically choosing the starting point as one of the local modes within the data cloud is also proposed. The modified local principal curve algorithm is then used for fitting multi-dimensional econometric data. Special attention is given to the role of the curve parametrisation, which serves as a feature extractor and also as a prediction tool when properly linked to time as a probable underlying latent variable. Local principal curves provide a good dimensionality-reduction and feature-extraction tool for insurance industry key indicators and consumer price indices. Also, by 'calibrating' the parametrisation with time, it is used to predict unemployment and inflation rates.
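
    A local mode of the data cloud can be located by a mean-shift iteration with a Gaussian kernel, as sketched below; this is a generic mode-finding routine under that assumption, not necessarily the thesis's exact procedure.

        import numpy as np

        def local_mode(X, x0, h=0.5, tol=1e-6, max_iter=500):
            """Mean-shift: repeatedly replace x with the kernel-weighted mean
            of the data until it converges to a local mode of the density."""
            x = np.asarray(x0, dtype=float)
            for _ in range(max_iter):
                w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * h ** 2))
                x_new = (w @ X) / w.sum()
                if np.linalg.norm(x_new - x) < tol:
                    break
                x = x_new
            return x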

    Representing complex data using localized principal components with application to astronomical data

    Often the relation between the variables constituting a multivariate data space might be characterized by one or more of the terms "nonlinear", "branched", "disconnected", "bent", "curved", "heterogeneous", or, more generally, "complex". In these cases, simple principal component analysis (PCA) as a tool for dimension reduction can fail badly. Of the many alternative approaches proposed so far, local approximations of PCA are among the most promising. This paper gives a short review of localized versions of PCA, focusing on local principal curves and local partitioning algorithms. Furthermore, we discuss projections other than the local principal components. When performing local dimension reduction for regression or classification problems it is important to focus not only on the manifold structure of the covariates, but also on the response variable(s). Local principal components only achieve the former, whereas localized regression approaches concentrate on the latter. Local projection directions derived from the partial least squares (PLS) algorithm offer an interesting trade-off between these two objectives. We apply these methods to several real data sets; in particular, we consider simulated astrophysical data from the future Galactic survey mission Gaia. (In "Principal Manifolds for Data Visualization and Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds), Lecture Notes in Computational Science and Engineering, Springer, 2007, pp. 180–204.)
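
    The PCA-versus-PLS trade-off can be made concrete with a small sketch: the first local principal component ignores the response, while the first PLS weight vector points along the local covariance between predictors and response. This is a generic kernel-weighted comparison, not the paper's exact construction.

        import numpy as np

        def local_directions(X, y, x, h=0.5):
            """First local principal component (unsupervised) versus first local
            PLS direction (supervised), both kernel-weighted around x."""
            w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * h ** 2))
            w /= w.sum()
            Xc = X - w @ X                        # locally centred predictors
            yc = y - w @ y                        # locally centred response
            C = (w[:, None] * Xc).T @ Xc
            pca_dir = np.linalg.eigh(C)[1][:, -1]  # ignores the response entirely
            pls_dir = (w * yc) @ Xc                # first PLS weights: local cov(X, y)
            pls_dir /= np.linalg.norm(pls_dir)
            return pca_dir, pls_dir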