
    Principal Boundary on Riemannian Manifolds

    We consider the classification problem, focusing on nonlinear methods for classification on manifolds. For multivariate datasets lying on an embedded nonlinear Riemannian manifold within a higher-dimensional ambient space, we aim to acquire a classification boundary between the labeled classes, using the intrinsic metric on the manifold. Motivated by the search for an optimal boundary between the two classes, we introduce a novel approach: the principal boundary. From the perspective of classification, the principal boundary is defined as an optimal curve that moves between the principal flows traced out from the two classes of data and, at any point on the boundary, maximizes the margin between the two classes. We estimate the boundary and its direction, supervised by the two principal flows. We show that the principal boundary agrees with the usual decision boundary found by the support vector machine in the sense that, locally, the two boundaries coincide. Some optimality and convergence properties of the random principal boundary and its population counterpart are also shown. We illustrate how to find, use, and interpret the principal boundary with an application to real data. Comment: 31 pages, 10 figures
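    The margin-maximizing construction described in this abstract can be illustrated with a deliberately simplified Euclidean sketch (all names below are hypothetical, and this is not the paper's method, which works with principal flows under the intrinsic manifold metric): treat each flow as a discretized polyline and take, as a crude stand-in for the principal boundary, the midpoints between nearest pairs of points on the two flows.

```python
import numpy as np

def midpoint_boundary(flow_a, flow_b):
    """For each point on flow_a, pair it with its nearest point on flow_b
    and return the midpoints (a crude stand-in for a boundary curve)
    together with the margins (half the pairwise distances)."""
    # pairwise distances between the two discretized flows
    d = np.linalg.norm(flow_a[:, None, :] - flow_b[None, :, :], axis=2)
    j = d.argmin(axis=1)                      # nearest partner on flow_b
    mid = 0.5 * (flow_a + flow_b[j])          # equidistant midpoints
    margin = 0.5 * d[np.arange(len(flow_a)), j]
    return mid, margin

# two parallel flows in the plane: y = 0 and y = 2
t = np.linspace(0.0, 1.0, 50)
flow_a = np.column_stack([t, np.zeros_like(t)])
flow_b = np.column_stack([t, 2.0 * np.ones_like(t)])
mid, margin = midpoint_boundary(flow_a, flow_b)
# for parallel flows the midpoints lie on y = 1, at margin 1 from both
```

    On a genuine manifold, straight-line midpoints would be replaced by midpoints along geodesics under the intrinsic metric.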

    Fixed Boundary Flows

    We consider the fixed boundary flow, which carries a canonical interpretation as principal components extended to nonlinear Riemannian manifolds. We aim to find a flow with fixed starting and ending points for multivariate datasets lying on an embedded nonlinear Riemannian manifold; unlike the principal flow, which starts from the center of the data cloud, both endpoints are given in advance, and the intrinsic metric on the manifold is used. From the perspective of geometry, the fixed boundary flow is defined as an optimal curve that moves within the data cloud: at any point on the flow, it maximizes the inner product of the locally computed vector field and the tangent vector of the flow. The rigorous definition is given by means of an Euler-Lagrange problem, and its solution reduces to that of a differential-algebraic equation (DAE). A high-level algorithm is devised to compute the fixed boundary flow numerically. We show that the fixed boundary flow yields a concatenation of three segments, one of which coincides with the usual principal flow when the manifold reduces to Euclidean space. We illustrate how the fixed boundary flow can be used and interpreted, with an application to real data.
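    The defining property above (maximizing the inner product of a locally computed vector field with the flow's tangent) can be sketched discretely. The snippet below is a Euclidean simplification with hypothetical names, not the paper's DAE-based algorithm: it evaluates that functional for a polygonal curve with fixed endpoints, and shows that a path aligned with the field scores higher than a detour.

```python
import numpy as np

def flow_objective(curve, vector_field):
    """Discrete analogue of the fixed-boundary-flow functional: the sum,
    over curve segments, of the inner product between the local vector
    field and the unit tangent of the curve (Euclidean sketch)."""
    seg = np.diff(curve, axis=0)                  # tangent segments
    mids = 0.5 * (curve[:-1] + curve[1:])         # segment midpoints
    tangents = seg / np.linalg.norm(seg, axis=1, keepdims=True)
    return sum(float(vector_field(m) @ t) for m, t in zip(mids, tangents))

# constant horizontal field: between the same fixed endpoints, the
# straight path aligned with the field scores higher than a detour
field = lambda x: np.array([1.0, 0.0])
t = np.linspace(0.0, 1.0, 21)
straight = np.column_stack([t, np.zeros_like(t)])
detour = np.column_stack([t, np.sin(np.pi * t)])
straight_score = flow_objective(straight, field)
detour_score = flow_objective(detour, field)
```

    In the actual method, the vector field is estimated locally from the data cloud and the optimization is carried out under the manifold's intrinsic metric.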

    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed. Comment: 61 pages
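    As a minimal taste of the field, the snippet below computes the circular mean direction and mean resultant length, the basic summaries for data on the unit circle: averaging raw angles fails near the wrap-around point, while averaging the embedded unit vectors does not. (This is a generic illustration, not code from any package reviewed in the paper.)

```python
import numpy as np

def circular_mean(angles):
    """Mean direction and mean resultant length R of angles (radians):
    average the unit vectors (cos t, sin t) rather than the raw angles."""
    c, s = np.cos(angles).mean(), np.sin(angles).mean()
    return np.arctan2(s, c), np.hypot(c, s)

# angles clustered around 0 but wrapping across the cut at 2*pi:
# the arithmetic mean of the raw angles is misleading (about 2.06),
# while the circular mean is near 0 with resultant length near 1
angles = np.array([-0.1, 0.1, 2 * np.pi - 0.1])
mean_dir, R = circular_mean(angles)
```

    The mean resultant length R lies in [0, 1] and measures concentration: values near 1 indicate tightly clustered directions.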

    Doctor of Philosophy in Computing

    An important area of medical imaging research is studying anatomical diffeomorphic shape changes and detecting their relationship to disease processes. For example, neurodegenerative disorders change the shape of the brain, so identifying differences between healthy control subjects and patients affected by these diseases can help with understanding the disease processes. Previous research has proposed a variety of mathematical approaches for the statistical analysis of geometrical brain structure in three-dimensional (3D) medical imaging, including atlas building, brain variability quantification, and regression. The critical component of these statistical models is that the geometrical structure is represented by transformations rather than the actual image data. Although such statistical models effectively provide a way to analyze shape variation, none of them has a truly probabilistic interpretation. This dissertation contributes a novel Bayesian framework of statistical shape analysis for generic manifold data and its application to shape variability and brain magnetic resonance imaging (MRI). After carefully defining distributions on manifolds, we build Bayesian models for analyzing the intrinsic variability of manifold data, involving the mean point, principal modes, and parameter estimation. Because there is no closed-form solution for Bayesian inference of these models on manifolds, we develop a Markov chain Monte Carlo method to sample the hidden variables from the distribution. The main advantages of these Bayesian approaches are that they provide parameter estimation and automatic dimensionality reduction for analyzing generic manifold-valued data, such as diffeomorphisms. Modeling the mean point of a group of images in a Bayesian manner allows the regularity parameter to be learned from the data directly rather than set manually, which eliminates the effort of cross-validation for parameter selection.
    In population studies, our Bayesian model of principal modes analysis (1) automatically extracts low-dimensional, second-order statistics of manifold data variability and (2) gives a better geometric data fit than nonprobabilistic models. To make this Bayesian framework computationally more efficient for high-dimensional diffeomorphisms, this dissertation presents an algorithm, FLASH (finite-dimensional Lie algebras for shooting), that dramatically speeds up diffeomorphic image registration. Instead of formulating diffeomorphisms as a continuous variational problem, FLASH defines a completely new discrete reparameterization of diffeomorphisms in a low-dimensional bandlimited velocity space, which makes Bayesian inference via sampling on the space of diffeomorphisms feasible in practice. The entire Bayesian framework in this dissertation is used for the statistical analysis of shape data and brain MRIs. It has the potential to improve hypothesis testing, classification, and mixture models.
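    The "mean point" of manifold data mentioned above is typically the Fréchet (intrinsic) mean. As an illustration of that single ingredient only, not of the dissertation's Bayesian models, the sketch below (hypothetical names) computes it on the unit 2-sphere by iterating log and exp maps:

```python
import numpy as np

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing to q."""
    w = q - (p @ q) * p                    # project q onto tangent space at p
    nw = np.linalg.norm(w)
    if nw < 1e-12:
        return np.zeros_like(p)
    return np.arccos(np.clip(p @ q, -1.0, 1.0)) * w / nw

def sphere_exp(p, v):
    """Exp map on the unit sphere: follow tangent v from p along a geodesic."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return p
    return np.cos(nv) * p + np.sin(nv) * v / nv

def frechet_mean(points, iters=50):
    """Intrinsic mean: repeatedly average the log-mapped points in the
    tangent space at the current estimate and shoot back with exp."""
    mu = points[0]
    for _ in range(iters):
        v = np.mean([sphere_log(mu, q) for q in points], axis=0)
        mu = sphere_exp(mu, v)
    return mu

# three points placed symmetrically around the north pole: by symmetry,
# the intrinsic mean is the pole itself
pole = np.array([0.0, 0.0, 1.0])
pts = np.array([sphere_exp(pole, 0.3 * np.array([np.cos(a), np.sin(a), 0.0]))
                for a in (0.0, 2 * np.pi / 3, 4 * np.pi / 3)])
mu = frechet_mean(pts)
```

    The dissertation's contribution is to place distributions around such manifold means and infer their parameters by MCMC, rather than to compute the point estimate alone.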

    Nonparametric Dimension Reduction Methods on Riemannian Manifolds

    Doctoral dissertation, Department of Statistics, College of Natural Sciences, Seoul National University, August 2022. Advisor: Hee-Seok Oh. Over the decades, parametric dimension reduction methods have been actively developed for non-Euclidean data analysis. Examples include Fletcher et al., 2004; Huckemann et al., 2010; Jung et al., 2011; Jung et al., 2012; Zhang et al., 2013. These methods, however, are sometimes insufficient to capture the structure of the data. This dissertation presents newly developed nonparametric dimension reduction methods for data observed on manifolds, resulting in more flexible fits. More precisely, the main focus is on generalizations of principal curves to Riemannian manifolds. The principal curve can be viewed as a nonlinear generalization of principal component analysis (PCA). The dissertation consists of four main parts, as follows. First, the approach given in Chapter 3 lies along the same lines as Hastie (1984) and Hastie and Stuetzle (1989), which introduced the definition of the original principal curve in Euclidean space. The main contributions of this study can be summarized as follows: (a) We propose both extrinsic and intrinsic approaches to form principal curves on spheres. (b) We establish the stationarity of the proposed principal curves on spheres. (c) In extensive numerical studies, we show the usefulness of the proposed method through real seismological data and real human motion capture data, as well as simulated data on the 2-sphere and 4-sphere. Second, as a follow-up to the previous approach, a robust nonparametric dimension reduction method is proposed. To this end, the absolute loss and the Huber loss are used rather than the L2 loss. The contributions of Chapter 4 can be summarized as follows: (a) We study robust principal curves on spheres that are resistant to outliers. Specifically, we propose absolute-type and Huber-type principal curves, which go through the median of the data, to robustify the principal curves for datasets that may contain outliers.
    (b) From a theoretical perspective, the stationarity of the robust principal curves is investigated. (c) We provide practical algorithms for implementing the proposed robust principal curves, which are computationally feasible and convenient to implement. Third, an R package, 'spherepc', comprehensively providing dimension reduction methods on a sphere, is introduced in detail to support reproducible research. To the best of our knowledge, no other available R package offers dimension reduction and principal curve methods on a sphere. The existing R packages providing principal curves, such as 'princurve' and 'LPCM', are available only for Euclidean space. In addition, most nonparametric dimension reduction methods on manifolds involve somewhat complex intrinsic optimizations. The proposed R package 'spherepc' provides state-of-the-art principal curve techniques on the sphere and comprehensively collects and implements the existing techniques. Last, for an effective initial estimate of data with complex structure on manifolds, local principal geodesics are first proposed, and the method is applied to various simulated and real seismological datasets. Next, for variance stabilization and theoretical investigation of the procedure, the focus is on generalizing Kégl (1999) and Kégl et al. (2000), which provided a new definition of the principal curve in Euclidean space, to generic Riemannian manifolds. Theoretical results, including the consistency and convergence rate of the procedure by means of the empirical risk minimization principle, are further established on generic Riemannian manifolds. The results of the real data analysis and simulation study show the promising characteristics of the proposed approach. Abstract in Korean (translated): To capture the variability of manifold-valued data more effectively, this dissertation presents new nonparametric dimension reduction methods for such data.
๊ตฌ์ฒด์ ์œผ๋กœ, ์ฃผ๊ณก์„ (principal curves) ๋ฐฉ๋ฒ•์„ ์ผ๋ฐ˜์ ์ธ ๋‹ค์–‘์ฒด ๊ณต๊ฐ„์œผ๋กœ ํ™•์žฅํ•˜๋Š” ๊ฒƒ์ด ์ฃผ์š” ์—ฐ๊ตฌ ์ฃผ์ œ์ด๋‹ค. ์ฃผ๊ณก์„ ์€ ์ฃผ์„ฑ๋ถ„๋ถ„์„(PCA)์˜ ๋น„์„ ํ˜•์  ํ™•์žฅ ์ค‘ ํ•˜๋‚˜์ด๋ฉฐ, ๋ณธ ํ•™์œ„๋…ผ๋ฌธ์€ ํฌ๊ฒŒ ๋„ค ๊ฐ€์ง€์˜ ์ฃผ์ œ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ๋‹ค. ์ฒซ ๋ฒˆ์งธ๋กœ, Hastie (1984), Hastie and Stuetzle (1989}์˜ ๋ฐฉ๋ฒ•์„ ์ž„์˜์˜ ์ฐจ์›์˜ ๊ตฌ๋ฉด์œผ๋กœ ํ‘œ์ค€์ ์ธ ๋ฐฉ์‹์œผ๋กœ ํ™•์žฅํ•œ๋‹ค. ์ด ์—ฐ๊ตฌ ์ฃผ์ œ์˜ ๊ณตํ—Œ์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. (a) ์ž„์˜์˜ ์ฐจ์›์˜ ๊ตฌ๋ฉด์—์„œ ๋‚ด์žฌ์ , ์™ธ์žฌ์ ์ธ ๋ฐฉ์‹์˜ ์ฃผ๊ณก์„  ๋ฐฉ๋ฒ•์„ ๊ฐ๊ฐ ์ œ์•ˆํ•œ๋‹ค. (b) ๋ณธ ๋ฐฉ๋ฒ•์˜ ์ด๋ก ์  ์„ฑ์งˆ(์ •์ƒ์„ฑ)์„ ๊ทœ๋ช…ํ•œ๋‹ค. (c) ์ง€์งˆํ•™์  ์ž๋ฃŒ ๋ฐ ์ธ๊ฐ„ ์›€์ง์ž„ ์ž๋ฃŒ ๋“ฑ์˜ ์‹ค์ œ ์ž๋ฃŒ์™€ 2์ฐจ์›, 4์ฐจ์› ๊ตฌ๋ฉด์œ„์˜ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ์ž๋ฃŒ์— ๋ณธ ๋ฐฉ๋ฒ•์„ ์ ์šฉํ•˜์—ฌ, ๊ทธ ์œ ์šฉ์„ฑ์„ ๋ณด์ธ๋‹ค. ๋‘ ๋ฒˆ์งธ๋กœ, ์ฒซ ๋ฒˆ์งธ ์ฃผ์ œ์˜ ํ›„์† ์—ฐ๊ตฌ ์ค‘ ํ•˜๋‚˜๋กœ์„œ, ๋‘๊บผ์šด ๊ผฌ๋ฆฌ ๋ถ„ํฌ๋ฅผ ๊ฐ€์ง€๋Š” ์ž๋ฃŒ์— ๋Œ€ํ•˜์—ฌ ๊ฐ•๊ฑดํ•œ ๋น„๋ชจ์ˆ˜์  ์ฐจ์›์ถ•์†Œ ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด, L2 ์†์‹คํ•จ์ˆ˜ ๋Œ€์‹ ์— L1 ๋ฐ ํœด๋ฒ„(Huber) ์†์‹คํ•จ์ˆ˜๋ฅผ ํ™œ์šฉํ•œ๋‹ค. ์ด ์—ฐ๊ตฌ ์ฃผ์ œ์˜ ๊ณตํ—Œ์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. (a) ์ด์ƒ์น˜์— ๋ฏผ๊ฐํ•˜์ง€ ์•Š์€ ๊ฐ•๊ฑดํ™”์ฃผ๊ณก์„ (robust principal curves)์„ ์ •์˜ํ•œ๋‹ค. ๊ตฌ์ฒด์ ์œผ๋กœ, ์ž๋ฃŒ์˜ ๊ธฐํ•˜์  ์ค‘์‹ฌ์ ์„ ์ง€๋‚˜๋Š” L1 ๋ฐ ํœด๋ฒ„ ์†์‹คํ•จ์ˆ˜์— ๋Œ€์‘๋˜๋Š” ์ƒˆ๋กœ์šด ์ฃผ๊ณก์„ ์„ ์ œ์•ˆํ•œ๋‹ค. (b) ์ด๋ก ์ ์ธ ์ธก๋ฉด์—์„œ, ๊ฐ•๊ฑดํ™”์ฃผ๊ณก์„ ์˜ ์ •์ƒ์„ฑ์„ ๊ทœ๋ช…ํ•œ๋‹ค. (c) ๊ฐ•๊ฑดํ™”์ฃผ๊ณก์„ ์„ ๊ตฌํ˜„ํ•˜๊ธฐ ์œ„ํ•ด ๊ณ„์‚ฐ์ด ๋น ๋ฅธ ์‹ค์šฉ์ ์ธ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ œ์•ˆํ•œ๋‹ค. ์„ธ ๋ฒˆ์งธ๋กœ, ๊ธฐ์กด์˜ ์ฐจ์›์ถ•์†Œ๋ฐฉ๋ฒ• ๋ฐ ๋ณธ ๋ฐฉ๋ฒ•๋ก ์„ ์ œ๊ณตํ•˜๋Š” R ํŒจํ‚ค์ง€๋ฅผ ๊ตฌํ˜„ํ•˜์˜€์œผ๋ฉฐ ์ด๋ฅผ ๋‹ค์–‘ํ•œ ์˜ˆ์ œ ๋ฐ ์„ค๋ช…๊ณผ ํ•จ๊ป˜ ์†Œ๊ฐœํ•œ๋‹ค. ๋ณธ ๋ฐฉ๋ฒ•๋ก ์˜ ๊ฐ•์ ์€ ๋‹ค์–‘์ฒด ์œ„์—์„œ์˜ ๋ณต์žกํ•œ ์ตœ์ ํ™” ๋ฐฉ์ •์‹์„ ํ’€์ง€์•Š๊ณ , ์ง๊ด€์ ์ธ ๋ฐฉ์‹์œผ๋กœ ๊ตฌํ˜„ ๊ฐ€๋Šฅํ•˜๋‹ค๋Š” ์ ์ด๋‹ค. 
    Its availability as an implemented R package attests to this and makes the research in this dissertation reproducible. Last, to estimate the structure of manifold-valued data with more complex structure, we first propose the local principal geodesics method and demonstrate its utility by applying it to real seismological data and various simulated datasets. Next, for variance stabilization of the estimates and their theoretical justification, we extend the methods of Kégl (1999) and Kégl et al. (2000) to general Riemannian manifolds. Furthermore, using statistical learning theory, we establish asymptotic properties of the methodology, such as consistency and convergence rates, as well as a non-asymptotic concentration inequality.
    Table of contents: 1 Introduction; 2 Preliminaries (2.1 Principal curves; 2.2 Riemannian manifolds and centrality on manifolds; 2.3 Principal curves on Riemannian manifolds); 3 Spherical principal curves (3.1 Enhancement of principal circle for initialization; 3.2 Proposed principal curves; 3.3 Numerical experiments; 3.4 Proofs; 3.5 Concluding remarks); 4 Robust spherical principal curves (4.1 The proposed robust principal curves; 4.2 Stationarity of robust spherical principal curves; 4.3 Numerical experiments; 4.4 Summary and future work); 5 spherepc: An R package for dimension reduction on a sphere (5.1 Existing methods; 5.2 Spherical principal curves; 5.3 Local principal geodesics; 5.4 Application; 5.5 Conclusion); 6 Local principal curves on Riemannian manifolds (6.1 Preliminaries; 6.2 Local principal geodesics; 6.3 Local principal curves; 6.4 Real data analysis; 6.5 Further work); 7 Conclusion; A Appendix (A.1 Appendix for Chapter 3; A.2 Appendix for Chapter 4; A.3 Appendix for Chapter 6); Abstract in Korean; Acknowledgement in Korean
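    The principal-circle initialization used for spherical principal curves can be illustrated extrinsically: the best-fitting great circle of points on the 2-sphere is the intersection of the sphere with the plane through the origin whose normal is the least-variance direction of the data. Below is a minimal sketch with hypothetical names, not the spherepc implementation:

```python
import numpy as np

def principal_great_circle(points):
    """Extrinsic fit of a great circle to points on the unit 2-sphere.
    The fitted circle is the sphere's intersection with the plane through
    the origin whose unit normal is the right singular vector of the data
    matrix associated with the smallest singular value (the least-variance
    direction)."""
    _, _, vt = np.linalg.svd(points, full_matrices=False)
    return vt[-1]  # unit normal of the fitted plane

# noiseless data on the equator: the fitted normal is the z-axis (up to sign)
theta = np.linspace(0.0, 2 * np.pi, 40, endpoint=False)
pts = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])
normal = principal_great_circle(pts)
```

    Such a circle serves only as a starting configuration; the intrinsic principal curve is then refined iteratively from it.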

    Nonparametric Uncertainty Quantification for Stochastic Gradient Flows

    This paper presents a nonparametric statistical modeling method for quantifying uncertainty in stochastic gradient systems with isotropic diffusion. The central idea is to apply the diffusion maps algorithm to a training data set to produce a stochastic matrix whose generator is a discrete approximation to the backward Kolmogorov operator of the underlying dynamics. The eigenvectors of this stochastic matrix, which we will refer to as the diffusion coordinates, are discrete approximations to the eigenfunctions of the Kolmogorov operator and form an orthonormal basis for functions defined on the data set. Using this basis, we consider the projection of three uncertainty quantification (UQ) problems (prediction, filtering, and response) into the diffusion coordinates. In these coordinates, the nonlinear prediction and response problems reduce to solving systems of infinite-dimensional linear ordinary differential equations. Similarly, the continuous-time nonlinear filtering problem reduces to solving a system of infinite-dimensional linear stochastic differential equations. Solving the UQ problems then reduces to solving the corresponding truncated linear systems in finitely many diffusion coordinates. By solving these systems we give a model-free algorithm for UQ on gradient flow systems with isotropic diffusion. We numerically verify these algorithms on a 1-dimensional linear gradient flow system where the analytic solutions of the UQ problems are known. We also apply the algorithm to a chaotically forced nonlinear gradient flow system which is known to be well approximated as a stochastically forced gradient flow.Comment: Find the associated videos at: http://personal.psu.edu/thb11
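    The central pipeline of this abstract (a kernel matrix from training data, row normalization to a stochastic matrix, and eigenvectors as diffusion coordinates) can be sketched in a few lines of numpy. This is a bare-bones diffusion maps variant with hypothetical names, omitting the density normalizations typically used in practice:

```python
import numpy as np

def diffusion_coordinates(X, eps, k=3):
    """Basic diffusion maps sketch: build a Gaussian kernel matrix,
    row-normalize it into a stochastic matrix, and return its leading
    eigenvalues/eigenvectors, which serve as discrete approximations to
    eigenfunctions of the generator of the underlying dynamics."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    K = np.exp(-d2 / eps)                      # Gaussian affinity kernel
    P = K / K.sum(axis=1, keepdims=True)       # row-stochastic matrix
    evals, evecs = np.linalg.eig(P)
    order = np.argsort(-evals.real)            # sort by decreasing eigenvalue
    return evals.real[order[:k]], evecs.real[:, order[:k]]

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 1))               # toy 1-d training data
evals, coords = diffusion_coordinates(X, eps=0.5)
# the top eigenvalue of a row-stochastic matrix is 1, paired with a
# constant eigenvector; the nontrivial eigenvectors give the coordinates
```

    Truncating the UQ problems to a few such coordinates is what turns the infinite-dimensional linear systems in the abstract into finite, solvable ones.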
    • โ€ฆ
    corecore