
    Backwards Principal Component Analysis and Principal Nested Relations

    In non-Euclidean data spaces represented by manifolds (or more generally stratified spaces), analogs of principal component analysis can be more easily developed using a backwards approach. There has been a gradual evolution in the application of this idea from using increasing geodesic subspaces of submanifolds in analogy with PCA to using a “backward sequence” of a decreasing family of subspaces. We provide a version of the backwards approach by using a “nested sequence of relations” which define the decreasing sequences of subspaces, which need not be geodesic. Because these are naturally inductively added in a backward sequence, they are frequently more tractable and overcome difficulties with using geodesics.

    Non-Euclidean classification of medically imaged objects via s-reps

    Classifying medically imaged objects, e.g., into diseased and normal classes, has been one of the important goals in medical imaging. We propose a novel classification scheme that uses a skeletal representation to provide rich non-Euclidean geometric object properties. Our statistical method combines distance weighted discrimination (DWD) with a carefully chosen Euclideanization which takes full advantage of the geometry of the manifold on which these non-Euclidean geometric object properties (GOPs) live. Our method is evaluated via the task of classifying 3D hippocampi between schizophrenics and healthy controls. We address three central questions. 1) Does adding shape features increase discriminative power over the more standard classification based only on global volume? 2) If so, does our skeletal representation provide greater discriminative power than a conventional boundary point distribution model (PDM)? 3) Especially, is Euclideanization of non-Euclidean shape properties important in achieving high discriminative power? Measuring the capability of a method in terms of area under the receiver operating characteristic (ROC) curve, we show that our proposed method achieves substantially better classification than both the classification method based on global volume alone and the s-rep-based classification method without proper Euclideanization of non-Euclidean GOPs. We show classification using Euclideanized s-reps is also superior to classification using PDMs, whether the PDMs are first Euclideanized or not. We also show improved performance with Euclideanized boundary PDMs over non-linear boundary PDMs. This demonstrates the benefit that proper Euclideanization of non-Euclidean GOPs brings not only to s-rep-based classification but also to PDM-based classification.
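The abstract above compares classifiers by area under the ROC curve. As a minimal sketch of that measure (not the authors' evaluation code), the AUC can be computed as the fraction of (positive, negative) pairs that a classifier score ranks correctly, with ties counted as half. The function name `auc_from_scores` and the score values are hypothetical, chosen only for illustration.

```python
def auc_from_scores(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise ranking.

    Counts the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (not data from the paper).
diseased_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]
auc_value = auc_from_scores(diseased_scores, control_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.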

    Skeletal Shape Correspondence Through Entropy

    We present a novel approach for improving the shape statistics of medical image objects by generating correspondence of skeletal points. Each object's interior is modeled by an s-rep, i.e., by a sampled, folded, two-sided skeletal sheet with spoke vectors proceeding from the skeletal sheet to the boundary. The skeleton is divided into three parts: the up side, the down side, and the fold curve. The spokes on each part are treated separately and, using spoke interpolation, are shifted along that skeleton in each training sample so as to tighten the probability distribution on those spokes' geometric properties while sampling the object interior regularly. As with the surface/boundary-based correspondence method of Cates et al., entropy is used to measure both the probability distribution tightness and the sampling regularity, here of the spokes' geometric properties. Evaluation on synthetic and real-world lateral ventricle and hippocampus data sets demonstrates improvement in the performance of statistics using the resulting probability distributions. This improvement is greater than that achieved by an entropy-based correspondence method on the boundary points.

    Barycentric Subspace Analysis on Manifolds

    This paper investigates the generalization of Principal Component Analysis (PCA) to Riemannian manifolds. We first propose a new and general type of family of subspaces in manifolds that we call barycentric subspaces. They are implicitly defined as the locus of points which are weighted means of k+1 reference points. As this definition relies on points and not on tangent vectors, it can also be extended to geodesic spaces which are not Riemannian. For instance, in stratified spaces, it naturally allows principal subspaces that span several strata, which is impossible in previous generalizations of PCA. We show that barycentric subspaces locally define a submanifold of dimension k which generalizes geodesic subspaces. Second, we rephrase PCA in Euclidean spaces as an optimization on flags of linear subspaces (a hierarchy of properly embedded linear subspaces of increasing dimension). We show that the Euclidean PCA minimizes the Accumulated Unexplained Variances by all the subspaces of the flag (AUV). Barycentric subspaces are naturally nested, allowing the construction of hierarchically nested subspaces. Optimizing the AUV criterion to optimally approximate data points with flags of affine spans in Riemannian manifolds leads to a particularly appealing generalization of PCA on manifolds called Barycentric Subspaces Analysis (BSA). Comment: Annals of Statistics, Institute of Mathematical Statistics, A Para\^itr
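To make the "weighted means of k+1 reference points" definition in the abstract above concrete, here is a minimal sketch restricted to the flat Euclidean case, where the weighted mean has a closed form and the locus of such points is the affine span of the reference points. This is an illustration under that assumption, not the paper's manifold algorithm. The function names, the reference points, and the weights are hypothetical, and the sketch assumes only that the weights have a nonzero sum.

```python
def weighted_barycenter(points, weights):
    """Euclidean weighted mean: x = sum_i(w_i * x_i) / sum_i(w_i)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must have a nonzero sum")
    dim = len(points[0])
    return [
        sum(w * p[d] for w, p in zip(weights, points)) / total
        for d in range(dim)
    ]


def weighted_mean_residual(x, points, weights):
    """Half the gradient of sum_i w_i * ||x - x_i||^2, evaluated at x.

    It vanishes exactly when x is the weighted mean of the points.
    """
    return [
        sum(w * (x[d] - p[d]) for w, p in zip(weights, points))
        for d in range(len(x))
    ]


# k + 1 = 3 hypothetical reference points in R^3, all in the plane z = 0.
reference_points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
weights = [0.5, 0.8, -0.3]  # a negative weight is allowed in this sketch

x = weighted_barycenter(reference_points, weights)
residual = weighted_mean_residual(x, reference_points, weights)
```

Varying the weights sweeps out the whole plane through the three reference points, which is the two-dimensional subspace they define in this flat setting. On a curved manifold the weighted mean has no closed form in general and would have to be found by solving the analogous stationarity condition.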

    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed. Comment: 61 pages

    CLASSIFICATION OF NEUROANATOMICAL STRUCTURES BASED ON NON-EUCLIDEAN GEOMETRIC OBJECT PROPERTIES

    Studying the observed morphological differences in neuroanatomical structures between individuals with neurodevelopmental disorders and a control group of typically developing individuals has been an important objective. Researchers study the differences with two goals: to assist an accurate diagnosis of the disease and to gain insights into underlying mechanisms of the disease that cause such changes. Shape classification is commonly utilized in such studies. An effective classification is difficult because it requires 1) a choice of an object model that can provide rich geometric object properties (GOPs) relevant for a given classification task, and 2) a choice of a statistical classification method that accounts for the non-Euclidean nature of GOPs. I lay out my methodological contributions to address the aforementioned challenges in the context of early diagnosis and detection of Autism Spectrum Disorder (ASD) in infants based on shapes of hippocampi and caudate nuclei; morphological deviations in these structures between individuals with ASD and typically developing individuals have been reported in the literature. These contributions respectively lead to 1) an effective modeling of shapes of objects of interest and 2) an effective classification. As the first contribution for modeling shapes of objects, I propose a method to obtain a set of skeletal models called s-reps from a set of 3D objects. First, the method iteratively deforms the object surface via Mean Curvature Flow (MCF) until the deformed surface is approximately ellipsoidal. Then, an s-rep of the approximate ellipsoid is obtained analytically. Finally, the ellipsoid s-rep is deformed via a series of inverse MCF transformations. The method has two important properties: 1) it is fully automatic, and 2) it yields a set of s-reps with good correspondence across the set. The method is shown effective in generating a set of s-reps for a few neuroanatomical structures. 
As the second contribution with respect to modeling shapes of objects, I introduce an extension to the current s-rep for representing an object with a narrowing sharp tail. This includes a spoke interpolation method for interpolating a discrete s-rep of an object with a narrowing sharp tail into a continuous object. This extension is necessary for representing surface geometry of objects whose boundary has a singular point. I demonstrate that this extension allows appropriate surface modeling of a narrowing sharp tail region of the caudate nucleus. In addition, I show that the extension is beneficial in classifying autistic and non-autistic infants at high risk of ASD based on shapes of caudate nuclei. As the first contribution with respect to statistical methods, I propose a novel shape classification framework that uses the s-rep to capture rich localized geometric descriptions of an object, a statistical method called Principal Nested Spheres (PNS) analysis to handle the non-Euclidean s-rep GOPs, and a classification method called Distance Weighted Discrimination (DWD). I evaluate the effectiveness of the proposed method in classifying autistic and non-autistic infants based on either hippocampal shapes or caudate shapes in terms of the Area Under the ROC curve (AUC). In addition, I show that the proposed method is superior to commonly used shape classification methods in the literature. As my final methodological contribution, I extend the proposed shape classification method to perform the classification task based on temporal shape differences. DWD learns a class separation direction based on the temporal shape differences that are obtained by taking differences of the temporal pair of Euclideanized s-reps. 
In the context of early diagnosis and detection of ASD in young infants, the proposed temporal shape difference classification produces some interesting results; the temporal differences in shapes of hippocampi and caudate nuclei do not seem to be as predictive as the cross-sectional shape of these structures alone. Doctor of Philosophy

    Shape Deformation Statistics and Regional Texture-Based Appearance Models for Segmentation

    Transferring identified regions of interest (ROIs) from planning-time MRI images to the trans-rectal ultrasound (TRUS) images used to guide prostate biopsy is difficult because of the large difference in appearance between the two modalities as well as the deformation of the prostate's shape caused by the TRUS transducer. This dissertation describes methods for addressing these difficulties by both estimating a patient's prostate shape after the transducer is applied and then locating it in the TRUS image using skeletal models (s-reps) of prostate shapes. First, I introduce a geometrically-based method for interpolating discretely sampled s-reps into continuous objects. This interpolation is important for many tasks involving s-reps, including fitting them to new objects as well as the later applications described in this dissertation. This method is shown to be accurate for ellipsoids where an analytical solution is known. Next, I create a method for estimating a probability distribution on the difference between two shapes. Because s-reps live in a high-dimensional curved space, I use Principal Nested Spheres (PNS) to transform these representations to instead live in a flat space where standard techniques can be applied. This method is shown effective both on synthetic data as well as for modeling the deformation caused by the TRUS transducer to the prostate. In cases where appearance is described via a large number of parameters, such as intensity combined with multiple texture features, it is computationally beneficial to be able to turn these large tuples of descriptors into a scalar value. Using the inherent localization properties of s-reps, I develop a method for using regionally-trained classifiers to turn appearance tuples into the probability that the appearance tuple in question came from inside the prostate boundary. 
This method is shown to be able to accurately discern inside appearances from outside appearances over a large majority of the prostate boundary. Finally, I combine these techniques into a deformable model-based segmentation framework to segment the prostate in TRUS. By applying the learned mean deformation to a patient's prostate and then deforming it so that voxels with high probability of coming from the prostate's interior are also in the model's interior, I am able to generate prostate segmentations which are comparable to state-of-the-art methods. Doctor of Philosophy

    Méthodes numériques et statistiques pour l'analyse de trajectoire dans un cadre de geométrie Riemannienne.

    This PhD thesis proposes new Riemannian geometry tools for the analysis of longitudinal observations of subjects with neurodegenerative diseases. First, we propose a numerical scheme to compute parallel transport along geodesics and prove its convergence. This scheme is efficient as long as the co-metric (the inverse of the metric) can be computed efficiently. Then, we tackle the issue of Riemannian manifold learning. We provide some minimal theoretical sanity checks to illustrate that the procedure of Riemannian metric estimation can be relevant. Finally, we propose to learn a Riemannian manifold and metric so as to model subjects' progressions as geodesics on this manifold. This allows fast inference, extrapolation and classification of the subjects.

    Curve Registration and Human Connectome Data

    This thesis consists of three main parts: the usefulness of principal nested spheres for time warped functional data analysis, asymptotic study of the Fisher-Rao approach to time warped curve registration, the Joint and Individual Variation Explained method for Human Connectome Data. There are often two important types of variation in functional data: the horizontal (or phase) variation and the vertical (or amplitude) variation. These two types of variation have been appropriately separated and modeled through a domain warping method (or curve registration) based on the Fisher-Rao metric. The first part focuses on the analysis of the horizontal variation, captured by the domain warping functions. The square-root velocity function representation transforms the manifold of the warping functions to a Hilbert sphere. Motivated by recent results on manifold analogs of principal component analysis, we analyze the horizontal variation via a Principal Nested Spheres approach. Compared with earlier approaches, such as approximating tangent plane principal component analysis, this is seen to be an efficient and interpretable approach to decompose the horizontal variation in both simulated and real data examples. The mathematical underpinnings of the Fisher-Rao curve registration are studied by a consistency result for a signal that is observed under random warps, scaling and vertical translation. The signal estimator in the Fisher-Rao curve registration is known to be consistent. The second part of this dissertation studies more asymptotic properties. The ultimate goal is to compare available methods using rates of convergence. A challenging part is that closed form solutions on the surface of the sphere are generally not available. We study a simple case where the warps are piecewise linear warping functions. Points on the unit circle can represent each warp and we find the explicit solution and study the asymptotic properties of the signal estimation. 
A class of metrics that share some good properties of the Fisher-Rao metric is also studied. A major goal in neuroscience is to understand the neural pathways underlying human behavior. We introduce the recently developed Joint and Individual Variation Explained (JIVE) method to the neuroscience community to simultaneously analyze imaging and behavioral data from the Human Connectome Project. Motivated by recent computational and theoretical improvements in the JIVE approach, we simultaneously explore the joint and individual variation between and within imaging and behavioral data. In particular, we demonstrate that JIVE is an effective and efficient approach for integrating task fMRI and behavioral variables using three examples: one example where task variation is strong, one where task variation is weak and a reference case where the behavior is not directly related to the image. These examples are provided to visualize the different levels of signal found in the joint variation including working memory regions in the image data and accuracy and response time from the in-task behavioral variables. Joint analysis provides insights not available from conventional single block decomposition methods such as Singular Value Decomposition. Additionally, the joint variation estimated by JIVE appears to more clearly identify the working memory regions than Partial Least Squares (PLS), while Canonical Correlation Analysis (CCA) gives grossly overfit results. The individual variation in JIVE captures behavior-unrelated signals such as a background activation that is spatially homogeneous and activation in the default mode network. The information revealed by this individual variation is not examined in traditional methods such as CCA and PLS. We suggest that JIVE can be used as an alternative to PLS and CCA to improve estimation of the signal common to two or more datasets and reveal novel insights into the signal unique to each dataset. Doctor of Philosophy
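The first part of the abstract above relies on the square-root velocity representation sending warping functions to a Hilbert sphere. A minimal sketch of why this holds for the piecewise linear warps the abstract mentions: if a warp rises from 0 to 1 on [0, 1], the square root of its slope has unit L2 norm, because the integral of the slope is exactly 1. The function names, the grid, and the knot values are hypothetical, and this sketch is not the thesis's implementation.

```python
import math


def sqrt_slopes(gamma, t):
    """Square root of the slope of a piecewise linear warp.

    gamma[i] is the warp value at knot t[i]; the warp must be
    non-decreasing so that every slope is non-negative.
    """
    psi = []
    for i in range(len(t) - 1):
        slope = (gamma[i + 1] - gamma[i]) / (t[i + 1] - t[i])
        psi.append(math.sqrt(slope))
    return psi


def squared_l2_norm(psi, t):
    """Integral of psi^2 over [t[0], t[-1]] for piecewise constant psi."""
    return sum(p * p * (t[i + 1] - t[i]) for i, p in enumerate(psi))


def distance_to_identity(psi, t):
    """Arc length on the unit sphere between psi and the identity warp.

    The identity warp has slope 1 everywhere, so its representation is
    the constant function 1 and the inner product is the integral of psi.
    """
    inner = sum(p * (t[i + 1] - t[i]) for i, p in enumerate(psi))
    return math.acos(max(-1.0, min(1.0, inner)))


# Hypothetical piecewise linear warp with gamma(0) = 0 and gamma(1) = 1.
knots = [0.0, 0.25, 0.5, 0.75, 1.0]
warp = [0.0, 0.1, 0.4, 0.8, 1.0]

psi = sqrt_slopes(warp, knots)
norm_sq = squared_l2_norm(psi, knots)
dist = distance_to_identity(psi, knots)
```

Because every warp lands on the same unit sphere, sphere-based tools such as Principal Nested Spheres can then be applied to the transformed warps.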

    Principal Component Analysis in Phylogenetic Tree Space

    Complex data objects arise in many fields of modern science including drug discovery, psychology, dynamics of gene expression and anatomy. Object oriented data analysis describes the statistical analysis of a population of complex data objects. The specific case of tree-structured data objects is a large and promising research area with many interesting questions and challenging problems. This dissertation focuses on principal component analysis in the tree space introduced by Billera, Holmes, and Vogtmann. Principal component analysis has been a widely used method in aiding visualization and reducing dimensions, and it is natural to extend this type of analysis into tree space. In this dissertation, we will discuss three interesting approaches to this extension. The first approach is multidimensional scaling, which focuses on better visualization of data in tree space, in particular, the out-of-sample embedding problem which inserts additional points into previously constructed multidimensional scaling configurations. It is shown that a better visualization can be achieved by choosing a higher dimensional embedding space and displaying only the first two dimensions. The other two approaches rely on our novel definitions of tree space line, and it is proven that there are only two types of such lines. The second approach is the sample-limited geodesic, which is an analog of the first type of line. This idea defines the first principal component for a set of trees by maximizing the data projection variance over geodesic segments connecting pairs of trees. Our study shows that the sample-limited geodesic is not an effective principal component object in terms of capturing data variation, due to the intrinsic geometry of the data used in this dissertation, and that it does not generalize naturally to higher-order principal component objects. The third approach is based on the principal ray set, which is a representative of the second type of line. 
We develop some heuristic searching algorithms for first order principal ray sets and higher order principal axis sets, which are special cases of principal ray sets. Principal ray sets are better summaries for less variable data, but gain very limited information for data with larger spread. Doctor of Philosophy
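The sample-limited geodesic described in the abstract above picks, among segments joining pairs of data points, the one with the largest variance of projected data. The sketch below illustrates that search in ordinary Euclidean space, where geodesic segments are straight line segments. This is a loudly simplified analog: it does not implement the Billera-Holmes-Vogtmann tree space or its geodesics, and the function names and data points are hypothetical.

```python
import math
from itertools import combinations


def project_onto_segment(x, a, b):
    """Parameter s in [0, 1] of the point a + s * (b - a) closest to x."""
    direction = [bi - ai for ai, bi in zip(a, b)]
    denom = sum(d * d for d in direction)  # assumes a != b
    s = sum((xi - ai) * d for xi, ai, d in zip(x, a, direction)) / denom
    return max(0.0, min(1.0, s))  # clamp so the projection stays on the segment


def projection_variance(data, a, b):
    """Variance of the positions of the projected data along the segment."""
    length = math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))
    positions = [project_onto_segment(x, a, b) * length for x in data]
    mean = sum(positions) / len(positions)
    return sum((p - mean) ** 2 for p in positions) / len(positions)


def sample_limited_segment(data):
    """Indices (i, j) of the data pair whose segment maximizes the variance."""
    return max(
        combinations(range(len(data)), 2),
        key=lambda ij: projection_variance(data, data[ij[0]], data[ij[1]]),
    )


# Hypothetical, distinct 2D points standing in for data objects.
data = [(0.0, 0.0), (4.0, 0.2), (2.0, 1.0), (1.0, -0.5), (3.0, 0.1)]
best_pair = sample_limited_segment(data)
best_variance = projection_variance(data, data[best_pair[0]], data[best_pair[1]])
```

The search is exhaustive over all pairs, so its cost grows quadratically in the number of data objects times the cost of projecting every object onto one segment.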