Manifold-valued data analysis of networks and shapes

Abstract

This thesis is concerned with manifold-valued data analysis. Manifold-valued data are multivariate data that lie on a manifold rather than in a Euclidean space. We seek to develop analogues of classical multivariate analysis methods, which are appropriate for Euclidean data, for data that lie on particular manifolds. We focus in particular on the manifold of graph Laplacians. Graph Laplacians can represent networks, and for the majority of this thesis we focus on the statistical analysis of samples of networks by identifying networks with their graph Laplacian matrices. Using this representation, we develop a general framework for extrinsic statistical analysis of samples of networks. For graph Laplacians we define metrics, embeddings, tangent spaces, and a projection from Euclidean space to the space of graph Laplacians. This framework provides a way of computing means, performing principal component analysis and regression, carrying out hypothesis tests, such as testing for equality of means between two samples of networks, and classifying networks. We demonstrate these methods on many different network datasets, including networks derived from text and neuroimaging data. We also briefly consider another well-studied type of manifold-valued data, namely shape data, comparing three commonly used sets of tangent coordinates in shape analysis and explaining the differences between them and why they may not all be suitable in every case.
