Persistence Bag-of-Words for Topological Data Analysis
Persistent homology (PH) is a rigorous mathematical theory that provides a
robust descriptor of data in the form of persistence diagrams (PDs). PDs
exhibit, however, complex structure and are difficult to integrate in today's
machine learning workflows. This paper introduces persistence bag-of-words: a
novel and stable vectorized representation of PDs that enables the seamless
integration with machine learning. Comprehensive experiments show that the new
representation achieves state-of-the-art performance and beyond in much less
time than alternative approaches.
Comment: Accepted for the Twenty-Eighth International Joint Conference on
Artificial Intelligence (IJCAI-19). arXiv admin note: substantial text
overlap with arXiv:1802.0485
Persistence-based Pooling for Shape Pose Recognition
In this paper, we propose a novel pooling approach for shape classification and recognition using the bag-of-words pipeline, based on topological persistence, a recent tool from Topological Data Analysis. Our technique extends standard max-pooling, which summarizes the distribution of a visual feature with a single number, thereby losing any notion of spatiality. Instead, we propose to use topological persistence, and the derived persistence diagrams, to provide significantly more informative and spatially sensitive characterizations of the feature functions, which can lead to better recognition performance. Unfortunately, despite their conceptual appeal, persistence diagrams are difficult to handle, since they are not naturally represented as vectors in Euclidean space, and even the standard metric, the bottleneck distance, is not easy to compute. Furthermore, classical distances between diagrams, such as the bottleneck and Wasserstein distances, do not allow the construction of positive definite kernels that can be used for learning. To handle this issue, we provide a novel way to transform persistence diagrams into vectors, in which comparisons are trivial. Finally, we demonstrate the performance of our construction on the Non-Rigid 3D Human Models SHREC 2014 dataset, where we show that topological pooling can provide significant improvements over standard pooling methods for shape pose recognition within the bag-of-words pipeline.
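The contrast between max-pooling and persistence drawn in this abstract can be illustrated with a minimal sketch. It assumes a 1D sequence of feature values as a stand-in for a feature function on a shape, and computes 0-dimensional superlevel-set persistence with a union-find. The function name `superlevel_persistence` and the sample values are hypothetical, and this is not the authors' pipeline.

```python
def superlevel_persistence(values):
    """Return (birth, death) pairs of the local maxima of a 1D sequence.

    Assumption: neighbours are adjacent indices. Peaks are born at their
    height and die where they merge into a higher peak.
    """
    n = len(values)
    parent = [None] * n          # None means "not yet added to the filtration"
    birth = {}                   # birth value, looked up at component roots

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    pairs = []
    # Sweep from the highest value down to the lowest.
    for i in sorted(range(n), key=lambda k: -values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                elder, younger = find(i), find(j)
                if elder == younger:
                    continue
                if birth[elder] < birth[younger]:
                    elder, younger = younger, elder
                # The younger component dies here; skip zero-persistence pairs.
                if birth[younger] > values[i]:
                    pairs.append((birth[younger], values[i]))
                parent[younger] = elder
    # The global maximum never merges; close it at the global minimum.
    pairs.append((max(values), min(values)))
    return sorted(pairs, key=lambda p: p[0] - p[1], reverse=True)


signal = [0, 3, 1, 5, 2, 4, 0]
diagram = superlevel_persistence(signal)
max_pooled = max(signal)   # the single number that max-pooling keeps
```

In this sketch the most persistent pair has the max-pooled value as its birth, while the remaining pairs record the secondary peaks that max-pooling discards.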
Persistent topology of the reionisation bubble network. I: Formalism & Phenomenology
We present a new formalism for studying the topology of HII regions during
the Epoch of Reionisation, based on persistent homology theory. With persistent
homology, it is possible to follow the evolution of topological features over
time. We introduce the notion of a persistence field as a statistical summary
of persistence data and we show how these fields can be used to identify
different stages of reionisation. We identify two new stages common to all
bubble ionisation scenarios. Following an initial pre-overlap and subsequent
overlap stage, the topology is first dominated by neutral filaments (filament
stage) and then by enclosed patches of neutral hydrogen undergoing outside-in
ionisation (patch stage). We study how these stages are affected by the degree
of galaxy clustering. We also show how persistence fields can be used to study
other properties of the ionisation topology, such as the bubble size
distribution and the fractal-like topology of the largest ionised region.
Comment: 18 pages, 12 figures, 1 table. Submitted to MNRA
Persistence codebooks for topological data analysis
Persistent homology is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs), which are 2D multisets of points. Their variable size, however, makes them difficult to combine with typical machine learning workflows. In this paper we introduce persistence codebooks, a novel, expressive and discriminative fixed-size vectorized representation of PDs that adapts to the inherent sparsity of persistence diagrams. To this end, we adapt bag-of-words, vectors of locally aggregated descriptors and Fisher vectors for the quantization of PDs. Persistence codebooks represent PDs in a convenient way for machine learning and statistical analysis and have a number of favorable practical and theoretical properties, including 1-Wasserstein stability. We evaluate the presented representations on several heterogeneous datasets and show their high discriminative power. Our approach yields comparable, and in part even higher, performance in much less time than alternative approaches.
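As a rough illustration of the bag-of-words idea named in this abstract, the following sketch quantizes persistence diagrams of different sizes into fixed-length histograms. It assumes non-empty diagrams given as (birth, death) pairs, a plain Lloyd-style k-means codebook, and hard assignment. The helper names and toy diagrams are hypothetical, and the paper's own variants (including any weighting or stability-preserving steps) are not reproduced here.

```python
import numpy as np


def to_birth_persistence(diagram):
    """Map (birth, death) pairs to (birth, persistence) coordinates."""
    d = np.asarray(diagram, dtype=float)
    return np.column_stack([d[:, 0], d[:, 1] - d[:, 0]])


def assign(points, centers):
    """Index of the nearest codeword for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def fit_codebook(points, n_words, n_iter=50, seed=0):
    """Plain k-means (Lloyd iterations) over points pooled from all diagrams."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=n_words, replace=False)
    centers = points[start].copy()
    for _ in range(n_iter):
        labels = assign(points, centers)
        for k in range(n_words):
            members = points[labels == k]
            if len(members):              # leave empty clusters where they are
                centers[k] = members.mean(axis=0)
    return centers


def bag_of_words(diagram, centers):
    """Normalized histogram of codeword counts: one fixed-size vector per diagram."""
    labels = assign(to_birth_persistence(diagram), centers)
    counts = np.bincount(labels, minlength=len(centers)).astype(float)
    return counts / counts.sum()


# Toy diagrams with different numbers of points (made-up values).
diagrams = [
    [(0.0, 1.0), (0.2, 0.5), (0.1, 2.0)],
    [(0.0, 0.3), (0.5, 1.5)],
    [(0.3, 0.4), (0.0, 1.8), (0.6, 0.9), (0.2, 0.25)],
]
pooled = np.vstack([to_birth_persistence(d) for d in diagrams])
codebook = fit_codebook(pooled, n_words=3)
vectors = np.array([bag_of_words(d, codebook) for d in diagrams])
```

Each row of `vectors` has the same length regardless of how many points the diagram contained, which is the property that lets such representations feed standard classifiers.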