GPU-Accelerated Computation of Vietoris-Rips Persistence Barcodes
The computation of Vietoris-Rips persistence barcodes is both compute- and
memory-intensive. In this paper, we study the
computational structure of Vietoris-Rips persistence barcodes, and identify
several unique mathematical properties and algorithmic opportunities with
connections to the GPU. Mathematically and empirically, we examine the
properties of apparent pairs: independently identifiable persistence pairs
that comprise up to 99% of all persistence pairs. We give theoretical upper and
lower bounds of the apparent pair rate and model the average case. We also
design massively parallel algorithms to take advantage of the very large number
of simplices that can be processed independently of each other. Having
identified these opportunities, we develop GPU-accelerated software for
computing Vietoris-Rips persistence barcodes, called Ripser++. The software
achieves up to 30x speedup over the total execution time of the original Ripser
and also reduces CPU-memory usage by up to 2.0x. We believe our
GPU-acceleration based efforts open a new chapter for the advancement of
topological data analysis in the post-Moore's Law era.
Comment: 36 pages, 15 figures. To be published in Symposium on Computational Geometry 202
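The paper's GPU algorithms target high-dimensional persistence pairs, but the flavor of barcode computation is easiest to see in dimension 0, where the Vietoris-Rips barcode reduces to Kruskal-style edge processing with a union-find. The following is a minimal illustrative sketch in Python (not Ripser++'s method; the distance matrix is a made-up toy):

```python
from itertools import combinations

def h0_barcode(dist):
    """0-dimensional Vietoris-Rips barcode from a distance matrix.

    Every point is born at filtration value 0; a connected component dies
    when the edge merging it into an older component appears (elder rule).
    Returns (birth, death) pairs; the last surviving component never dies.
    """
    n = len(dist)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2))
    bars = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            bars.append((0.0, w))       # a component dies at edge weight w
    bars.append((0.0, float("inf")))    # the surviving component
    return bars

# Toy example: two clusters of two points each.
D = [[0, 1, 5, 5],
     [1, 0, 5, 5],
     [5, 5, 0, 1],
     [5, 5, 1, 0]]
print(h0_barcode(D))  # [(0.0, 1), (0.0, 1), (0.0, 5), (0.0, inf)]
```

The two short bars record the within-cluster merges at distance 1, and the bar dying at 5 records the merge of the two clusters; higher-dimensional barcodes require the matrix-reduction machinery that Ripser and Ripser++ implement.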
Manifold Topology Divergence: a Framework for Comparing Data Manifolds
We develop a framework for comparing data manifolds, aimed, in particular,
towards the evaluation of deep generative models. We describe a novel tool,
Cross-Barcode(P,Q), that, given a pair of distributions in a high-dimensional
space, tracks multiscale topological discrepancies between the manifolds on
which the distributions are concentrated. Based on the Cross-Barcode, we
introduce the Manifold Topology Divergence score (MTop-Divergence) and apply it
to assess the performance of deep generative models in various domains: images,
3D-shapes, time-series, and on different datasets: MNIST, Fashion MNIST, SVHN,
CIFAR10, FFHQ, chest X-ray images, market stock data, ShapeNet. We demonstrate
that the MTop-Divergence accurately detects various degrees of mode-dropping,
intra-mode collapse, mode invention, and image disturbance. Our algorithm
scales well (essentially linearly) with the increase of the dimension of the
ambient high-dimensional space. It is one of the first TDA-based practical
methodologies that can be applied universally to datasets of different sizes
and dimensions, including the ones on which the most recent GANs in the visual
domain are trained. The proposed method is domain-agnostic and does not rely on
pre-trained networks.
Fast Computation of Zigzag Persistence
Zigzag persistence is a powerful extension of standard persistence which allows deletions of simplices in addition to insertions. However, computing zigzag persistence usually takes considerably more time than standard persistence. We propose an algorithm called FastZigzag which narrows this efficiency gap. Our main result is that an input simplex-wise zigzag filtration can be converted to a cell-wise non-zigzag filtration of a Δ-complex with the same length, where the cells are copies of the input simplices. This conversion step in FastZigzag incurs very little cost. Furthermore, the barcode of the original filtration can be easily read from the barcode of the new cell-wise filtration because the conversion embodies a series of diamond switches known in topological data analysis. This seemingly simple observation opens up vast possibilities for improving the computation of zigzag persistence, because any efficient algorithm or software for standard persistence can now be applied to computing zigzag persistence. Our experiments show that this indeed achieves substantial performance gains over the existing state-of-the-art software.
Learning Topology-Preserving Data Representations
We propose a method for learning topology-preserving data representations
(dimensionality reduction). The method aims to provide topological similarity
between the data manifold and its latent representation via enforcing the
similarity in topological features (clusters, loops, 2D voids, etc.) and their
localization. The core of the method is the minimization of the Representation
Topology Divergence (RTD) between original high-dimensional data and
low-dimensional representation in latent space. RTD minimization provides
closeness in topological features with strong theoretical guarantees. We
develop a scheme for RTD differentiation and apply it as a loss term for the
autoencoder. The proposed method "RTD-AE" better preserves the global structure
and topology of the data manifold than state-of-the-art competitors as measured
by linear correlation, triplet distance ranking accuracy, and Wasserstein
distance between persistence barcodes.
Efficient two-parameter persistence computation via cohomology
Clearing is a simple but effective optimization for the standard algorithm of
persistent homology (PH), which dramatically improves the speed and scalability
of PH computations for Vietoris--Rips filtrations. Due to the rapid growth of
the boundary matrices of a Vietoris--Rips filtration with increasing dimension,
clearing is only effective when used in conjunction with a dual (cohomological)
variant of the standard algorithm. This approach has not previously been
applied successfully to the computation of two-parameter PH.
We introduce a cohomological algorithm for computing minimal free resolutions
of two-parameter PH that allows for clearing. To derive our algorithm, we
extend the duality principles which underlie the one-parameter approach to the
two-parameter setting. We provide an implementation and report experimental run
times for function-Rips filtrations. Our method is faster than the current
state-of-the-art by a factor of up to 20.
Comment: This is an extended version of a conference paper that appeared at SoCG 2023, see https://drops.dagstuhl.de/opus/volltexte/2023/1786
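In the one-parameter setting, the clearing optimization referenced above is easy to state: when reduction is run from the highest dimension downward, any column whose index has already appeared as a pivot row can be zeroed out without doing any reduction work. A minimal Z/2 sketch over a toy filtration follows (the column layout and names are our own illustration of classic one-parameter clearing, not the paper's two-parameter cohomological algorithm):

```python
def reduce_with_clearing(boundary, dims):
    """Z/2 column reduction with the clearing ("twist") optimization.

    boundary: list of sets; boundary[j] holds the row indices of column j.
    dims:     dims[j] is the dimension of cell j (columns in filtration order).
    Columns of dimension d are reduced before those of dimension d-1, so a
    column whose index already appears as a pivot row is cleared unreduced.
    Returns {pivot row i: column j}, i.e. the persistence pairs (i, j).
    """
    cols = [set(c) for c in boundary]
    pivot_of = {}                          # pivot row -> owning column
    for d in range(max(dims), 0, -1):      # highest dimension first
        for j in range(len(cols)):
            if dims[j] != d:
                continue
            if j in pivot_of:              # clearing: j is already paired
                cols[j] = set()
                continue
            while cols[j]:
                low = max(cols[j])
                if low not in pivot_of:
                    pivot_of[low] = j      # new pivot found
                    break
                cols[j] ^= cols[pivot_of[low]]  # add owning column (mod 2)
    return pivot_of

# Toy filtration of a filled triangle: cells 0-2 vertices, 3-5 edges, 6 the face.
boundary = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
dims = [0, 0, 0, 1, 1, 1, 2]
print(reduce_with_clearing(boundary, dims))  # {5: 6, 1: 3, 2: 4}
```

Here edge 5 is cleared outright because it is the pivot of the triangle's column; the paper's contribution is extending exactly this kind of saving to minimal free resolutions in the two-parameter setting.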
ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery
In computer-aided drug discovery (CADD), virtual screening (VS) is used for
identifying the drug candidates that are most likely to bind to a molecular
target in a large library of compounds. Most VS methods to date have focused on
using canonical compound representations (e.g., SMILES strings, Morgan
fingerprints) or generating alternative fingerprints of the compounds by
training progressively more complex variational autoencoders (VAEs) and graph
neural networks (GNNs). Although VAEs and GNNs led to significant improvements
in VS performance, these methods scale poorly to large virtual compound
datasets, and their performance has shown only incremental improvements in
recent years. To address this problem,
we developed a novel method using multiparameter persistence (MP) homology that
produces topological fingerprints of the compounds as multidimensional vectors.
Our primary contribution is framing the VS process as a new topology-based
graph ranking problem by partitioning a compound into chemical substructures
informed by the periodic properties of its atoms and extracting their
persistent homology features at multiple resolution levels. We show that the
margin loss fine-tuning of pretrained Triplet networks attains highly
competitive results in differentiating between compounds in the embedding space
and ranking their likelihood of becoming effective drug candidates. We further
establish theoretical guarantees for the stability properties of our proposed
MP signatures, and demonstrate that our models, enhanced by the MP signatures,
outperform state-of-the-art methods on benchmark datasets by a wide and highly
statistically significant margin (e.g., 93% gain for Cleves-Jain and 54% gain
for the DUD-E Diverse dataset).
Comment: NeurIPS, 2022 (36th Conference on Neural Information Processing Systems).
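The margin loss used for fine-tuning the Triplet networks above is the standard triplet objective: pull an anchor toward a positive example and push it from a negative one. A minimal sketch in plain Python (the toy fingerprint vectors and margin are invented, not the paper's setup):

```python
import math

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin),
    with Euclidean distance d over plain Python vectors."""
    d = lambda u, v: math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
    return max(0.0, d(anchor, positive) - d(anchor, negative) + margin)

# Toy fingerprints: the positive is much closer than the negative, so the
# loss vanishes; swap them and the loss becomes positive.
a, p, n = [0.0, 0.0], [0.1, 0.0], [3.0, 4.0]
print(triplet_margin_loss(a, p, n))  # 0.0: d(a,p)=0.1, d(a,n)=5.0, margin=1.0
```

In the paper's pipeline the vectors would be the multiparameter-persistence fingerprints of compounds, with positives drawn from actives for the same target.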
Improving neural networks using topological data analysis
Generalisation measures are metrics that indicate how well a neural network will perform in the presence of unseen data. Generalisation measures that are differentiable with respect to the parameters of a neural network and use only the training set are candidates for loss regularisation terms that improve neural network training. Recently, persistent homology has been used to build robust generalisation measures of this kind by means of persistence diagrams. However, some of these measures involve non-standard distances, and thus the usual stability and differentiability results are not valid. In this thesis, we prove more general stability and differentiability results that fit the conditions required by these topological measures. We also define a new measure, called topological redundancy, which we use together with one of the previous topological terms to improve network accuracy relative to ordinary training without topological regularisation terms.
Memory Clustering Using Persistent Homology for Multimodality- and Discontinuity-Sensitive Learning of Optimal Control Warm-Starts
Shooting methods are an efficient approach to solving nonlinear optimal
control problems. As they use local optimization, they exhibit favorable
convergence when initialized with a good warm-start but may not converge at all
if provided with a poor initial guess. Recent work has focused on providing an
initial guess from a learned model trained on samples generated during an
offline exploration of the problem space. However, in practice the solutions
contain discontinuities introduced by system dynamics or the environment.
Additionally, in many cases multiple equally suitable, i.e., multi-modal,
solutions exist for the same problem. Classic learning approaches smooth across
the boundary of these discontinuities and thus generalize poorly. In this work,
we apply tools from algebraic topology to extract information on the underlying
structure of the solution space. In particular, we introduce a method based on
persistent homology to automatically cluster the dataset of precomputed
solutions to obtain different candidate initial guesses. We then train a
Mixture-of-Experts within each cluster to predict state and control
trajectories to warm-start the optimal control solver and provide a comparison
with modality-agnostic learning. We demonstrate our method on a cart-pole toy
problem and a quadrotor avoiding obstacles, and show that clustering samples
based on inherent structure improves the warm-start quality.
Comment: 12 pages, 10 figures, accepted as a regular paper in IEEE Transactions on Robotics (T-RO). Supplementary video: https://youtu.be/lUULTWCFxY8 Code: https://github.com/wxmerkt/topological_memory_clustering The first two authors contributed equally.
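The clustering step described above can be imitated with 0-dimensional persistence: single-linkage merge distances are exactly the finite 0-dimensional bars, and cutting at the largest gap between consecutive merge distances separates the candidate clusters. A self-contained sketch (the gap heuristic and toy data are ours, not the paper's implementation):

```python
from itertools import combinations

def persistence_clusters(points):
    """Cluster by cutting single-linkage merges at the largest persistence gap."""
    n = len(points)
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))

    def merge_below(cutoff):
        """Union-find pass over edges with weight <= cutoff."""
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        merges = []
        for w, i, j in edges:
            if w > cutoff:
                break
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri
                merges.append(w)       # record the merge (death) distance
        return merges, [find(i) for i in range(n)]

    # Merge distances of the full tree = finite 0-dimensional bars.
    merges, _ = merge_below(float("inf"))
    gaps = [(merges[k + 1] - merges[k], k) for k in range(len(merges) - 1)]
    if not gaps:
        return [0] * n                 # a single cluster (or a single point)
    _, k = max(gaps)                   # cut at the largest gap
    _, roots = merge_below(merges[k])
    labels = {r: c for c, r in enumerate(dict.fromkeys(roots))}
    return [labels[r] for r in roots]

# Toy "solution dataset": two well-separated groups of trajectories in 2D.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(persistence_clusters(pts))  # [0, 0, 1, 1]
```

In the paper's setting the points would be precomputed solution trajectories under a suitable metric, with one Mixture-of-Experts trained per resulting cluster.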
Acceptability Judgements via Examining the Topology of Attention Maps
The role of the attention mechanism in encoding linguistic knowledge has
received special interest in NLP. However, the ability of the attention heads
to judge the grammatical acceptability of a sentence has been underexplored.
This paper approaches the paradigm of acceptability judgments with topological
data analysis (TDA), showing that the geometric properties of the attention
graph can be efficiently exploited for two standard practices in linguistics:
binary judgments and linguistic minimal pairs. Topological features enhance the
BERT-based acceptability classifier scores by %-% on CoLA in three
languages (English, Italian, and Swedish). By revealing the topological
discrepancy between attention maps of minimal pairs, we achieve human-level
performance on the BLiMP benchmark, outperforming nine statistical and
Transformer LM baselines. At the same time, TDA provides the foundation for
analyzing the linguistic functions of attention heads and interpreting the
correspondence between the graph features and grammatical phenomena.
Comment: Accepted to EMNLP 2022 Findings
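A common way to turn an attention map into TDA-ready features is to threshold it into an undirected graph and read off simple topological statistics, such as the number of edges and the number of connected components (the 0-dimensional Betti number). A toy sketch with an invented attention matrix (the threshold and feature choices are illustrative, not the paper's exact pipeline):

```python
def attention_graph_features(attn, threshold=0.1):
    """Threshold a symmetrized attention matrix into a graph and
    return (number of edges, number of connected components)."""
    n = len(attn)
    # Symmetrize and threshold: keep edge (i, j) if either direction is strong.
    adj = [[max(attn[i][j], attn[j][i]) >= threshold and i != j
            for j in range(n)] for i in range(n)]
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    seen, components = set(), 0
    for s in range(n):
        if s in seen:
            continue
        components += 1
        stack = [s]                        # DFS over one component
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            stack.extend(u for u in range(n) if adj[v][u] and u not in seen)
    return edges, components

# Invented 4-token attention map: tokens 0-1 and 2-3 attend to each other.
A = [[0.9, 0.3, 0.0, 0.0],
     [0.3, 0.9, 0.0, 0.0],
     [0.0, 0.0, 0.9, 0.4],
     [0.0, 0.0, 0.4, 0.9]]
print(attention_graph_features(A))  # (2, 2): two edges, two components
```

Features like these, computed per attention head and per threshold, are the kind of graph statistics that can be fed to an acceptability classifier alongside the BERT-based scores.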