Reduction Algorithms for Persistence Diagrams of Networks: CoralTDA and PrunIT
Topological data analysis (TDA) delivers invaluable and complementary
information on the intrinsic properties of data inaccessible to conventional
methods. However, high computational costs remain the primary roadblock
hindering the successful application of TDA in real-world studies, particularly
with machine learning on large complex networks.
Indeed, most modern networks such as citation, blockchain, and online social
networks often have hundreds of thousands of vertices, making the application
of existing TDA methods infeasible. We develop two new, remarkably simple but
effective algorithms to compute the exact persistence diagrams of large graphs
to address this major TDA limitation. First, we prove that the (k+1)-core of a
graph suffices to compute its k-th persistence diagram.
Second, we introduce a pruning algorithm for graphs to
compute their persistence diagrams by removing the dominated vertices. Our
experiments on large networks show that our novel approach can achieve
computational gains up to 95%.
The developed framework provides the first bridge between graph theory
and TDA, with applications in machine learning on large complex networks. Our
implementation is available at
https://github.com/cakcora/PersistentHomologyWithCoralPrunit
Comment: Spotlight paper at NeurIPS 202
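The two reduction ideas can be illustrated in a few lines. The sketch below is hypothetical and not the authors' implementation: `k_core` iteratively deletes vertices of degree below k, and `prune_dominated` removes vertices whose closed neighbourhood is contained in a neighbour's (a dominated vertex, as in domination-based pruning). The real algorithms additionally handle filtration values; the adjacency-dict representation and function names here are assumptions.

```python
def k_core(adj, k):
    """Iteratively delete vertices of degree < k; return the surviving
    vertex set (the k-core). adj maps each vertex to a set of neighbours."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # defensive copy
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) < k:
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True
    return set(adj)

def prune_dominated(adj):
    """Repeatedly remove vertices whose closed neighbourhood N[v] is
    contained in N[u] for some neighbour u (v is dominated by u)."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # defensive copy
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            closed_v = adj[v] | {v}
            for u in adj[v]:
                if closed_v <= (adj[u] | {u}):
                    for w in adj[v]:          # v is dominated: delete it
                        adj[w].discard(v)
                    del adj[v]
                    changed = True
                    break
    return set(adj)
```

On a triangle with one pendant vertex, the 2-core drops the pendant, and pruning collapses the whole (contractible) clique down to a single vertex.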
ATOL: Measure Vectorisation for Automatic Topologically-Oriented Learning
Robust topological information commonly comes in the form of a set of persistence diagrams, finite measures that are by nature difficult to affix to generic machine learning frameworks. We introduce a learnt, unsupervised measure vectorisation method and use it to reflect underlying changes in topological behaviour in machine learning contexts. Relying on optimal measure quantisation results, the method is tailored to efficiently discriminate important plane regions where meaningful differences arise. We showcase the strength and robustness of our approach on a number of applications, from emulous and modern graph collections, where the method reaches state-of-the-art performance, to a geometric synthetic dynamical orbits problem. The proposed methodology comes with only high-level tuning parameters, such as the total measure encoding budget, and we provide completely open-access software.
Adaptive Topological Feature via Persistent Homology: Filtration Learning for Point Clouds
Machine learning for point clouds has been attracting much attention, with
many applications in various fields, such as shape recognition and material
science. To enhance the accuracy of such machine learning methods, it is known
to be effective to incorporate global topological features, which are typically
extracted by persistent homology. In the calculation of persistent homology for
a point cloud, we need to choose a filtration for the point cloud, that is, an
increasing sequence of spaces. Because the performance of machine learning
methods combined with persistent homology is highly affected by the choice of a
filtration, we need to tune it depending on data and tasks. In this paper, we
propose a framework that learns a filtration adaptively with the use of neural
networks. In order to make the resulting persistent homology
isometry-invariant, we develop a neural network architecture with such
invariance. Additionally, we theoretically show a finite-dimensional
approximation result that justifies our architecture. Experimental results
demonstrate the efficacy of our framework in several classification tasks.
Comment: 17 pages with 4 figures
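To make the notion of a filtration concrete: the standard (non-learned) Vietoris-Rips filtration grows balls around each point, and its 0-dimensional persistence pairs are exactly the merge events of Kruskal's minimum-spanning-tree algorithm. The sketch below illustrates that baseline only; the paper instead learns the filtration with a neural network, and the function name here is illustrative.

```python
import math
from itertools import combinations

def h0_persistence(points):
    """0-dimensional persistence pairs of the Vietoris-Rips filtration:
    every point is born at 0, and two components merge (the younger dies)
    at the length of each minimum-spanning-tree edge (Kruskal's order).
    The one essential, never-dying component is omitted."""
    n = len(points)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                         # a merge event: record the death
            parent[ri] = rj
            deaths.append(d)
    return [(0.0, d) for d in deaths]
```

For three collinear points at distances 1 and 2 apart, the finite bars are (0, 1) and (0, 2).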
On the Expressivity of Persistent Homology in Graph Learning
Persistent homology, a technique from computational topology, has recently
shown strong empirical performance in the context of graph classification.
Being able to capture long-range graph properties via higher-order topological
features, such as cycles of arbitrary length, in combination with multi-scale
topological descriptors, has improved predictive performance for data sets with
prominent topological structures, such as molecules. At the same time, the
theoretical properties of persistent homology have not been formally assessed
in this context. This paper intends to bridge the gap between computational
topology and graph machine learning by providing a brief introduction to
persistent homology in the context of graphs, as well as a theoretical
discussion and empirical analysis of its expressivity for graph learning tasks.
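A concrete instance of the filter functions such an analysis studies: given any vertex function, the elder rule computes the graph's 0-dimensional persistence diagram under the induced sublevel filtration. This sketch is illustrative rather than taken from the paper.

```python
def graph_h0(vertices, edges, f):
    """0-dim persistence diagram of the sublevel filtration induced by a
    vertex function f: vertex v appears at f[v], edge (u, v) at
    max(f[u], f[v]). Elder rule: at each merge the younger component
    dies; zero-length bars may appear and are kept."""
    parent = {v: v for v in vertices}
    birth = {v: f[v] for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    bars = []
    for u, v in sorted(edges, key=lambda e: max(f[e[0]], f[e[1]])):
        t = max(f[u], f[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                        # edge closes a cycle: no 0-dim event
        if birth[ru] > birth[rv]:
            ru, rv = rv, ru                 # keep the older root as survivor
        bars.append((birth[rv], t))         # younger component dies at t
        parent[rv] = ru
    # one essential (never-dying) bar per connected component
    bars += [(birth[r], float("inf")) for r in {find(v) for v in vertices}]
    return bars
```

On a path a-b-c with f = {a: 0, b: 2, c: 1}, the diagram contains the bars (2, 2), (1, 2), and one essential bar (0, inf).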
Topological Graph Neural Networks
Graph neural networks (GNNs) are a powerful architecture for tackling graph
learning tasks, yet have been shown to be oblivious to prominent substructures,
such as cycles. We present TOGL, a novel layer that incorporates global
topological information of a graph using persistent homology. TOGL can be
easily integrated into any type of GNN and is strictly more expressive in terms
of the Weisfeiler--Lehman test of isomorphism. Augmenting GNNs with our layer
leads to beneficial predictive performance for graph and node classification
tasks, both on synthetic data sets, which can be classified by humans using
their topology but not by ordinary GNNs, and on real-world data.
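The kind of global structure message passing misses can be seen with plain Betti numbers: b0 counts connected components and b1 counts independent cycles, which distinguishes, say, one hexagon from two disjoint triangles. TOGL itself uses differentiable persistent homology layers; this is only a toy stand-in with illustrative names.

```python
def betti_numbers(n_vertices, edges):
    """(b0, b1) of a graph: b0 = number of connected components, and
    b1 = |E| - |V| + b0 (the number of independent cycles, by the
    Euler characteristic of a graph)."""
    parent = list(range(n_vertices))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    b0 = len({find(v) for v in range(n_vertices)})
    b1 = len(edges) - n_vertices + b0
    return b0, b1
```

A 6-cycle gives (1, 1), while two disjoint triangles on the same six vertices give (2, 2); degree-based message passing alone sees identical local neighbourhoods in both.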
Time-Aware Knowledge Representations of Dynamic Objects with Multidimensional Persistence
Learning time-evolving objects such as multivariate time series and dynamic
networks requires the development of novel knowledge representation mechanisms
and neural network architectures, which allow for capturing implicit
time-dependent information contained in the data. Such information is typically
not directly observed but plays a key role in the learning task performance. In
turn, lack of time dimension in knowledge encoding mechanisms for
time-dependent data leads to frequent model updates, poor learning performance,
and, as a result, subpar decision-making. Here we propose a new
time-aware knowledge representation mechanism that notably focuses on implicit
time-dependent topological information along multiple geometric dimensions. In
particular, we develop an approach named Temporal MultiPersistence
(TMP), which produces multidimensional topological fingerprints of the data by
using existing single-parameter topological summaries. The main idea behind
TMP is to merge two of the newest directions in topological representation
learning: multi-persistence, which simultaneously describes data shape
evolution along multiple key parameters, and zigzag persistence, which enables us to
extract the most salient data shape information over time. We derive
theoretical guarantees for TMP vectorizations and show their utility in
forecasting applications on benchmark traffic flow, Ethereum blockchain, and
electrocardiogram datasets, demonstrating competitive performance,
especially in scenarios with limited data records. In addition, our TMP method
improves the computational efficiency of state-of-the-art multipersistence
summaries by up to 59.5 times.
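The general idea of a multidimensional fingerprint, stacking a single-parameter summary along a time axis, can be sketched as below. This is a toy illustration only: TMP itself combines zigzag persistence with full persistence diagrams, whereas here each cell just counts connected components of a thresholded snapshot, and all names and the snapshot format are assumptions.

```python
def tmp_fingerprint(snapshots, thresholds):
    """Toy 2-D fingerprint of a dynamic weighted graph: rows index time
    snapshots (n_vertices, weighted_edges), columns index filtration
    thresholds; each cell counts connected components of the subgraph
    containing only edges with weight <= threshold."""
    def n_components(n, edges):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x
        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
        return len({find(v) for v in range(n)})

    grid = []
    for n, weighted_edges in snapshots:
        row = [n_components(n, [(u, v) for u, v, w in weighted_edges if w <= t])
               for t in thresholds]
        grid.append(row)
    return grid
```

Each row is an ordinary single-parameter summary; the time axis supplies the second dimension of the fingerprint, which can then be fed to a downstream forecasting model.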
Going beyond persistent homology using persistent homology
Representational limits of message-passing graph neural networks (MP-GNNs),
e.g., in terms of the Weisfeiler-Leman (WL) test for isomorphism, are well
understood. Augmenting these graph models with topological features via
persistent homology (PH) has gained prominence, but identifying the class of
attributed graphs that PH can recognize remains open. We introduce a novel
concept of color-separating sets to provide a complete resolution to this
important problem. Specifically, we establish the necessary and sufficient
conditions for distinguishing graphs based on the persistence of their
connected components, obtained from filter functions on vertex and edge colors.
Our constructions expose the limits of vertex- and edge-level PH, proving that
neither category subsumes the other. Leveraging these theoretical insights, we
propose RePHINE for learning topological features on graphs. RePHINE
efficiently combines vertex- and edge-level PH, achieving a scheme that is
provably more powerful than both. Integrating RePHINE into MP-GNNs boosts their
expressive power, resulting in gains over standard PH on several benchmarks for
graph classification.
Comment: Accepted to NeurIPS 202
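That vertex-level and edge-level filtrations carry different 0-dimensional information can be seen on a single toy coloured graph. The sketch below is illustrative, not RePHINE itself: it runs the elder rule twice, once with edges entering at the later of their endpoints' colour values (vertex-level) and once with edges carrying their own colour values (edge-level), producing visibly different bar multisets.

```python
def h0_bars(vertices, edges, vertex_birth, edge_time):
    """Elder-rule 0-dim persistence: vertex v is born at vertex_birth[v],
    edge e enters at edge_time[e]; at each merge the younger component dies."""
    parent = {v: v for v in vertices}
    birth = dict(vertex_birth)
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    bars = []
    for e in sorted(edges, key=lambda e: edge_time[e]):
        ru, rv = find(e[0]), find(e[1])
        if ru == rv:
            continue
        if birth[ru] > birth[rv]:
            ru, rv = rv, ru                 # keep the older root
        bars.append((birth[rv], edge_time[e]))
        parent[rv] = ru
    bars += [(birth[r], float("inf")) for r in {find(v) for v in vertices}]
    return sorted(bars)

# the same toy path a-b-c, filtered two ways
V = ["a", "b", "c"]
E = [("a", "b"), ("b", "c")]
colour = {"a": 1, "b": 0, "c": 1}           # vertex colours as filter values
# vertex-level: an edge enters when its later endpoint does
vertex_level = h0_bars(V, E, colour,
                       {e: max(colour[e[0]], colour[e[1]]) for e in E})
# edge-level: all vertices born at 0, edges carry their own colour values
edge_level = h0_bars(V, E, {v: 0 for v in V},
                     {("a", "b"): 1, ("b", "c"): 2})
```

Here the vertex-level diagram is {(0, inf), (1, 1), (1, 1)} while the edge-level diagram is {(0, 1), (0, 2), (0, inf)}; RePHINE's descriptor combines both views (and more) into one provably stronger scheme.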
ATOL: Measure Vectorization for Automatic Topologically-Oriented Learning
Robust topological information commonly comes in the form of a set of persistence diagrams, finite measures that are by nature difficult to affix to generic machine learning frameworks. We introduce a fast, learnt, unsupervised vectorization method for measures in Euclidean spaces and use it to reflect underlying changes in topological behaviour in machine learning contexts. The algorithm is simple and efficiently discriminates important space regions where meaningful differences to the mean measure arise. It is proven to be able to separate clusters of persistence diagrams. We showcase the strength and robustness of our approach on a number of applications, from emulous and modern graph collections, where the method reaches state-of-the-art performance, to a geometric synthetic dynamical orbits problem. The proposed methodology comes with a single high-level tuning parameter, the total measure encoding budget, and we provide completely open-access software.
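The core of quantisation-based vectorisation can be sketched in a few lines: each diagram, viewed as a finite set of planar points, becomes a fixed-length vector of point counts per quantisation centre. The actual ATOL method learns the centres unsupervised from the mean measure and uses smooth contrast functions; here the centres are supplied by hand and the function name is illustrative.

```python
import math

def atol_vectorise(diagrams, centres):
    """Quantisation-based vectorisation: each diagram (a finite set of
    planar points) becomes a fixed-length vector whose i-th entry counts
    the diagram points nearest to centre i."""
    vecs = []
    for dgm in diagrams:
        v = [0] * len(centres)
        for p in dgm:
            i = min(range(len(centres)),
                    key=lambda c: math.dist(p, centres[c]))
            v[i] += 1
        vecs.append(v)
    return vecs
```

Because every diagram maps to a vector of the same length regardless of how many points it contains, the output can be fed directly to any standard classifier, which is precisely what makes measure vectorisation convenient for generic machine learning pipelines.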