On the complexity of the representation of simplicial complexes by trees
In this paper, we investigate the problem of representing simplicial complexes by trees. We introduce and analyze local and global tree representations. We prove that the global tree representation is more efficient in terms of time complexity for searching for a given simplex, and we show that the local tree representation is more efficient in terms of the size of the structure. The simplicial complexes are modeled by hypergraphs. We then prove that the associated combinatorial optimization problems are very difficult to solve and to approximate, even if the set of maximal simplices induces a planar graph of maximum degree at most three or a bounded-degree hypergraph. However, we give polynomial-time algorithms that compute constant-factor approximations and optimal solutions for some classes of instances.
An Efficient Representation for Filtrations of Simplicial Complexes
A filtration over a simplicial complex $K$ is an ordering of the simplices of
$K$ such that all prefixes in the ordering are subcomplexes of $K$. Filtrations
are at the core of Persistent Homology, a major tool in Topological Data
Analysis. In order to represent the filtration of a simplicial complex, the
entire filtration can be appended to any data structure that explicitly stores
all the simplices of the complex such as the Hasse diagram or the recently
introduced Simplex Tree [Algorithmica '14]. However, with the popularity of
various computational methods that need to handle simplicial complexes, and
with the rapidly increasing size of the complexes, the task of finding a
compact data structure that can still support efficient queries is of great
interest.
In this paper, we propose a new data structure called the Critical Simplex
Diagram (CSD) which is a variant of the Simplex Array List (SAL) [Algorithmica
'17]. Our data structure allows one to store in a compact way the filtration of
a simplicial complex, and allows for the efficient implementation of a large
range of basic operations. Moreover, we prove that our data structure is
essentially optimal with respect to the requisite storage space. Finally, we
show that the CSD representation admits fast construction algorithms for Flag
complexes and relaxed Delaunay complexes.
Comment: A preliminary version appeared in SODA 201
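The definition of a filtration given in this abstract can be illustrated with a small check. The following is a minimal sketch, not the CSD data structure from the paper: the function name `is_filtration` and the representation of simplices as tuples of vertex labels are assumptions made for illustration. It relies on the fact that a prefix is closed under taking faces as soon as every codimension-1 face of each simplex appears earlier in the ordering.

```python
from itertools import combinations


def is_filtration(ordered_simplices):
    """Return True if every prefix of the ordering is a simplicial complex.

    ordered_simplices: iterable of simplices, each a tuple of vertex labels
    (hypothetical input format chosen for this sketch).
    """
    seen = set()
    for simplex in ordered_simplices:
        s = frozenset(simplex)
        # All codimension-1 faces must already be present; by induction this
        # guarantees that all lower-dimensional faces are present as well.
        for face in combinations(sorted(s), len(s) - 1):
            if face and frozenset(face) not in seen:
                return False
        seen.add(s)
    return True


# A triangle built vertex by vertex, then edge by edge, is a valid filtration.
triangle = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 2), (0, 1, 2)]
print(is_filtration(triangle))
# An edge inserted before its endpoints is not.
print(is_filtration([(0, 1), (0,), (1,)]))
```

This check stores every simplex explicitly, which is exactly the cost that compact representations such as the one proposed here aim to avoid.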
Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening
This work introduces a number of algebraic topology approaches, such as
multicomponent persistent homology, multi-level persistent homology and
electrostatic persistence for the representation, characterization, and
description of small molecules and biomolecular complexes. Multicomponent
persistent homology retains critical chemical and biological information during
the topological simplification of biomolecular geometric complexity.
Multi-level persistent homology enables a tailored topological description of
inter- and/or intra-molecular interactions of interest. Electrostatic
persistence incorporates partial charge information into topological
invariants. These topological methods are paired with Wasserstein distance to
characterize similarities between molecules and are further integrated with a
variety of machine learning algorithms, including k-nearest neighbors, ensemble
of trees, and deep convolutional neural networks, to manifest their descriptive
and predictive powers for chemical and biological problems. Extensive numerical
experiments involving more than 4,000 protein-ligand complexes from the PDBBind
database and nearly 100,000 ligands and decoys in the DUD database are performed
to test respectively the scoring power and the virtual screening power of the
proposed topological approaches. It is demonstrated that the present approaches
outperform the modern machine learning based methods in protein-ligand binding
affinity predictions and ligand-decoy discrimination.
Avoiding the Global Sort: A Faster Contour Tree Algorithm
We revisit the classical problem of computing the contour tree of a
scalar field $f: \mathbb{M} \to \mathbb{R}$, where $\mathbb{M}$ is a
triangulated simplicial mesh in $\mathbb{R}^d$. The contour tree is a
fundamental topological structure that tracks the evolution of level sets of
$f$ and has numerous applications in data analysis and visualization.
All existing algorithms begin with a global sort of at least all critical
values of $f$, which can require (roughly) $\Omega(n \log n)$ time. Existing
lower bounds show that there are pathological instances where this sort is
required. We present the first algorithm whose time complexity depends on the
contour tree structure, and avoids the global sort for non-pathological inputs.
If $C$ denotes the set of critical points in $\mathbb{M}$, the running time is
roughly $O(\sum_{v \in C} \log \ell_v)$, where $\ell_v$ is the depth of $v$ in
the contour tree. This matches all existing upper bounds, but is a significant
improvement when the contour tree is short and fat. Specifically, our approach
ensures that any comparison made is between nodes in the same descending path
in the contour tree, allowing us to argue strong optimality properties of our
algorithm.
Our algorithm requires several novel ideas: partitioning the mesh $\mathbb{M}$ into
well-behaved portions, a local growing procedure to iteratively build contour
trees, and the use of heavy path decompositions for the time complexity
analysis.
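The abstract mentions heavy path decompositions as a tool for the time complexity analysis. As a reminder of what that construction is, here is a minimal sketch of the textbook heavy path decomposition of a rooted tree. It is not the contour tree algorithm from the paper, and the function name and the `children` dictionary format are assumptions made for illustration.

```python
def heavy_path_decomposition(children, root):
    """Split a rooted tree into heavy paths.

    children: dict mapping a node to the list of its children
    (hypothetical input format chosen for this sketch).
    Returns a list of paths, each listed from its topmost node downward.
    """
    # Visit nodes in pre-order, then compute subtree sizes in reverse so
    # that every child is processed before its parent.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))
    size = {}
    for node in reversed(order):
        size[node] = 1 + sum(size[c] for c in children.get(node, []))

    paths, heads = [], [root]
    while heads:
        node = heads.pop()
        path = []
        while node is not None:
            path.append(node)
            kids = children.get(node, [])
            if not kids:
                break
            # Follow the child with the largest subtree; every other child
            # starts a new path of its own.
            heavy = max(kids, key=lambda c: size[c])
            heads.extend(c for c in kids if c != heavy)
            node = heavy
        paths.append(path)
    return paths


tree = {0: [1, 2], 1: [3, 4], 3: [5]}
print(heavy_path_decomposition(tree, 0))
```

The useful property is that any root-to-leaf walk crosses only logarithmically many distinct heavy paths, which is why the decomposition appears in depth-sensitive running time arguments.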
TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions
Although deep learning approaches have had tremendous success in image, video
and audio processing, computer vision, and speech recognition, their
applications to three-dimensional (3D) biomolecular structural data sets have
been hindered by the entangled geometric complexity and biological complexity.
We introduce topology, i.e., element specific persistent homology (ESPH), to
untangle geometric complexity and biological complexity. ESPH represents 3D
complex geometry by one-dimensional (1D) topological invariants and retains
crucial biological information via a multichannel image representation. It is
able to reveal hidden structure-function relationships in biomolecules. We
further integrate ESPH and convolutional neural networks to construct a
multichannel topological neural network (TopologyNet) for the predictions of
protein-ligand binding affinities and protein stability changes upon mutation.
To overcome the limitations to deep learning arising from small and noisy
training sets, we present a multitask topological convolutional neural network
(MT-TCNN). We demonstrate that the present TopologyNet architectures outperform
other state-of-the-art methods in the predictions of protein-ligand binding
affinities, globular protein mutation impacts, and membrane protein mutation
impacts.
Comment: 20 pages, 8 figures, 5 tables
Building Efficient and Compact Data Structures for Simplicial Complexes
The Simplex Tree (ST) is a recently introduced data structure that can
represent abstract simplicial complexes of any dimension and allows efficient
implementation of a large range of basic operations on simplicial complexes. In
this paper, we show how to optimally compress the Simplex Tree while retaining
its functionalities. In addition, we propose two new data structures called the
Maximal Simplex Tree (MxST) and the Simplex Array List (SAL). We analyze the
compressed Simplex Tree, the Maximal Simplex Tree, and the Simplex Array List
under various settings.
Comment: An extended abstract appeared in the proceedings of SoCG 201
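For readers unfamiliar with the Simplex Tree, the underlying idea is a trie in which each simplex corresponds to the path spelled by its sorted vertex labels. The following is a minimal, uncompressed sketch of that idea only. It does not implement the compression, the Maximal Simplex Tree, or the Simplex Array List from this paper, and the class and method names are hypothetical.

```python
from itertools import combinations


class SimplexTrie:
    """Toy trie of simplices: one node per simplex, keyed by sorted vertices."""

    def __init__(self):
        self.root = {}

    def insert(self, simplex):
        """Insert a simplex together with all of its faces."""
        vertices = sorted(set(simplex))
        for k in range(1, len(vertices) + 1):
            for face in combinations(vertices, k):
                node = self.root
                for v in face:
                    node = node.setdefault(v, {})

    def contains(self, simplex):
        """Membership test: walk the path of sorted vertex labels."""
        node = self.root
        for v in sorted(set(simplex)):
            if v not in node:
                return False
            node = node[v]
        return True

    def num_simplices(self):
        """Count nodes, which equals the number of stored simplices."""
        def count(node):
            return sum(1 + count(child) for child in node.values())
        return count(self.root)


trie = SimplexTrie()
trie.insert((0, 1, 2))   # a triangle and all of its faces
trie.insert((2, 3))      # an extra edge
print(trie.contains((2, 0)), trie.contains((0, 3)), trie.num_simplices())
```

Because every simplex gets its own node, the size of this structure grows with the total number of simplices, which motivates the compressed variants discussed in the abstract.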