    On the complexity of the representation of simplicial complexes by trees

    In this paper, we investigate the problem of representing simplicial complexes by trees. We introduce and analyze local and global tree representations. We prove that the global tree representation is more efficient in terms of time complexity for searching a given simplex, and we show that the local tree representation is more efficient in terms of the size of the structure. The simplicial complexes are modeled by hypergraphs. We then prove that the associated combinatorial optimization problems are very difficult to solve and to approximate, even if the set of maximal simplices induces a planar graph of maximum degree at most three or a bounded-degree hypergraph. However, we give polynomial-time algorithms that compute constant-factor approximations and optimal solutions for some classes of instances.

    An Efficient Representation for Filtrations of Simplicial Complexes

    A filtration over a simplicial complex $K$ is an ordering of the simplices of $K$ such that all prefixes in the ordering are subcomplexes of $K$. Filtrations are at the core of Persistent Homology, a major tool in Topological Data Analysis. In order to represent the filtration of a simplicial complex, the entire filtration can be appended to any data structure that explicitly stores all the simplices of the complex such as the Hasse diagram or the recently introduced Simplex Tree [Algorithmica '14]. However, with the popularity of various computational methods that need to handle simplicial complexes, and with the rapidly increasing size of the complexes, the task of finding a compact data structure that can still support efficient queries is of great interest. In this paper, we propose a new data structure called the Critical Simplex Diagram (CSD) which is a variant of the Simplex Array List (SAL) [Algorithmica '17]. Our data structure allows one to store in a compact way the filtration of a simplicial complex, and allows for the efficient implementation of a large range of basic operations. Moreover, we prove that our data structure is essentially optimal with respect to the requisite storage space. Finally, we show that the CSD representation admits fast construction algorithms for Flag complexes and relaxed Delaunay complexes. Comment: A preliminary version appeared in SODA 201
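The definition of a filtration given in this abstract can be checked mechanically. The following is a minimal sketch, not the paper's data structure: the function name `is_filtration` and the representation of simplices as tuples of vertex labels are assumptions made for illustration. It relies on the fact that every prefix is a subcomplex exactly when each simplex appears after all of its codimension-one faces.

```python
from itertools import combinations


def is_filtration(ordering):
    """Return True if every prefix of `ordering` is a simplicial complex.

    `ordering` is a sequence of simplices, each given as an iterable of
    vertex labels (a hypothetical representation chosen for this sketch).
    """
    seen = set()
    for simplex in ordering:
        s = frozenset(simplex)
        # Reject empty simplices and repeated simplices.
        if not s or s in seen:
            return False
        # Each codimension-one face must already be present; by induction
        # this guarantees that all lower-dimensional faces are present too.
        for face in combinations(sorted(s), len(s) - 1):
            if face and frozenset(face) not in seen:
                return False
        seen.add(s)
    return True


# A filled triangle, built vertex by vertex, then edges, then the 2-simplex.
valid = [(0,), (1,), (0, 1), (2,), (1, 2), (0, 2), (0, 1, 2)]
# The edge (0, 1) appears before its vertex 1, so one prefix is not a complex.
invalid = [(0,), (0, 1), (1,)]
```

Here `is_filtration(valid)` holds and `is_filtration(invalid)` does not, because the prefix `[(0,), (0, 1)]` contains an edge without one of its endpoints.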

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to demonstrate their descriptive and predictive power for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform the modern machine learning based methods in protein-ligand binding affinity predictions and ligand-decoy discrimination.

    Avoiding the Global Sort: A Faster Contour Tree Algorithm

    We revisit the classical problem of computing the contour tree of a scalar field $f:\mathbb{M} \to \mathbb{R}$, where $\mathbb{M}$ is a triangulated simplicial mesh in $\mathbb{R}^d$. The contour tree is a fundamental topological structure that tracks the evolution of level sets of $f$ and has numerous applications in data analysis and visualization. All existing algorithms begin with a global sort of at least all critical values of $f$, which can require (roughly) $\Omega(n\log n)$ time. Existing lower bounds show that there are pathological instances where this sort is required. We present the first algorithm whose time complexity depends on the contour tree structure, and avoids the global sort for non-pathological inputs. If $C$ denotes the set of critical points in $\mathbb{M}$, the running time is roughly $O(\sum_{v \in C} \log \ell_v)$, where $\ell_v$ is the depth of $v$ in the contour tree. This matches all existing upper bounds, but is a significant improvement when the contour tree is short and fat. Specifically, our approach ensures that any comparison made is between nodes in the same descending path in the contour tree, allowing us to argue strong optimality properties of our algorithm. Our algorithm requires several novel ideas: partitioning $\mathbb{M}$ in well-behaved portions, a local growing procedure to iteratively build contour trees, and the use of heavy path decompositions for the time complexity analysis.

    TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

    Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the entangled geometric complexity and biological complexity. We introduce topology, i.e., element specific persistent homology (ESPH), to untangle geometric complexity and biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations to deep learning arising from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the predictions of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts. Comment: 20 pages, 8 figures, 5 table

    Building Efficient and Compact Data Structures for Simplicial Complexes

    The Simplex Tree (ST) is a recently introduced data structure that can represent abstract simplicial complexes of any dimension and allows efficient implementation of a large range of basic operations on simplicial complexes. In this paper, we show how to optimally compress the Simplex Tree while retaining its functionalities. In addition, we propose two new data structures called the Maximal Simplex Tree (MxST) and the Simplex Array List (SAL). We analyze the compressed Simplex Tree, the Maximal Simplex Tree, and the Simplex Array List under various settings. Comment: An extended abstract appeared in the proceedings of SoCG 201
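To make the uncompressed structure concrete, here is a minimal sketch of a Simplex Tree as a trie in which each simplex is stored as the root-to-node path of its sorted vertex labels. This is an illustration only, not the authors' implementation and not the compressed variants (MxST, SAL) proposed in the paper; the class name, method names, and the nested-dictionary layout are all assumptions.

```python
class SimplexTree:
    """Toy Simplex Tree: a trie keyed by sorted vertex labels.

    Each node is a dict mapping a vertex label to a child node; the path
    from the root to a node spells out one simplex of the complex.
    """

    def __init__(self):
        self.root = {}

    def insert(self, simplex):
        """Insert `simplex` together with all of its faces."""
        self._insert(self.root, sorted(set(simplex)))

    def _insert(self, node, vertices):
        # Every subsequence of the sorted vertices becomes a path, so the
        # stored set of simplices stays closed under taking faces.
        for i, v in enumerate(vertices):
            child = node.setdefault(v, {})
            self._insert(child, vertices[i + 1:])

    def contains(self, simplex):
        """Membership test: walk one path, one step per vertex."""
        node = self.root
        for v in sorted(set(simplex)):
            if v not in node:
                return False
            node = node[v]
        return True

    def num_simplices(self):
        """Count the nodes of the trie, excluding the root."""
        def count(node):
            return sum(1 + count(child) for child in node.values())
        return count(self.root)


st = SimplexTree()
st.insert((0, 1, 2))  # filled triangle: 3 vertices, 3 edges, 1 triangle
st.insert((2, 3))     # adds vertex 3 and edge (2, 3); vertex 2 already exists
```

After these two insertions the trie holds nine simplices, `st.contains((0, 2))` is true, and `st.contains((1, 3))` is false. Because every simplex occupies its own node, the structure grows with the total number of simplices, which is the cost that motivates the compression studied in the paper.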