858 research outputs found

    TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

    Full text link
    Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the entangled geometric complexity and biological complexity. We introduce topology, i.e., element specific persistent homology (ESPH), to untangle geometric complexity and biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations to deep learning arising from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the predictions of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts.Comment: 20 pages, 8 figures, 5 table

    Interpretable statistics for complex modelling: quantile and topological learning

    Get PDF
    As the complexity of our data increased exponentially in the last decades, so has our need for interpretable features. This thesis revolves around two paradigms to approach this quest for insights. In the first part we focus on parametric models, where the problem of interpretability can be seen as a “parametrization selection”. We introduce a quantile-centric parametrization and we show the advantages of our proposal in the context of regression, where it allows to bridge the gap between classical generalized linear (mixed) models and increasingly popular quantile methods. The second part of the thesis, concerned with topological learning, tackles the problem from a non-parametric perspective. As topology can be thought of as a way of characterizing data in terms of their connectivity structure, it allows to represent complex and possibly high dimensional through few features, such as the number of connected components, loops and voids. We illustrate how the emerging branch of statistics devoted to recovering topological structures in the data, Topological Data Analysis, can be exploited both for exploratory and inferential purposes with a special emphasis on kernels that preserve the topological information in the data. Finally, we show with an application how these two approaches can borrow strength from one another in the identification and description of brain activity through fMRI data from the ABIDE project

    A semidiscrete version of the Citti-Petitot-Sarti model as a plausible model for anthropomorphic image reconstruction and pattern recognition

    Full text link
    In his beautiful book [66], Jean Petitot proposes a sub-Riemannian model for the primary visual cortex of mammals. This model is neurophysiologically justified. Further developments of this theory lead to efficient algorithms for image reconstruction, based upon the consideration of an associated hypoelliptic diffusion. The sub-Riemannian model of Petitot and Citti-Sarti (or certain of its improvements) is a left-invariant structure over the group SE(2)SE(2) of rototranslations of the plane. Here, we propose a semi-discrete version of this theory, leading to a left-invariant structure over the group SE(2,N)SE(2,N), restricting to a finite number of rotations. This apparently very simple group is in fact quite atypical: it is maximally almost periodic, which leads to much simpler harmonic analysis compared to SE(2).SE(2). Based upon this semi-discrete model, we improve on previous image-reconstruction algorithms and we develop a pattern-recognition theory that leads also to very efficient algorithms in practice.Comment: 123 pages, revised versio

    Synthesis and Optimization of Reversible Circuits - A Survey

    Full text link
    Reversible logic circuits have been historically motivated by theoretical research in low-power electronics as well as practical improvement of bit-manipulation transforms in cryptography and computer graphics. Recently, reversible circuits have attracted interest as components of quantum algorithms, as well as in photonic and nano-computing technologies where some switching devices offer no signal gain. Research in generating reversible logic distinguishes between circuit synthesis, post-synthesis optimization, and technology mapping. In this survey, we review algorithmic paradigms --- search-based, cycle-based, transformation-based, and BDD-based --- as well as specific algorithms for reversible synthesis, both exact and heuristic. We conclude the survey by outlining key open challenges in synthesis of reversible and quantum logic, as well as most common misconceptions.Comment: 34 pages, 15 figures, 2 table

    Solving Algorithmic Problems in Finitely Presented Groups via Machine Learning

    Full text link
    Machine learning and pattern recognition techniques have been successfully applied to algorithmic problems in free groups. In this dissertation, we seek to extend these techniques to finitely presented non-free groups, in particular to polycyclic and metabelian groups that are of interest to non-commutative cryptography. As a prototypical example, we utilize supervised learning methods to construct classifiers that can solve the conjugacy decision problem, i.e., determine whether or not a pair of elements from a specified group are conjugate. The accuracies of classifiers created using decision trees, random forests, and N-tuple neural network models are evaluated for several non-free groups. The very high accuracy of these classifiers suggests an underlying mathematical relationship with respect to conjugacy in the tested groups. In addition to testing these techniques on several well-known finitely presented groups, we introduce a new family of metabelian groups for which we analyze the computational complexity of the conjugacy search problem. We prove that for the family in general the time complexity of the conjugacy search problem is exponential, while for a subfamily the problem is polynomial. We also show that for some of these groups the conjugacy search problem is an instance of the discrete logarithm problem. We also apply machine learning techniques to solving the conjugacy search problem. For each platform group we train a N-tuple regression network that can produce a candidate conjugator for a pair of conjugate elements. This candidate is then used as the initial state of a local search for a conjugator in the Cayley graph, in what we call regression-based conjugacy search (RBCS). RBCS can be applied to groups such as polycyclic groups for which other heuristic approaches, such as the length-based attack, are ineffective

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    Full text link
    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and near 100,000 ligands and decoys in the DUD database are performed to test respectively the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform the modern machine learning based methods in protein-ligand binding affinity predictions and ligand-decoy discrimination

    Non-acyclicity of coset lattices and generation of finite groups

    Get PDF
    • …
    corecore