13,933 research outputs found

    TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

    Full text link
    Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the entangled geometric complexity and biological complexity. We introduce topology, i.e., element specific persistent homology (ESPH), to untangle geometric complexity and biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations to deep learning arising from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the predictions of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts.Comment: 20 pages, 8 figures, 5 table

    From patterned response dependency to structured covariate dependency: categorical-pattern-matching

    Get PDF
    Data generated from a system of interest typically consists of measurements from an ensemble of subjects across multiple response and covariate features, and is naturally represented by one response-matrix against one covariate-matrix. Likely each of these two matrices simultaneously embraces heterogeneous data types: continuous, discrete and categorical. Here a matrix is used as a practical platform to ideally keep hidden dependency among/between subjects and features intact on its lattice. Response and covariate dependency is individually computed and expressed through mutliscale blocks via a newly developed computing paradigm named Data Mechanics. We propose a categorical pattern matching approach to establish causal linkages in a form of information flows from patterned response dependency to structured covariate dependency. The strength of an information flow is evaluated by applying the combinatorial information theory. This unified platform for system knowledge discovery is illustrated through five data sets. In each illustrative case, an information flow is demonstrated as an organization of discovered knowledge loci via emergent visible and readable heterogeneity. This unified approach fundamentally resolves many long standing issues, including statistical modeling, multiple response, renormalization and feature selections, in data analysis, but without involving man-made structures and distribution assumptions. The results reported here enhance the idea that linking patterns of response dependency to structures of covariate dependency is the true philosophical foundation underlying data-driven computing and learning in sciences.Comment: 32 pages, 10 figures, 3 box picture

    How Gibbs distributions may naturally arise from synaptic adaptation mechanisms. A model-based argumentation

    Get PDF
    This paper addresses two questions in the context of neuronal networks dynamics, using methods from dynamical systems theory and statistical physics: (i) How to characterize the statistical properties of sequences of action potentials ("spike trains") produced by neuronal networks ? and; (ii) what are the effects of synaptic plasticity on these statistics ? We introduce a framework in which spike trains are associated to a coding of membrane potential trajectories, and actually, constitute a symbolic coding in important explicit examples (the so-called gIF models). On this basis, we use the thermodynamic formalism from ergodic theory to show how Gibbs distributions are natural probability measures to describe the statistics of spike trains, given the empirical averages of prescribed quantities. As a second result, we show that Gibbs distributions naturally arise when considering "slow" synaptic plasticity rules where the characteristic time for synapse adaptation is quite longer than the characteristic time for neurons dynamics.Comment: 39 pages, 3 figure

    Random Recurrent Neural Networks Dynamics

    Full text link
    This paper is a review dealing with the study of large size random recurrent neural networks. The connection weights are selected according to a probability law and it is possible to predict the network dynamics at a macroscopic scale using an averaging principle. After a first introductory section, the section 1 reviews the various models from the points of view of the single neuron dynamics and of the global network dynamics. A summary of notations is presented, which is quite helpful for the sequel. In section 2, mean-field dynamics is developed. The probability distribution characterizing global dynamics is computed. In section 3, some applications of mean-field theory to the prediction of chaotic regime for Analog Formal Random Recurrent Neural Networks (AFRRNN) are displayed. The case of AFRRNN with an homogeneous population of neurons is studied in section 4. Then, a two-population model is studied in section 5. The occurrence of a cyclo-stationary chaos is displayed using the results of \cite{Dauce01}. In section 6, an insight of the application of mean-field theory to IF networks is given using the results of \cite{BrunelHakim99}.Comment: Review paper, 36 pages, 5 figure

    Unified Representation of Molecules and Crystals for Machine Learning

    Get PDF
    Accurate simulations of atomistic systems from first principles are limited by computational cost. In high-throughput settings, machine learning can potentially reduce these costs significantly by accurately interpolating between reference calculations. For this, kernel learning approaches crucially require a single Hilbert space accommodating arbitrary atomistic systems. We introduce a many-body tensor representation that is invariant to translations, rotations and nuclear permutations of same elements, unique, differentiable, can represent molecules and crystals, and is fast to compute. Empirical evidence is presented for energy prediction errors below 1 kcal/mol for 7k organic molecules and 5 meV/atom for 11k elpasolite crystals. Applicability is demonstrated for phase diagrams of Pt-group/transition-metal binary systems.Comment: Revised version, minor changes throughou
    • …
    corecore