TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions
Although deep learning approaches have had tremendous success in image, video
and audio processing, computer vision, and speech recognition, their
applications to three-dimensional (3D) biomolecular structural data sets have
been hindered by the entangled geometric complexity and biological complexity.
We introduce topology, i.e., element specific persistent homology (ESPH), to
untangle geometric complexity and biological complexity. ESPH represents 3D
complex geometry by one-dimensional (1D) topological invariants and retains
crucial biological information via a multichannel image representation. It is
able to reveal hidden structure-function relationships in biomolecules. We
further integrate ESPH and convolutional neural networks to construct a
multichannel topological neural network (TopologyNet) for the predictions of
protein-ligand binding affinities and protein stability changes upon mutation.
To overcome the limitations to deep learning arising from small and noisy
training sets, we present a multitask topological convolutional neural network
(MT-TCNN). We demonstrate that the present TopologyNet architectures outperform
other state-of-the-art methods in the predictions of protein-ligand binding
affinities, globular protein mutation impacts, and membrane protein mutation
impacts.
Comment: 20 pages, 8 figures, 5 tables
From patterned response dependency to structured covariate dependency: categorical-pattern-matching
Data generated from a system of interest typically consists of measurements
from an ensemble of subjects across multiple response and covariate features,
and is naturally represented by one response-matrix against one
covariate-matrix. Likely each of these two matrices simultaneously embraces
heterogeneous data types: continuous, discrete and categorical. Here a matrix
is used as a practical platform to ideally keep hidden dependency among/between
subjects and features intact on its lattice. Response and covariate dependency
is individually computed and expressed through multiscale blocks via a newly
developed computing paradigm named Data Mechanics. We propose a categorical
pattern matching approach to establish causal linkages in the form of information
flows from patterned response dependency to structured covariate dependency.
The strength of an information flow is evaluated by applying the combinatorial
information theory. This unified platform for system knowledge discovery is
illustrated through five data sets. In each illustrative case, an information
flow is demonstrated as an organization of discovered knowledge loci via
emergent visible and readable heterogeneity. This unified approach
fundamentally resolves many long standing issues, including statistical
modeling, multiple response, renormalization and feature selections, in data
analysis, but without involving man-made structures and distribution
assumptions. The results reported here enhance the idea that linking patterns
of response dependency to structures of covariate dependency is the true
philosophical foundation underlying data-driven computing and learning in
sciences.
Comment: 32 pages, 10 figures, 3 box pictures
How Gibbs distributions may naturally arise from synaptic adaptation mechanisms. A model-based argumentation
This paper addresses two questions in the context of neuronal network
dynamics, using methods from dynamical systems theory and statistical physics:
(i) how to characterize the statistical properties of sequences of action
potentials ("spike trains") produced by neuronal networks, and (ii) what are
the effects of synaptic plasticity on these statistics? We introduce a
framework in which spike trains are associated with a coding of membrane
potential trajectories and, in fact, constitute a symbolic coding in important
explicit examples (the so-called gIF models). On this basis, we use the
thermodynamic formalism from ergodic theory to show how Gibbs distributions are
natural probability measures to describe the statistics of spike trains, given
the empirical averages of prescribed quantities. As a second result, we show
that Gibbs distributions naturally arise when considering "slow" synaptic
plasticity rules where the characteristic time for synapse adaptation is much
longer than the characteristic time for neuron dynamics.
Comment: 39 pages, 3 figures
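As a rough illustration of what "a Gibbs distribution given the empirical averages of prescribed quantities" can mean, the sketch below fits a distribution of the form p(w) proportional to exp(sum_i lambda_i * w_i) over binary spike patterns so that each neuron's model firing rate matches a target rate. This is a static, memoryless simplification chosen for brevity, not the thermodynamic-formalism construction of the paper; the function names `gibbs` and `fit`, the target rates, and the gradient-ascent settings are all assumptions made for the example.

```python
import math
from itertools import product


def gibbs(lambdas):
    """Return p(w) proportional to exp(sum_i lambda_i * w_i) over binary patterns w."""
    patterns = list(product((0, 1), repeat=len(lambdas)))
    weights = [math.exp(sum(l * w for l, w in zip(lambdas, p))) for p in patterns]
    z = sum(weights)  # partition function (normalizing constant)
    return {p: wt / z for p, wt in zip(patterns, weights)}


def fit(target_rates, lr=1.0, iters=2000):
    """Gradient ascent on the lambdas until model averages match the targets.

    The learning rate and iteration count are illustrative assumptions.
    """
    lambdas = [0.0] * len(target_rates)
    for _ in range(iters):
        dist = gibbs(lambdas)
        # Model average of each observable w_i under the current distribution.
        model = [sum(prob * p[i] for p, prob in dist.items())
                 for i in range(len(lambdas))]
        lambdas = [l + lr * (t - m) for l, t, m in zip(lambdas, target_rates, model)]
    return lambdas


target_rates = [0.2, 0.1, 0.4]  # hypothetical empirical firing rates
lambdas = fit(target_rates)
dist = gibbs(lambdas)
model_rates = [sum(prob * p[i] for p, prob in dist.items()) for i in range(3)]
print("fitted lambdas:", [round(l, 4) for l in lambdas])
print("model rates:   ", [round(r, 4) for r in model_rates])
```

With only first-order constraints the fitted distribution factorizes over neurons, so each lambda should approach log(r / (1 - r)) for its target rate r; richer observables (for example pairwise or time-lagged products) would couple the neurons.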
Random Recurrent Neural Networks Dynamics
This paper is a review dealing with the study of large size random recurrent
neural networks. The connection weights are selected according to a probability
law and it is possible to predict the network dynamics at a macroscopic scale
using an averaging principle. After an introductory section, section 1
reviews the various models from the points of view of the single neuron
dynamics and of the global network dynamics. A summary of notations is
presented, which is quite helpful for the sequel. In section 2, mean-field
dynamics is developed.
The probability distribution characterizing the global dynamics is computed. In
section 3, some applications of mean-field theory to the prediction of chaotic
regime for Analog Formal Random Recurrent Neural Networks (AFRRNN) are
displayed. The case of AFRRNN with a homogeneous population of neurons is
studied in section 4. Then, a two-population model is studied in section 5. The
occurrence of a cyclo-stationary chaos is displayed using the results of
\cite{Dauce01}. In section 6, some insight into the application of mean-field
theory to IF networks is given using the results of \cite{BrunelHakim99}.
Comment: Review paper, 36 pages, 5 figures
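To make the idea of random connection weights and a macroscopic description concrete, here is a minimal sketch that iterates a discrete-time network x(t+1) = tanh(J x(t)) with independent Gaussian weights and reports one macroscopic quantity, the mean-square activity. The tanh transfer function, the weight variance gain**2 / n, the network size, and the name `simulate` are assumptions chosen for illustration; this is not the specific AFRRNN model or the mean-field equations of the review.

```python
import math
import random


def simulate(gain, n=100, steps=200, seed=0):
    """Iterate x(t+1) = tanh(J x(t)) and return the final mean-square activity."""
    rng = random.Random(seed)
    # Independent Gaussian weights with variance gain**2 / n (assumed scaling).
    sigma = gain / math.sqrt(n)
    weights = [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(n)]
    state = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        state = [math.tanh(sum(w * x for w, x in zip(row, state)))
                 for row in weights]
    return sum(x * x for x in state) / n


quiet = simulate(gain=0.3)
active = simulate(gain=3.0)
print(f"gain 0.3: mean-square activity = {quiet:.3e}")
print(f"gain 3.0: mean-square activity = {active:.3e}")
```

At low gain the map is a contraction and activity decays to zero, while at high gain the zero state is unstable and activity persists. The sketch only separates these two regimes; deciding whether the sustained activity is chaotic would need further diagnostics such as a Lyapunov exponent estimate.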
Unified Representation of Molecules and Crystals for Machine Learning
Accurate simulations of atomistic systems from first principles are limited
by computational cost. In high-throughput settings, machine learning can
potentially reduce these costs significantly by accurately interpolating
between reference calculations. For this, kernel learning approaches crucially
require a single Hilbert space accommodating arbitrary atomistic systems. We
introduce a many-body tensor representation that is invariant to translations,
rotations and nuclear permutations of same elements, unique, differentiable,
can represent molecules and crystals, and is fast to compute. Empirical
evidence is presented for energy prediction errors below 1 kcal/mol for 7k
organic molecules and 5 meV/atom for 11k elpasolite crystals. Applicability is
demonstrated for phase diagrams of Pt-group/transition-metal binary systems.
Comment: Revised version, minor changes throughout