A circuit topology approach to categorizing changes in biomolecular structure
The biological world is composed of folded linear molecules of bewildering topological complexity and diversity. The topology of folded biomolecules such as proteins and ribonucleic acids is often subject to change during biological processes. Despite intense research, we lack a solid mathematical framework that summarizes these operations in a principled manner. Circuit topology, which formalizes the arrangements of intramolecular contacts, serves as a general mathematical framework to analyze the topological characteristics of folded linear molecules. In this work, we translate familiar molecular operations in biology, such as duplication, permutation, and elimination of contacts, into the language of circuit topology. We show that for such operations there are corresponding matrix representations as well as basic rules that serve as a foundation for understanding these operations within the context of a coherent algebraic framework. We present several biological examples and provide a simple computational framework for creating and analyzing the circuit diagrams of proteins and nucleic acids. We expect our study and future developments in this direction to facilitate a deeper understanding of natural molecular processes and to provide guidance to engineers for generating complex polymeric materials.
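The abstract does not spell out its matrix representations, so the following is only a minimal sketch of the general idea. It assumes the pairwise contact relations commonly used in circuit topology (series, parallel, cross), with each contact written as a pair of chain positions. The names `relation` and `relation_matrix` are hypothetical and are not taken from the paper's software.

```python
def relation(a, b):
    """Classify two contacts, each a pair of positions along the chain.

    Assumed conventions (not taken from the abstract):
      series   - the two contacts do not overlap along the chain
      parallel - one contact is nested inside the other
      cross    - the contacts interleave
    """
    (i, j), (k, l) = sorted([tuple(sorted(a)), tuple(sorted(b))])
    if len({i, j, k, l}) < 4:
        return "shared site"  # contacts sharing a position are not resolved here
    if j < k:
        return "series"
    if l < j:
        return "parallel"
    return "cross"


def relation_matrix(contacts):
    """Build a symmetric matrix of pairwise relations for a list of contacts."""
    n = len(contacts)
    return [
        ["-" if r == c else relation(contacts[r], contacts[c]) for c in range(n)]
        for r in range(n)
    ]


# Hypothetical toy chain with four contacts.
contacts = [(1, 4), (2, 3), (5, 8), (6, 9)]
matrix = relation_matrix(contacts)
for row in matrix:
    print(row)
```

In this representation, an operation such as eliminating a contact would correspond to deleting one row and the matching column of the matrix.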
The identifiability of tree topology for phylogenetic models, including covarion and mixture models
For a model of molecular evolution to be useful for phylogenetic inference,
the topology of evolutionary trees must be identifiable. That is, from a joint
distribution the model predicts, it must be possible to recover the tree
parameter. We establish tree identifiability for a number of phylogenetic
models, including a covarion model and a variety of mixture models with a
limited number of classes. The proof is based on the introduction of a more
general model, allowing more states at internal nodes of the tree than at
leaves, and the study of the algebraic variety formed by the joint
distributions to which it gives rise. Tree identifiability is first established
for this general model through the use of certain phylogenetic invariants. Comment: 20 pages, 1 figure
TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions
Although deep learning approaches have had tremendous success in image, video
and audio processing, computer vision, and speech recognition, their
applications to three-dimensional (3D) biomolecular structural data sets have
been hindered by the entangled geometric complexity and biological complexity.
We introduce topology, i.e., element specific persistent homology (ESPH), to
untangle geometric complexity and biological complexity. ESPH represents 3D
complex geometry by one-dimensional (1D) topological invariants and retains
crucial biological information via a multichannel image representation. It is
able to reveal hidden structure-function relationships in biomolecules. We
further integrate ESPH and convolutional neural networks to construct a
multichannel topological neural network (TopologyNet) for the predictions of
protein-ligand binding affinities and protein stability changes upon mutation.
To overcome the limitations to deep learning arising from small and noisy
training sets, we present a multitask topological convolutional neural network
(MT-TCNN). We demonstrate that the present TopologyNet architectures outperform
other state-of-the-art methods in the predictions of protein-ligand binding
affinities, globular protein mutation impacts, and membrane protein mutation
impacts. Comment: 20 pages, 8 figures, 5 tables
Proto-Plasm: parallel language for adaptive and scalable modelling of biosystems
This paper discusses the design goals and the first developments of
Proto-Plasm, a novel computational environment to produce libraries
of executable, combinable and customizable computer models of natural and
synthetic biosystems, aiming to provide a supporting framework for predictive
understanding of structure and behaviour through multiscale geometric modelling
and multiphysics simulations. Admittedly, the Proto-Plasm platform is
still in its infancy. Its computational framework—language, model library,
integrated development environment and parallel engine—intends to provide
patient-specific computational modelling and simulation of organs and biosystems,
exploiting novel functionalities resulting from the symbolic combination of
parametrized models of parts at various scales. Proto-Plasm may define
the model equations, but it is currently focused on the symbolic description of
model geometry and on the parallel support of simulations. Conversely, CellML
and SBML could be viewed as defining the behavioural functions (the model
equations) to be used within a Proto-Plasm program. Here we exemplify
the basic functionalities of Proto-Plasm, by constructing a schematic
heart model. We also discuss multiscale issues with reference to the geometric
and physical modelling of neuromuscular junctions.
Quantifying the connectivity of a network: The network correlation function method
Networks are useful for describing systems of interacting objects, where the
nodes represent the objects and the edges represent the interactions between
them. The applications include chemical and metabolic systems, food webs as
well as social networks. Lately, it was found that many of these networks
display some common topological features, such as high clustering, small
average path length (small world networks) and a power-law degree distribution
(scale free networks). The topological features of a network are commonly
related to the network's functionality. However, the topology alone does not
account for the nature of the interactions in the network and their strength.
Here we introduce a method for evaluating the correlations between pairs of
nodes in the network. These correlations depend both on the topology and on the
functionality of the network. A network with high connectivity displays strong
correlations between its interacting nodes and thus features small-world
functionality. We quantify the correlations between all pairs of nodes in the
network, and express them as matrix elements in the correlation matrix. From
this information one can plot the correlation function for the network and
extract the correlation length. The connectivity of a network is then defined
as the ratio between this correlation length and the average path length of the
network. Using this method we distinguish between a topological small world and
a functional small world, where the latter is characterized by long range
correlations and high connectivity. Clearly, networks that share the same
topology may have different connectivities, based on the nature and strength
of their interactions. The method is demonstrated on metabolic networks, but
can be readily generalized to other types of networks. Comment: 10 figures
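The definition of connectivity above (correlation length divided by average path length) can be illustrated with a small sketch. The abstract does not say how the correlation length is extracted, so this example assumes an exponential decay of the mean correlation with graph distance and a log-linear least-squares fit. The function names and the toy network are hypothetical.

```python
import math
from collections import deque


def bfs_distances(adj, src):
    """Shortest-path (hop) distances from src in an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connectivity(adj, corr):
    """Return (correlation length, average path length, their ratio).

    Assumes a connected graph with at least two distinct pair distances
    and positive mean correlations, so that the logarithm is defined.
    """
    nodes = sorted(adj)
    by_dist = {}
    total, pairs = 0, 0
    for idx, a in enumerate(nodes):
        dist = bfs_distances(adj, a)
        for b in nodes[idx + 1:]:
            d = dist[b]
            by_dist.setdefault(d, []).append(corr[a][b])
            total += d
            pairs += 1
    avg_path = total / pairs

    # Correlation function: mean correlation over all pairs at distance d.
    ds = sorted(by_dist)
    logs = [math.log(sum(by_dist[d]) / len(by_dist[d])) for d in ds]

    # Assumed model C(d) ~ exp(-d / xi): fit a line to log C(d) versus d.
    mean_d = sum(ds) / len(ds)
    mean_l = sum(logs) / len(logs)
    slope = sum((d - mean_d) * (l - mean_l) for d, l in zip(ds, logs))
    slope /= sum((d - mean_d) ** 2 for d in ds)
    xi = -1.0 / slope
    return xi, avg_path, xi / avg_path


# Toy example: a 5-node path graph with correlations decaying as exp(-d / 2).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
corr = {a: {b: math.exp(-abs(a - b) / 2) for b in adj} for a in adj}
xi, avg_path, ratio = connectivity(adj, corr)
print(xi, avg_path, ratio)
```

Two networks with the same `adj` but different `corr` would share the same average path length yet give different ratios, which is the distinction the abstract draws between a topological and a functional small world.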
A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic Simulations
Ensembles of simulations are employed to estimate the statistics of possible future states of a system, and are widely used in important applications such as climate change and biological modeling. Ensembles of runs can naturally be executed in parallel. However, when the CPU times of individual simulations vary considerably, a simple strategy of assigning an equal number of tasks per processor can lead to serious work imbalances and low parallel efficiency. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks. Four load balancing strategies are discussed: most-dividing, all-redistribution, random-polling, and neighbor-redistribution. Simulation results with a stochastic budding yeast cell cycle model are consistent with the theoretical analysis. The provable global decrease in load imbalance for the local rebalancing algorithms is especially significant given the scalability concerns for the global rebalancing algorithms. The overall simulation time is reduced by up to 25%, and the total processor idle time by 85%.
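The imbalance caused by assigning an equal number of tasks per processor can be illustrated with a small sketch. The abstract names four strategies but does not describe them, so the dynamic scheme below is a generic greedy "give the next task to the least-loaded processor" rule, used purely as a stand-in. It is not one of the paper's algorithms, and the task times are made up.

```python
import heapq


def static_loads(times, n_procs):
    """Equal task count per processor, in contiguous chunks.

    Assumes len(times) is divisible by n_procs.
    """
    per = len(times) // n_procs
    return [sum(times[p * per:(p + 1) * per]) for p in range(n_procs)]


def greedy_loads(times, n_procs):
    """Generic dynamic stand-in: each task goes to the least-loaded processor."""
    heap = [(0, p) for p in range(n_procs)]
    for t in times:
        load, p = heapq.heappop(heap)
        heapq.heappush(heap, (load + t, p))
    return [load for load, _ in sorted(heap, key=lambda item: item[1])]


def idle_time(loads):
    """Total time processors wait for the slowest one to finish."""
    finish = max(loads)
    return sum(finish - load for load in loads)


# Hypothetical task times: one long simulation and seven short ones.
times = [10, 1, 1, 1, 1, 1, 1, 1]
static = static_loads(times, 2)
dynamic = greedy_loads(times, 2)
print(static, idle_time(static))
print(dynamic, idle_time(dynamic))
```

With these made-up times, the static split leaves one processor waiting on the long task, while the greedy rule spreads the short tasks onto the other processor and reduces both the finish time and the idle time.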