Reconstruction of Network Evolutionary History from Extant Network Topology and Duplication History
Genome-wide protein-protein interaction (PPI) data are readily available
thanks to recent breakthroughs in biotechnology. However, PPI networks of
extant organisms are only snapshots of the network evolution. How to infer the
whole evolution history becomes a challenging problem in computational biology.
In this paper, we present a likelihood-based approach to inferring network
evolution history from the topology of PPI networks and the duplication
relationship among the paralogs. Simulations show that our approach outperforms
the existing ones in terms of the accuracy of reconstruction. Moreover, the
growth parameters of several real PPI networks estimated by our method are more
consistent with the ones predicted in literature.
Comment: 15 pages, 5 figures, submitted to ISBRA 201
Network Archaeology: Uncovering Ancient Networks from Present-day Interactions
Often questions arise about old or extinct networks. What proteins interacted
in a long-extinct ancestor species of yeast? Who were the central players in
the Last.fm social network 3 years ago? Our ability to answer such questions
has been limited by the unavailability of past versions of networks. To
overcome these limitations, we propose several algorithms for reconstructing a
network's history of growth given only the network as it exists today and a
generative model by which the network is believed to have evolved. Our
likelihood-based method finds a probable previous state of the network by
reversing the forward growth model. This approach retains node identities so
that the history of individual nodes can be tracked. We apply these algorithms
to uncover older, non-extant biological and social networks believed to have
grown via several models, including duplication-mutation with complementarity,
forest fire, and preferential attachment. Through experiments on both synthetic
and real-world data, we find that our algorithms can estimate node arrival
times, identify anchor nodes from which new nodes copy links, and can reveal
significant features of networks that have long since disappeared.
Comment: 16 pages, 10 figures
The inference of gene trees with species trees
Molecular phylogeny has focused mainly on improving models for the
reconstruction of gene trees based on sequence alignments. Yet, most
phylogeneticists seek to reveal the history of species. Although the histories
of genes and species are tightly linked, they are seldom identical, because
genes duplicate, are lost or horizontally transferred, and because alleles can
co-exist in populations for periods that may span several speciation events.
Building models describing the relationship between gene and species trees can
thus improve the reconstruction of gene trees when a species tree is known, and
vice-versa. Several approaches have been proposed to solve the problem in one
direction or the other, but in general neither gene trees nor species trees are
known. Only a few studies have attempted to jointly infer gene trees and
species trees. In this article we review the various models that have been used
to describe the relationship between gene trees and species trees. These models
account for gene duplication and loss, transfer or incomplete lineage sorting.
Some of them consider several types of events together, but none exists
currently that considers the full repertoire of processes that generate gene
trees along the species tree. Simulations as well as empirical studies on
genomic data show that combining gene tree-species tree models with models of
sequence evolution improves gene tree reconstruction. In turn, these better
gene trees provide a better basis for studying genome evolution or
reconstructing ancestral chromosomes and ancestral gene sequences. We predict
that gene tree-species tree methods that can deal with genomic data sets will
be instrumental to advancing our understanding of genomic evolution.
Comment: Review article in relation to the "Mathematical and Computational
Evolutionary Biology" conference, Montpellier, 201
Computationally Comparing Biological Networks and Reconstructing Their Evolution
Biological networks, such as protein-protein interaction, regulatory, or metabolic networks, provide information about biological function, beyond what can be gleaned from sequence alone. Unfortunately, most computational problems associated with these networks are NP-hard. In this dissertation, we develop algorithms to tackle numerous fundamental problems in the study of biological networks.
First, we present a system for classifying the binding affinity of peptides to a diverse array of immunoglobulin antibodies. Computational approaches to this problem are integral to virtual screening and modern drug discovery. Our system is based on an ensemble of support vector machines and exhibits state-of-the-art performance. It placed 1st in the 2010 DREAM5 competition.
Second, we investigate the problem of biological network alignment. Aligning the biological networks of different species allows for the discovery of shared structures and conserved pathways. We introduce an original procedure for network alignment based on a novel topological node signature. The pairwise global alignments of biological networks produced by our procedure, when evaluated under multiple metrics, are both more accurate and more robust to noise than those of previous work.
Next, we explore the problem of ancestral network reconstruction. Knowing the state of ancestral networks allows us to examine how biological pathways have evolved, and how pathways in extant species have diverged from that of their common ancestor. We describe a novel framework for representing the evolutionary histories of biological networks and present efficient algorithms for reconstructing either a single parsimonious evolutionary history, or an ensemble of near-optimal histories. Under multiple models of network evolution, our approaches are effective at inferring the ancestral network interactions. Additionally, the ensemble approach is robust to noisy input, and can be used to impute missing interactions in experimental data.
Finally, we introduce a framework, GrowCode, for learning network growth models. While previous work focuses on developing growth models manually, or on procedures for learning parameters for existing models, GrowCode learns fundamentally new growth models that match target networks in a flexible and user-defined way. We show that models learned by GrowCode produce networks whose target properties match those of real-world networks more closely than existing models.
Phylogenetic analysis of modularity in protein interaction networks
Background: In systems biology, comparative analyses of molecular interactions across diverse species indicate that conservation and divergence of networks can be used to understand functional evolution from a systems perspective. A key characteristic of these networks is their modularity, which contributes significantly to their robustness, as well as adaptability. Consequently, analysis of modular network structures from a phylogenetic perspective may be useful in understanding the emergence, conservation, and diversification of functional modularity.
Results: In this paper, we propose a phylogenetic framework for analyzing network modules, with applications that extend well beyond network-based phylogeny reconstruction. Our approach is based on identification of modular network components from each network separately, followed by projection of these modules onto the networks of other species to compare different networks. Subsequently, we use the conservation of various modules in each network to assess the similarity between different networks. Compared to traditional methods that rely on topological comparisons, our approach has key advantages in (i) avoiding intractable graph comparison problems in comparative network analysis, (ii) accounting for noise and missing data through flexible treatment of network conservation, and (iii) providing insights on the evolution of biological systems through investigation of the evolutionary trajectories of network modules. We test our method, MOPHY, on synthetic data generated by simulation of network evolution, as well as existing protein-protein interaction data for seven diverse species. Comprehensive experimental results show that MOPHY is promising in reconstructing evolutionary histories of extant networks based on conservation of modularity, it is highly robust to noise, and outperforms existing methods that quantify network similarity in terms of conservation of network topology.
Conclusion: These results establish modularity and network proximity as useful features in comparative network analysis and motivate detailed studies of the evolutionary histories of network modules.
The compositional and evolutionary logic of metabolism
Metabolism displays striking and robust regularities in the forms of
modularity and hierarchy, whose composition may be compactly described. This
renders metabolic architecture comprehensible as a system, and suggests the
order in which layers of that system emerged. Metabolism also serves as the
foundation in other hierarchies, at least up to cellular integration including
bioenergetics and molecular replication, and trophic ecology. The
recapitulation of patterns first seen in metabolism, in these higher levels,
suggests metabolism as a source of causation or constraint on many forms of
organization in the biosphere.
We identify as modules widely reused subsets of chemicals, reactions, or
functions, each with a conserved internal structure. At the small molecule
substrate level, module boundaries are generally associated with the most
complex reaction mechanisms and the most conserved enzymes. Cofactors form a
structurally and functionally distinctive control layer over the small-molecule
substrate. Complex cofactors are often used at module boundaries of the
substrate level, while simpler ones participate in widely used reactions.
Cofactor functions thus act as "keys" that incorporate classes of organic
reactions within biochemistry.
The same modules that organize the compositional diversity of metabolism are
argued to have governed long-term evolution. Early evolution of core
metabolism, especially carbon-fixation, appears to have required few
innovations among a small number of conserved modules, to produce adaptations
to simple biogeochemical changes of environment. We demonstrate these features
of metabolism at several levels of hierarchy, beginning with the small-molecule
substrate and network architecture, continuing with cofactors and key conserved
reactions, and culminating in the aggregation of multiple diverse physical and
biochemical processes in cells.
Comment: 56 pages, 28 figures
Bayesian Inference for Duplication-Mutation with Complementarity Network Models
We observe an undirected graph $G$ without multiple edges and self-loops,
which is to represent a protein-protein interaction (PPI) network. We assume
that $G$ evolved under the duplication-mutation with complementarity (DMC)
model from a seed graph, $G_0$, and we also observe the binary forest $\Gamma$
that represents the duplication history of $G$. A posterior density for the DMC
model parameters is established, and we outline a sampling strategy by which
one can perform Bayesian inference; that sampling strategy employs a particle
marginal Metropolis-Hastings (PMMH) algorithm. We test our methodology on
numerical examples to demonstrate a high accuracy and precision in the
inference of the DMC model's mutation and homodimerization parameters.
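For orientation, the forward DMC growth process named in the abstract above can be sketched as follows. This is a minimal illustration of the model as it is commonly described (duplicate a random anchor node, let each shared neighbour lose exactly one of its two edges with some probability, then optionally link the anchor to its copy). It is not the paper's inference code. The function name `dmc_grow`, the parameter names `p_mutation` and `p_homodimer`, and the integer node labels are assumptions made for this example.

```python
import random


def dmc_grow(seed_adj, n_nodes, p_mutation, p_homodimer, rng=None):
    """Grow an undirected simple graph by DMC steps until it has n_nodes nodes.

    seed_adj maps integer node labels to sets of neighbours (hypothetical
    input format chosen for this sketch).
    """
    rng = rng or random.Random()
    # Copy the seed graph so the caller's data is not modified.
    adj = {u: set(nbrs) for u, nbrs in seed_adj.items()}
    while len(adj) < n_nodes:
        u = rng.choice(sorted(adj))   # anchor node, chosen uniformly at random
        v = max(adj) + 1              # new node label (assumes integer labels)
        # Duplication: v starts with the same neighbours as u.
        adj[v] = set(adj[u])
        for w in adj[v]:
            adj[w].add(v)
        # Mutation with complementarity: for each shared neighbour w, with
        # probability p_mutation drop exactly one of the edges (u, w), (v, w).
        for w in sorted(adj[u]):
            if rng.random() < p_mutation:
                loser = u if rng.random() < 0.5 else v
                adj[loser].discard(w)
                adj[w].discard(loser)
        # Homodimerization: link the anchor and its copy with prob. p_homodimer.
        if rng.random() < p_homodimer:
            adj[u].add(v)
            adj[v].add(u)
    return adj


if __name__ == "__main__":
    seed = {0: {1}, 1: {0}}
    graph = dmc_grow(seed, 50, p_mutation=0.4, p_homodimer=0.1,
                     rng=random.Random(1))
    n_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
    print(len(graph), "nodes,", n_edges, "edges")
```

Inference in the setting the abstract describes runs in the opposite direction: given the final graph and the duplication history, one asks which values of the two probabilities make the observed graph plausible.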
The inference of gene trees with species trees.
This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution.