Search CORE

12,983 research outputs found

Are there laws of genome evolution?

Author: Koonin Eugene V.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/08/2011
Field of study

Research in quantitative evolutionary genomics and systems biology led to the discovery of several universal regularities connecting genomic and molecular phenomic variables. These universals include the log-normal distribution of the evolutionary rates of orthologous genes; the power law-like distributions of paralogous family size and node degree in various biological networks; the negative correlation between a gene's sequence evolution rate and expression level; and differential scaling of functional classes of genes with genome size. The universals of genome evolution can be accounted for by simple mathematical models similar to those used in statistical physics, such as the birth-death-innovation model. These models do not explicitly incorporate selection, therefore the observed universal regularities do not appear to be shaped by selection but rather are emergent properties of gene ensembles. Although a complete physical theory of evolutionary biology is inconceivable, the universals of genome evolution might qualify as 'laws of evolutionary genomics' in the same sense 'law' is understood in modern physics.Comment: 17 pages, 2 figure

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Universal Features in the Genome-level Evolution of Protein Domains

Author: Alessandro L. Sellerio
Bruno Bassetti
Marco Cosentino Lagomarsino
Philip D. Heijning
Publication venue
Publication date: 11/07/2008
Field of study

Protein domains are found on genomes with notable statistical distributions, which bear a high degree of similarity. Previous work has shown how these distributions can be accounted for by simple models, where the main ingredients are probabilities of duplication, innovation, and loss of domains. However, no one so far has addressed the issue that these distributions follow definite trends depending on protein-coding genome size only. We present a stochastic duplication/innovation model, falling in the class of so-called Chinese Restaurant Processes, able to explain this feature of the data. Using only two universal parameters, related to a minimal number of domains and to the relative weight of innovation to duplication, the model reproduces two important aspects: (a) the populations of domain classes (the sets, related to homology classes, containing realizations of the same domain in different proteins) follow common power-laws whose cutoff is dictated by genome size, and (b) the number of domain families is universal and markedly sublinear in genome size. An important ingredient of the model is that the innovation probability decreases with genome size. We propose the possibility to interpret this as a global constraint given by the cost of expanding an increasingly complex interactome. Finally, we introduce a variant of the model where the choice of a new domain relates to its occurrence in genomic data, and thus accounts for fold specificity. Both models have general quantitative agreement with data from hundreds of genomes, which indicates the coexistence of the well-known specificity of proteomes with robust self-organizing phenomena related to the basic evolutionary ``moves'' of duplication and innovation

arXiv.org e-Print Archive

Crossref

AIR Universita degli studi di Milano

Springer - Publisher Connector

PubMed Central

Nature Precedings

Evolution signatures in genome network properties

Author: Jos&#xe9
Jose Claudio Moreira
Leonardo Brunnet
Mauro Castro
Ricardo Ferreira
Rita de Almeida
Rodrigo Dalmolin
Publication venue
Publication date: 28/11/2011
Field of study

Genomes maybe organized as networks where protein-protein association plays the role of network links. The resulting networks are far from being random and their topological properties are a consequence of the underlying mechanisms for genome evolution. Considering data on protein-protein association networks from STRING database, we present experimental evidence that degree distribution is not scale free, presenting an increased probability for high degree nodes. We also show that the degree distribution approaches a scale invariant state as the number of genes in the network increases, although real genomes still present finite size effects. Based on the experimental evidence unveiled by these data analyses, we propose a simulation model for genome evolution, where genes in a network are either acquired de novo using a preferential attachment rule, or duplicated, with a duplication probability that linearly grows with gene degree and decreases with its clustering coefficient. The results show that topological distributions are better described than in previous genome evolution models. This model correctly predicts that, in order to produce protein-protein association networks with number of links and number of nodes in the observed range, it is necessary 90% of gene duplication and 10% of de novo gene acquisition. If this scenario is true, it implies a universal mechanism for genome evolution

Nature Precedings

Exact reconciliation of undated trees

Author: Kelk Steven
Scornavacca Celine
van Iersel Leo
Publication venue
Publication date: 01/01/2014
Field of study

Reconciliation methods aim at recovering macro evolutionary events and at localizing them in the species history, by observing discrepancies between gene family trees and species trees. In this article we introduce an Integer Linear Programming (ILP) approach for the NP-hard problem of computing a most parsimonious time-consistent reconciliation of a gene tree with a species tree when dating information on speciations is not available. The ILP formulation, which builds upon the DTL model, returns a most parsimonious reconciliation ranging over all possible datings of the nodes of the species tree. By studying its performance on plausible simulated data we conclude that the ILP approach is significantly faster than a brute force search through the space of all possible species tree datings. Although the ILP formulation is currently limited to small trees, we believe that it is an important proof-of-concept which opens the door to the possibility of developing an exact, parsimony based approach to dating species trees. The software (ILPEACE) is freely available for download

arXiv.org e-Print Archive

CiteSeerX

CWI's Institutional Repository

A model of large-scale proteome evolution

Author: Kepler Thomas B.
Pastor-Satorras Romualdo
Smith Eric
Sole Ricard V.
Publication venue
Publication date: 01/01/2002
Field of study

The next step in the understanding of the genome organization, after the determination of complete sequences, involves proteomics. The proteome includes the whole set of protein-protein interactions, and two recent independent studies have shown that its topology displays a number of surprising features shared by other complex networks, both natural and artificial. In order to understand the origins of this topology and its evolutionary implications, we present a simple model of proteome evolution that is able to reproduce many of the observed statistical regularities reported from the analysis of the yeast proteome. Our results suggest that the observed patterns can be explained by a process of gene duplication and diversification that would evolve proteome networks under a selection pressure, favoring robustness against failure of its individual components

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Graph Theory and Networks in Biology

Author: Mason Oliver
Verwoerd Mark
Publication venue
Publication date: 06/04/2006
Field of study

In this paper, we present a survey of the use of graph theoretical techniques in Biology. In particular, we discuss recent work on identifying and modelling the structure of bio-molecular networks, as well as the application of centrality measures to interaction networks and research on the hierarchical structure of such networks and network motifs. Work on the link between structural network properties and dynamics is also described, with emphasis on synchronization and disease propagation.Comment: 52 pages, 5 figures, Survey Pape

arXiv.org e-Print Archive

CiteSeerX

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

The evolution of genetic architectures underlying quantitative traits

Author: Plotkin Joshua B.
Rajon Etienne
Publication venue: 'The Royal Society'
Publication date: 01/01/2013
Field of study

In the classic view introduced by R. A. Fisher, a quantitative trait is encoded by many loci with small, additive effects. Recent advances in QTL mapping have begun to elucidate the genetic architectures underlying vast numbers of phenotypes across diverse taxa, producing observations that sometimes contrast with Fisher's blueprint. Despite these considerable empirical efforts to map the genetic determinants of traits, it remains poorly understood how the genetic architecture of a trait should evolve, or how it depends on the selection pressures on the trait. Here we develop a simple, population-genetic model for the evolution of genetic architectures. Our model predicts that traits under moderate selection should be encoded by many loci with highly variable effects, whereas traits under either weak or strong selection should be encoded by relatively few loci. We compare these theoretical predictions to qualitative trends in the genetics of human traits, and to systematic data on the genetics of gene expression levels in yeast. Our analysis provides an evolutionary explanation for broad empirical patterns in the genetic basis of traits, and it introduces a single framework that unifies the diversity of observed genetic architectures, ranging from Mendelian to Fisherian.Comment: Minor changes in the text; Added supplementary materia

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

PubMed Central

Hal-Diderot