Are there laws of genome evolution?
Research in quantitative evolutionary genomics and systems biology led to the
discovery of several universal regularities connecting genomic and molecular
phenomic variables. These universals include the log-normal distribution of the
evolutionary rates of orthologous genes; the power law-like distributions of
paralogous family size and node degree in various biological networks; the
negative correlation between a gene's sequence evolution rate and expression
level; and differential scaling of functional classes of genes with genome
size. The universals of genome evolution can be accounted for by simple
mathematical models similar to those used in statistical physics, such as the
birth-death-innovation model. These models do not explicitly incorporate
selection; the observed universal regularities therefore do not appear to be
shaped by selection but rather are emergent properties of gene ensembles.
Although a complete physical theory of evolutionary biology is inconceivable,
the universals of genome evolution might qualify as 'laws of evolutionary
genomics' in the same sense 'law' is understood in modern physics.
Comment: 17 pages, 2 figures
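A birth-death-innovation process of the kind named in the abstract above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the model from the paper: the function name `simulate_bdim`, the rate values, the per-gene (linear) birth and death rates, and the rule that an empty genome is reseeded by innovation are all hypothetical choices made here.

```python
import random
from collections import Counter


def simulate_bdim(steps, birth=0.5, death=0.45, innovation=0.05, seed=0):
    """Toy birth-death-innovation simulation of paralogous gene families.

    Returns a Counter mapping family size -> number of families of that size.
    All rates are illustrative assumptions, not values from the source.
    """
    rng = random.Random(seed)
    families = [1]  # sizes of the gene families; start with one single-gene family
    for _ in range(steps):
        event = rng.choices(
            ["birth", "death", "innovation"],
            weights=[birth, death, innovation],
        )[0]
        # Innovation: a new family of size 1 appears.
        # Assumption: an empty genome is also reseeded this way.
        if event == "innovation" or not families:
            families.append(1)
            continue
        # Pick a gene uniformly at random, i.e. a family with
        # probability proportional to its current size.
        idx = rng.choices(range(len(families)), weights=families)[0]
        if event == "birth":
            families[idx] += 1      # duplication adds a paralog
        else:
            families[idx] -= 1      # loss removes a paralog
            if families[idx] == 0:
                families.pop(idx)   # the family goes extinct
    return Counter(families)


if __name__ == "__main__":
    histogram = simulate_bdim(20000)
    for size in sorted(histogram)[:10]:
        print(size, histogram[size])
```

No selection term appears anywhere in the sketch; any skew in the resulting family-size histogram comes from the stochastic process alone, which is the point the abstract makes about such models.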
Developmental constraints on vertebrate genome evolution
Constraints in embryonic development are thought to bias the direction of
evolution by making some changes less likely, and others more likely, depending
on their consequences on ontogeny. Here, we characterize the constraints acting
on genome evolution in vertebrates. We used gene expression data from two
vertebrates: zebrafish, using a microarray experiment spanning 14 stages of
development, and mouse, using EST counts for 26 stages of development. We show
that, in both species, genes expressed early in development (1) show more
dramatic effects when knocked out or mutated and (2) are more likely to revert to
single copy after whole genome duplication, relative to genes expressed late.
This supports high constraints on early stages of vertebrate development,
making them less open to innovations (gene gain or gene loss). Results are
robust to different sources of data: gene expression from microarrays, ESTs, or
in situ hybridizations, and mutants from directed KO, transgenic insertions,
point mutations, or morpholinos. We determine the pattern of these constraints,
which differs from the model used to describe vertebrate morphological
conservation ("hourglass" model). While morphological constraints reach a
maximum at mid-development (the "phylotypic" stage), genomic constraints appear
to decrease monotonically over developmental time.
Evolution signatures in genome network properties
Genomes may be organized as networks in which protein-protein associations play the role of network links. The resulting networks are far from random, and their topological properties are a consequence of the underlying mechanisms of genome evolution. Considering data on protein-protein association networks from the STRING database, we present experimental evidence that the degree distribution is not scale free, presenting an increased probability for high-degree nodes. We also show that the degree distribution approaches a scale-invariant state as the number of genes in the network increases, although real genomes still present finite-size effects. Based on the experimental evidence unveiled by these data analyses, we propose a simulation model for genome evolution in which genes in a network are either acquired de novo using a preferential attachment rule, or duplicated, with a duplication probability that grows linearly with gene degree and decreases with its clustering coefficient. The results show that topological distributions are better described than in previous genome evolution models. According to this model, producing protein-protein association networks with numbers of links and nodes in the observed range requires 90% gene duplication and 10% de novo gene acquisition. If this scenario is true, it implies a universal mechanism for genome evolution.
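The growth rule described in the abstract above (de novo acquisition by preferential attachment, or duplication favored by high degree and low clustering) can be sketched as follows. This is a rough illustration, not the authors' simulation: the weight `degree * (1 - clustering)`, the choice that a duplicate inherits all of its parent's links, the single link made by a de novo gene, and the names `grow_network` and `clustering` are assumptions introduced here.

```python
import random


def clustering(adj, v):
    """Local clustering coefficient of node v in an undirected graph."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def grow_network(n_genes, p_dup=0.9, seed=0):
    """Grow a gene network by duplication (prob. p_dup) or de novo acquisition."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # seed network: two linked genes
    while len(adj) < n_genes:
        new = len(adj)
        nodes = list(adj)
        if rng.random() < p_dup:
            # Duplication: weight grows linearly with degree and decreases
            # with clustering (assumed functional form; the small constant
            # keeps every weight strictly positive).
            weights = [
                len(adj[v]) * (1.0 - clustering(adj, v)) + 1e-9 for v in nodes
            ]
            parent = rng.choices(nodes, weights=weights)[0]
            adj[new] = set(adj[parent])  # assumption: copy keeps all parental links
            for u in adj[new]:
                adj[u].add(new)
        else:
            # De novo gene: one link, attached preferentially by degree.
            weights = [len(adj[v]) for v in nodes]
            target = rng.choices(nodes, weights=weights)[0]
            adj[new] = {target}
            adj[target].add(new)
    return adj


if __name__ == "__main__":
    net = grow_network(200, p_dup=0.9)
    degrees = sorted((len(nbrs) for nbrs in net.values()), reverse=True)
    print("nodes:", len(net), "links:", sum(degrees) // 2, "max degree:", degrees[0])
```

Varying `p_dup` and comparing the resulting numbers of nodes and links against an observed network is one way to explore the duplication versus de novo balance the abstract reports.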
A Solvable Sequence Evolution Model and Genomic Correlations
We study a minimal model for genome evolution whose elementary processes are
single site mutation, duplication and deletion of sequence regions and
insertion of random segments. These processes are found to generate long-range
correlations in the composition of letters as long as the sequence length is
growing, i.e., the combined rates of duplications and insertions are higher
than the deletion rate. For constant sequence length, on the other hand, all
initial correlations decay exponentially. These results are obtained
analytically and by simulations. They are compared with the long-range
correlations observed in genomic DNA, and the implications for genome evolution
are discussed.
Comment: 4 pages, 4 figures
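The four elementary processes listed in the abstract above (single-site mutation, segment duplication, segment deletion, random insertion) lend themselves to a small simulation. The sketch below is illustrative only: the function name `evolve`, the `rates` dictionary, the uniform segment-length distribution, and the placement of a duplicated segment next to its source are assumptions, not details taken from the paper.

```python
import random

ALPHABET = "ACGT"
OPERATIONS = ["mutate", "duplicate", "delete", "insert"]


def evolve(seq, steps, rates, max_len=10, seed=0):
    """Apply `steps` random events to `seq`.

    `rates` maps each operation name to a relative rate (hypothetical values).
    """
    rng = random.Random(seed)
    seq = list(seq)
    weights = [rates[op] for op in OPERATIONS]
    for _ in range(steps):
        op = rng.choices(OPERATIONS, weights=weights)[0]
        pos = rng.randrange(len(seq)) if seq else 0
        length = rng.randint(1, max_len)  # assumed uniform segment length
        if op == "mutate" and seq:
            # Redraw one site (may redraw the same letter).
            seq[pos] = rng.choice(ALPHABET)
        elif op == "duplicate" and seq:
            # Copy a segment and place the copy next to its source.
            seq[pos:pos] = seq[pos:pos + length]
        elif op == "delete" and len(seq) > length:
            # Remove a segment, never emptying the sequence.
            del seq[pos:pos + length]
        elif op == "insert":
            # Insert a segment of random letters.
            seq[pos:pos] = [rng.choice(ALPHABET) for _ in range(length)]
    return "".join(seq)


if __name__ == "__main__":
    growing = {"mutate": 1.0, "duplicate": 1.0, "delete": 0.5, "insert": 0.5}
    result = evolve("ACGT" * 25, 2000, growing)
    print(len(result), result[:60])
```

With duplication and insertion rates exceeding the deletion rate, the sequence grows on average, which is the regime in which the abstract reports long-range correlations; measuring those correlations is left out of this sketch.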
A Unifying Model of Genome Evolution Under Parsimony
We present a data structure called a history graph that offers a practical
basis for the analysis of genome evolution. It conceptually simplifies the
study of parsimonious evolutionary histories by representing both substitutions
and double cut and join (DCJ) rearrangements in the presence of duplications.
The problem of constructing parsimonious history graphs thus subsumes related
maximum parsimony problems in the fields of phylogenetic reconstruction and
genome rearrangement. We show that tractable functions can be used to define
upper and lower bounds on the minimum number of substitutions and DCJ
rearrangements needed to explain any history graph. These bounds become tight
for a special type of unambiguous history graph called an ancestral variation
graph (AVG), which constrains in its combinatorial structure the number of
operations required. We finally demonstrate that for a given history graph G,
a finite set of AVGs describes all parsimonious interpretations of G, and this
set can be explored with a few sampling moves.
Comment: 52 pages, 24 figures
Law of Genome Evolution Direction: Coding Information Quantity Grows
The problem of the directionality of genome evolution is studied. Based on
the analysis of the C-value paradox and the evolution of genome size, we propose
that the function-coding information quantity of a genome always grows in the
course of evolution through sequence duplication, expansion of code, and gene
transfer from outside. The function-coding information quantity of a genome
consists of two parts, p-coding information quantity which encodes functional
protein and n-coding information quantity which encodes other functional
elements other than amino acid sequence. Evidence for this evolutionary law
of function-coding information quantity is listed. The need for
function is the driving force for the expansion of coding information quantity,
and this expansion is how a species achieves functional innovation
and extension. Thus, the increase in the coding information quantity of
a genome is a measure of newly acquired function, and it determines the
directionality of genome evolution.
Comment: 16 pages
The genome evolution and domestication of tropical fruit mango
Background: Mango is one of the world’s most important tropical fruits. It belongs to the family Anacardiaceae, which includes several other economically important species, notably cashew, sumac and pistachio from other genera. Many species in this family produce family-specific urushiols and related phenols, which can induce contact dermatitis.
Results: We generate a chromosome-scale genome assembly of mango, providing a reference genome for the Anacardiaceae family. Our results indicate the occurrence of a recent whole-genome duplication (WGD) event in mango. Duplicated genes preferentially retained include photosynthetic, photorespiration, and lipid metabolic genes that may have provided adaptive advantages to sharp historical decreases in atmospheric carbon dioxide and global temperatures. A notable example of an extended gene family is the chalcone synthase (CHS) family of genes, and particular genes in this family show universally higher expression in peels than in flesh, likely for the biosynthesis of urushiols and related phenols. Genome resequencing reveals two distinct groups of mango varieties, with commercial varieties clustered with India germplasms and demonstrating allelic admixture, and indigenous varieties from Southeast Asia in the second group. Landraces indigenous in China formed distinct clades, and some showed admixture in genomes.
Conclusions: Analysis of chromosome-scale mango genome sequences reveals that photosynthesis and lipid metabolism genes are preferentially retained after a recent WGD event, and that expansion of CHS genes is likely associated with urushiol biosynthesis in mango. Genome resequencing clarifies two groups of mango varieties, discovers allelic admixture in commercial varieties, and shows the distinct genetic background of landraces.