The inference of gene trees with species trees
Molecular phylogeny has focused mainly on improving models for the
reconstruction of gene trees based on sequence alignments. Yet, most
phylogeneticists seek to reveal the history of species. Although the histories
of genes and species are tightly linked, they are seldom identical, because
genes duplicate, are lost or horizontally transferred, and because alleles can
co-exist in populations for periods that may span several speciation events.
Building models describing the relationship between gene and species trees can
thus improve the reconstruction of gene trees when a species tree is known, and
vice-versa. Several approaches have been proposed to solve the problem in one
direction or the other, but in general neither gene trees nor species trees are
known. Only a few studies have attempted to jointly infer gene trees and
species trees. In this article we review the various models that have been used
to describe the relationship between gene trees and species trees. These models
account for gene duplication and loss, transfer or incomplete lineage sorting.
Some of them consider several types of events together, but none exists
currently that considers the full repertoire of processes that generate gene
trees along the species tree. Simulations as well as empirical studies on
genomic data show that combining gene tree-species tree models with models of
sequence evolution improves gene tree reconstruction. In turn, these better
gene trees provide a better basis for studying genome evolution or
reconstructing ancestral chromosomes and ancestral gene sequences. We predict
that gene tree-species tree methods that can deal with genomic data sets will
be instrumental to advancing our understanding of genomic evolution.
Comment: Review article in relation to the "Mathematical and Computational
Evolutionary Biology" conference, Montpellier, 201
A large scale prediction of bacteriocin gene blocks suggests a wide functional spectrum for bacteriocins
Bacteriocins are peptide-derived molecules produced by bacteria, whose
recently-discovered functions include virulence factors and signalling
molecules as well as their better known roles as antibiotics. To date, close to
five hundred bacteriocins have been identified and classified. Recent
discoveries have shown that bacteriocins are highly diverse and widely
distributed among bacterial species. Given the heterogeneity of bacteriocin
compounds, many tools struggle with identifying novel bacteriocins due to their
vast sequence and structural diversity. Many bacteriocins undergo
post-translational processing or modifications necessary for the biosynthesis
of the final mature form. Enzymatic modification of bacteriocins as well as
their export is achieved by proteins whose genes are often located in a
discrete gene cluster proximal to the bacteriocin precursor gene, referred to
as "context genes" in this study. Although bacteriocins themselves are
structurally diverse, context genes have been shown to be largely conserved
across unrelated species. Using this knowledge, we set out to identify new
candidates for context genes which may clarify how bacteriocins are
synthesized, and identify new candidates for bacteriocins that bear no sequence
similarity to known toxins. To achieve these goals, we have developed a
software tool, Bacteriocin Operon and gene block Associator (BOA), that can
identify homologous bacteriocin-associated gene clusters and predict novel
ones. We discover that several phyla have a strong preference for bacteriocin
genes, suggesting distinct functions for this group of molecules. Availability:
https://github.com/idoerg/BOA
Comment: Accepted for publication in BMC Bioinformatics
The Mathematics of Phylogenomics
The grand challenges in biology today are being shaped by powerful
high-throughput technologies that have revealed the genomes of many organisms,
global expression patterns of genes and detailed information about variation
within populations. We are therefore able to ask, for the first time,
fundamental questions about the evolution of genomes, the structure of genes
and their regulation, and the connections between genotypes and phenotypes of
individuals. The answers to these questions are all predicated on progress in a
variety of computational, statistical, and mathematical fields.
The rapid growth in the characterization of genomes has led to the
advancement of a new discipline called Phylogenomics. This discipline results
from the combination of two major fields in the life sciences: Genomics, i.e.,
the study of the function and structure of genes and genomes; and Molecular
Phylogenetics, i.e., the study of the hierarchical evolutionary relationships
among organisms and their genomes. The objective of this article is to offer
mathematicians a first introduction to this emerging field, and to discuss
specific mathematical problems and developments arising from phylogenomics.
Comment: 41 pages, 4 figures
Strategies for Reliable Exploitation of Evolutionary Concepts in High Throughput Biology
The recent availability of the complete genome sequences of a large number of model organisms, together with the immense amount of data being produced by the new high-throughput technologies, means that we can now begin comparative analyses to understand the mechanisms involved in the evolution of the genome and their consequences in the study of biological systems. Phylogenetic approaches provide a unique conceptual framework for performing comparative analyses of all these data, for propagating information between different systems and for predicting or inferring new knowledge. As a result, phylogeny-based inference systems are now playing an increasingly important role in most areas of high-throughput genomics, including studies of promoters (phylogenetic footprinting), interactomes (based on the presence and degree of conservation of interacting proteins), and comparisons of transcriptomes or proteomes (phylogenetic proximity and co-regulation/co-expression). Here we review the recent developments aimed at making automatic, reliable phylogeny-based inference feasible in large-scale projects. We also discuss how evolutionary concepts and phylogeny-based inference strategies are now being exploited in order to understand the evolution and function of biological systems. Such advances will be fundamental for the success of the emerging disciplines of systems biology and synthetic biology, and will have wide-reaching effects in applied fields such as biotechnology, medicine and pharmacology.