Selectome: a database of positive selection
Genome-wide scans have shown that positive selection is relatively frequent at the molecular level. It is of special interest to identify which protein sites and which phylogenetic branches are affected. We present Selectome, a database which provides the results of a rigorous branch-site specific likelihood test for positive selection. The Web interface presents test results mapped both onto phylogenetic trees and onto protein alignments. It allows rapid access to results by keyword, gene name, or taxonomy-based queries. Selectome is freely available at http://bioinfo.unil.ch/selectom
Selectome update: quality control and computational improvements to a database of positive selection
Selectome (http://selectome.unil.ch/) is a database of positive selection, based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality control procedure on multiple sequence alignments, aiming to minimize false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees have some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing
CATH: comprehensive structural and functional annotations for genome sequences.
The latest version of the CATH-Gene3D protein structure classification database (4.0, http://www.cathdb.info) provides annotations for over 235,000 protein domain structures and includes 25 million domain predictions. This article provides an update on the major developments in the 2 years since the last publication in this journal including: significant improvements to the predictive power of our functional families (FunFams); the release of our 'current' putative domain assignments (CATH-B); a new, strictly non-redundant data set of CATH domains suitable for homology benchmarking experiments (CATH-40) and a number of improvements to the web pages
New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures.
CATH version 3.5 (Class, Architecture, Topology, Homology, available at http://www.cathdb.info/) contains 173 536 domains, 2626 homologous superfamilies and 1313 fold groups. When focusing on structural genomics (SG) structures, we observe that the number of new folds for CATH v3.5 is slightly less than for previous releases, and this observation suggests that we may now know the majority of folds that are easily accessible to structure determination. We have improved the accuracy of our functional family (FunFams) sub-classification method and the CATH sequence domain search facility has been extended to provide FunFam annotations for each domain. The CATH website has been redesigned. We have improved the display of functional data and of conserved sequence features associated with FunFams within each CATH superfamily
Vitellogenin Underwent Subfunctionalization to Acquire Caste and Behavioral Specific Expression in the Harvester Ant Pogonomyrmex barbatus
PMCID: PMC3744404. This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication
Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs
The function of most proteins is not determined experimentally, but is extrapolated from homologs. According to the “ortholog conjecture”, or standard model of phylogenomics, protein function changes rapidly after duplication, leading to paralogs with different functions, while orthologs retain the ancestral function. We report here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports this model. We show that to analyze GO annotation effectively, several confounding factors need to be controlled: authorship bias, variation of GO term frequency among species, variation of background similarity among species pairs, and propagated annotation bias. After controlling for these biases, we observe that orthologs have generally more similar functional annotations than paralogs. This is especially strong for sub-cellular localization. We observe only a weak decrease in functional similarity with increasing sequence divergence. These findings hold over a large diversity of species; notably orthologs from model organisms such as E. coli, yeast or mouse have conserved function with human proteins
Selective deployment of transcription factor paralogs with submaximal strength facilitates gene regulation in the immune system
In multicellular organisms, duplicated genes can diverge through tissue-specific gene expression patterns, as exemplified by highly regulated expression of Runx transcription factor paralogs with apparent functional redundancy. Here we asked what cell type-specific biologies might be supported by the selective expression of Runx paralogs during Langerhans cell and inducible regulatory T cell differentiation. We uncovered functional non-equivalence between Runx paralogs. Selective expression of native paralogs allowed integration of transcription factor activity with extrinsic signals, while non-native paralogs enforced differentiation even in the absence of exogenous inducers. DNA-binding affinity was controlled by divergent amino acids within the otherwise highly conserved RUNT domain, and evolutionary reconstruction suggested convergence of RUNT domain residues towards sub-maximal strength. Hence, the selective expression of gene duplicates in specialized cell types can synergize with the acquisition of functional differences to enable appropriate gene expression, lineage choice and differentiation in the mammalian immune system
The era of reference genomes in conservation genomics
Progress in genome sequencing now enables the large-scale generation of reference genomes. Various international initiatives aim to generate reference genomes representing global biodiversity. These genomes provide unique insights into genomic diversity and architecture, thereby enabling comprehensive analyses of population and functional genomics, and are expected to revolutionize conservation genomics