
    PhylomeDB: a database for genome-wide collections of gene phylogenies

    The complete collection of evolutionary histories of all genes in a genome, also known as phylome, constitutes a valuable source of information. The reconstruction of phylomes has been previously prevented by large demands of time and computer power, but is now feasible thanks to recent developments in computers and algorithms. To provide a publicly available repository of complete phylomes that allows researchers to access and store large-scale phylogenomic analyses, we have developed PhylomeDB. PhylomeDB is a database of complete phylomes derived for different genomes within a specific taxonomic range. All phylomes in the database are built using a high-quality phylogenetic pipeline that includes evolutionary model testing and alignment trimming phases. For each genome, PhylomeDB provides the alignments, phylogenetic trees and tree-based orthology predictions for every single encoded protein. The current version of PhylomeDB includes the phylomes of human, the yeast Saccharomyces cerevisiae and the bacterium Escherichia coli, comprising a total of 32 289 seed sequences with their corresponding alignments and 172 324 phylogenetic trees. PhylomeDB can be publicly accessed at http://phylomedb.bioinfo.cipf.e

    InParanoid 7: new algorithms and tools for eukaryotic orthology analysis

    The InParanoid project gathers proteomes of completely sequenced eukaryotic species plus Escherichia coli and calculates pairwise ortholog relationships among them. The new release 7.0 of the database has grown by an order of magnitude over the previous version and now includes 100 species and their collective 1.3 million proteins organized into 42.7 million pairwise ortholog groups. The InParanoid algorithm itself has been revised and is now both more specific and sensitive. Based on results from our recent benchmarking of low-complexity filters in homology assignment, a two-pass BLAST approach was developed that makes use of high-precision compositional score matrix adjustment, but avoids the alignment truncation that sometimes follows. We have also updated the InParanoid web site (http://InParanoid.sbc.su.se). Several features have been added, the response times have been improved and the site now sports a new, clearer look. As the number of ortholog databases has grown, it has become difficult to compare among these resources due to a lack of standardized source data and incompatible representations of ortholog relationships. To facilitate data exchange and comparisons among ortholog databases, we have developed and are making available two XML schemas: SeqXML for the input sequences and OrthoXML for the output ortholog clusters

    eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges

    Orthologous relationships form the basis of most comparative genomic and metagenomic studies and are essential for proper phylogenetic and functional analyses. The third version of the eggNOG database (http://eggnog.embl.de) contains non-supervised orthologous groups constructed from 1133 organisms, doubling the number of genes with orthology assignment compared to eggNOG v2. The new release is the result of a number of improvements and expansions: (i) the underlying homology searches are now based on the SIMAP database; (ii) the orthologous groups have been extended to 41 levels of selected taxonomic ranges enabling much more fine-grained orthology assignments; and (iii) the newly designed web page is considerably faster with more functionality. In total, eggNOG v3 contains 721 801 orthologous groups, encompassing a total of 4 396 591 genes. Additionally, we updated 4873 and 4850 original COGs and KOGs, respectively, to include all 1133 organisms. At the universal level, covering all three domains of life, 101 208 orthologous groups are available, while the others are applicable at 40 more limited taxonomic ranges. Each group is amended by multiple sequence alignments and maximum-likelihood trees and broad functional descriptions are provided for 450 904 orthologous groups (62.5%)

    Cohesive versus Flexible Evolution of Functional Modules in Eukaryotes

    Although functionally related proteins can be reliably predicted from phylogenetic profiles, many functional modules do not seem to evolve cohesively according to case studies and systematic analyses in prokaryotes. In this study we quantify the extent of evolutionary cohesiveness of functional modules in eukaryotes and probe the biological and methodological factors influencing our estimates. We have collected various datasets of protein complexes and pathways in Saccharomyces cerevisiae. We define orthologous groups on 34 eukaryotic genomes and measure the extent of cohesive evolution of sets of orthologous groups whose members constitute a known complex or pathway. Within this framework it appears that most functional modules evolve flexibly rather than cohesively. Even after correcting for uncertain module definitions and potentially problematic orthologous groups, only 46% of pathways and complexes evolve more cohesively than random modules. This flexibility seems partly coupled to the nature of the functional module because biochemical pathways are generally more cohesively evolving than complexes

    Representative Proteomes: A Stable, Scalable and Unbiased Proteome Set for Sequence Analysis and Functional Annotation

    The accelerating growth in the number of protein sequences taxes both the computational and manual resources needed to analyze them. One approach to dealing with this problem is to minimize the number of proteins subjected to such analysis in a way that minimizes loss of information. To this end we have developed a set of Representative Proteomes (RPs), each selected from a Representative Proteome Group (RPG) containing similar proteomes calculated based on co-membership in UniRef50 clusters. A Representative Proteome is the proteome that can best represent all the proteomes in its group in terms of the majority of the sequence space and information. RPs at 75%, 55%, 35% and 15% co-membership threshold (CMT) are provided to allow users to decrease or increase the granularity of the sequence space based on their requirements. We find that a CMT of 55% (RP55) most closely follows standard taxonomic classifications. Further analysis of this set reveals that sequence space is reduced by more than 80% relative to UniProtKB, while retaining both sequence diversity (over 95% of InterPro domains) and annotation information (93% of experimentally characterized proteins). All sets can be browsed and are available for sequence similarity searches and download at http://www.proteininformationresource.org/rps, while the set of 637 RPs determined using a 55% CMT are also available for text searches. Potential applications include sequence similarity searches, protein classification and targeted protein annotation and characterization

    Single hadron response measurement and calorimeter jet energy scale uncertainty with the ATLAS detector at the LHC

    The uncertainty on the calorimeter energy response to jets of particles is derived for the ATLAS experiment at the Large Hadron Collider (LHC). First, the calorimeter response to single isolated charged hadrons is measured and compared to the Monte Carlo simulation using proton-proton collisions at centre-of-mass energies of sqrt(s) = 900 GeV and 7 TeV collected during 2009 and 2010. Then, using the decay of K_s and Lambda particles, the calorimeter response to specific types of particles (positively and negatively charged pions, protons, and anti-protons) is measured and compared to the Monte Carlo predictions. Finally, the jet energy scale uncertainty is determined by propagating the response uncertainty for single charged and neutral particles to jets. The response uncertainty is 2-5% for central isolated hadrons and 1-3% for the final calorimeter jet energy scale. Comment: 24 pages plus author list (36 pages total), 23 figures, 1 table, submitted to European Physical Journal

    Measurement of χc1 and χc2 production with √s = 7 TeV pp collisions at ATLAS

    The prompt and non-prompt production cross-sections for the χc1 and χc2 charmonium states are measured in pp collisions at √s = 7 TeV with the ATLAS detector at the LHC using 4.5 fb⁻¹ of integrated luminosity. The χc states are reconstructed through the radiative decay χc → J/ψ γ (with J/ψ → μ+μ−) where photons are reconstructed from γ → e+e− conversions. The production rate of the χc2 state relative to the χc1 state is measured for prompt and non-prompt χc as a function of J/ψ transverse momentum. The prompt χc cross-sections are combined with existing measurements of prompt J/ψ production to derive the fraction of prompt J/ψ produced in feed-down from χc decays. The fractions of χc1 and χc2 produced in b-hadron decays are also measured