264 research outputs found

    Expansion of the human mitochondrial proteome by intra- and inter-compartmental protein duplication

    The human mitochondrial proteome is shown to have expanded due to duplication of protein-encoding genes and re-localization of these duplicated proteins

    From Endosymbiont to Host-Controlled Organelle: The Hijacking of Mitochondrial Protein Synthesis and Metabolism

    Mitochondria are eukaryotic organelles that originated from the endosymbiosis of an alpha-proteobacterium. To gain insight into the evolution of the mitochondrial proteome as it proceeded through the transition from a free-living cell to a specialized organelle, we compared a reconstructed ancestral proteome of the mitochondrion with the proteomes of alpha-proteobacteria as well as with the mitochondrial proteomes in yeast and man. Overall, there has been a large turnover of the mitochondrial proteome during the evolution of mitochondria. Early in the evolution of the mitochondrion, proteins involved in cell envelope synthesis have virtually disappeared, whereas proteins involved in replication, transcription, cell division, transport, regulation, and signal transduction have been replaced by eukaryotic proteins. More than half of what remains from the mitochondrial ancestor in modern mitochondria corresponds to translation, including post-translational modifications, and to metabolic pathways that are directly, or indirectly, involved in energy conversion. Altogether, the results indicate that the eukaryotic host has hijacked the proto-mitochondrion, taking control of its protein synthesis and metabolism

    Complex fate of paralogs

    BACKGROUND: Thanks to recent high coverage mass-spectrometry studies and reconstructed protein complexes, we are now in an unprecedented position to study the evolution of biological systems. Gene duplications, known to be a major source of innovation in evolution, can now be readily examined in the context of protein complexes. RESULTS: We observe that paralogs operating in the same complex fulfill different roles: mRNA dosage increase for more than a hundred cytosolic ribosomal proteins, mutually exclusive participation of at least 54 paralogs resulting in alternative forms of complexes, and 24 proteins contributing to bona fide structural growth. Inspection of paralogous proteins participating in two independent complexes shows that an ancient, pre-duplication protein functioned in both multi-protein assemblies and a gene duplication event allowed the respective copies to specialize and split their roles. CONCLUSION: Variants with conditionally assembled, paralogous subunits likely have played a role in yeast's adaptation to anaerobic conditions. In a number of cases the gene duplication has given rise to one duplicate that is no longer part of a protein complex and shows an accelerated rate of evolution. Such genes could provide the raw material for the evolution of new functions.

    A global definition of expression context is conserved between orthologs, but does not correlate with sequence conservation

    BACKGROUND: The massive scale of microarray derived gene expression data allows for a global view of cellular function. Thus far, comparative studies of gene expression between species have been based on the level of expression of the gene across corresponding tissues, or on the co-expression of the gene with another gene. RESULTS: To compare gene expression between distant species on a global scale, we introduce the "expression context". The expression context of a gene is based on the co-expression with all other genes that have unambiguous counterparts in both genomes. Employing this new measure, we show 1) that the expression context is largely conserved between orthologs, and 2) that sequence identity shows little correlation with expression context conservation after gene duplication and speciation. CONCLUSION: This means that the degree of sequence identity has a limited predictive quality for differential expression context conservation between orthologs, and thus presumably also for other facets of gene function

    Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly

    Motivation: Most microbial species cannot be cultured in the laboratory. Metagenomic sequencing may still yield a complete genome if the sequenced community is enriched and the sequencing coverage is high. However, the complexity in a natural population may cause the enrichment culture to contain multiple related strains. This diversity can confound existing strict assembly programs and lead to a fragmented assembly, which is unnecessary if we have a related reference genome available that can function as a scaffold

    Conserved co-expression for candidate disease gene prioritization

    BACKGROUND: Genes that are co-expressed tend to be involved in the same biological process. However, co-expression is not a very reliable predictor of functional links between genes. The evolutionary conservation of co-expression between species can be used to predict protein function more reliably than co-expression in a single species. Here we examine whether co-expression across multiple species is also a better prioritizer of disease genes than is co-expression between human genes alone. RESULTS: We use co-expression data from yeast (S. cerevisiae), nematode worm (C. elegans), fruit fly (D. melanogaster), mouse and human and find that the use of evolutionary conservation can indeed improve the predictive value of co-expression. The effect that genes causing the same disease have higher co-expression than do other genes from their associated disease loci is significantly enhanced when co-expression data are combined across evolutionarily distant species. We also find that performance can vary significantly depending on the co-expression datasets used, and just using more data does not necessarily lead to better prioritization. Instead, we find that dataset quality is more important than quantity, and using a consistent microarray platform per species leads to better performance than using more inclusive datasets pooled from various platforms. CONCLUSION: We find that evolutionarily conserved gene co-expression prioritizes disease candidate genes better than human gene co-expression alone, and provide the integrated data as a new resource for disease gene prioritization tools.

    Benchmarking ortholog identification methods using functional genomics data

    BACKGROUND: The transfer of functional annotations from model organism proteins to human proteins is one of the main applications of comparative genomics. Various methods are used to analyze cross-species orthologous relationships according to an operational definition of orthology. Often the definition of orthology is incorrectly interpreted as a prediction of proteins that are functionally equivalent across species, while in fact it only defines the existence of a common ancestor for a gene in different species. However, it has been demonstrated that orthologs often reveal significant functional similarity. Therefore, the quality of the orthology prediction is an important factor in the transfer of functional annotations (and other related information). To identify protein pairs with the highest possible functional similarity, it is important to qualify ortholog identification methods. RESULTS: To measure the similarity in function of proteins from different species we used functional genomics data, such as expression data and protein interaction data. We tested several of the most popular ortholog identification methods. In general, we observed a sensitivity/selectivity trade-off: the functional similarity scores per orthologous pair of sequences become higher when the number of proteins included in the ortholog groups decreases. CONCLUSION: By combining the sensitivity and the selectivity into an overall score, we show that the InParanoid program is the best ortholog identification method in terms of identifying functionally equivalent proteins

    Signature, a web server for taxonomic characterization of sequence samples using signature genes

    Signature genes are genes that are unique to a taxonomic clade and are common within it. They contain a wealth of information about clade-specific processes and hold a strong evolutionary signal that can be used to phylogenetically characterize a set of sequences, such as a metagenomics sample. As signature genes are based on gene content, they provide a means to assess the taxonomic origin of a sequence sample that is complementary to sequence-based analyses. Here, we introduce Signature (http://www.cmbi.ru.nl/signature), a web server that identifies the signature genes in a set of query sequences, and therewith phylogenetically characterizes it. The server produces a list of taxonomic clades that share signature genes with the set of query sequences, along with an insightful image of the tree of life, in which the clades are color coded based on the number of signature genes present. This allows the user to quickly see from which part(s) of the taxonomy the query sequences likely originate

    Asymmetric relationships between proteins shape genome evolution

    An investigation of metabolic networks in E. coli and S. cerevisiae reveals that asymmetric protein interactions affect gene expression, the relative effect of gene-knockouts and genome evolution

    Correlation between sequence conservation and the genomic context after gene duplication

    A key complication in comparative genomics for reliable gene function prediction is the existence of duplicated genes. To study the effect of gene duplication on function prediction, we analyze orthologs between pairs of genomes where in one genome the orthologous gene has duplicated after the speciation of the two genomes (i.e. inparalogs). For these duplicated genes we investigate whether the gene that is most similar on the sequence level is also the gene that has retained the ancestral gene-neighborhood. Although the majority of investigated cases show a consistent pattern between sequence similarity and gene-neighborhood conservation, a substantial fraction, 29–38%, is inconsistent. The observation of inconsistency is not the result of a chance outcome owing to a lack of divergence time between inparalogs, but rather it seems to be the result of a chance outcome caused by very similar rates of sequence evolution of both inparalogs relative to their ortholog. If one-to-one orthologous relationships are required, it is advisable to combine contextual information (i.e. gene-neighborhood in prokaryotes and co-expression in eukaryotes) with protein sequence information to predict the most probable functional equivalent ortholog in the presence of inparalogs