71 research outputs found

    Retrieving sequences of enzymes experimentally characterized but erroneously annotated: the case of the putrescine carbamoyltransferase

    BACKGROUND: Annotating genomes remains a hazardous task. Mistakes or gaps in such a complex process may occur when relevant knowledge is ignored, whether lost, forgotten or overlooked. This paper exemplifies an approach that could help to resuscitate such meaningful data. RESULTS: We show that a set of closely related sequences that have been annotated as ornithine carbamoyltransferases are actually putrescine carbamoyltransferases. This demonstration is based on the following points: (i) use of enzymatic data that had been overlooked, (ii) rediscovery of a short NH(2)-terminal sequence that allows a wrongly annotated ornithine carbamoyltransferase to be reannotated as a putrescine carbamoyltransferase, (iii) identification of conserved motifs that distinguish unambiguously between the two kinds of carbamoyltransferases, and (iv) comparative study of the gene context of these different sequences. CONCLUSIONS: We explain why this specific case of misannotation had not yet been described and draw attention to the fact that analogous instances must be rather frequent. We urge particular caution when high sequence similarity is coupled with an apparent lack of biochemical information. Moreover, from the point of view of genome annotation, proteins that have been studied experimentally but are not correlated with sequence data in current databases qualify as "orphans", just as unassigned genomic open reading frames do. The strategy we used in this paper to bridge such gaps in knowledge could work whenever it is possible to collect a body of facts about experimental data, homology, unnoticed sequence data, and accurate information about gene context

    Nutrient sensing modulates malaria parasite virulence

    The lifestyle of intracellular pathogens, such as malaria parasites, is intimately connected to that of their host, primarily for nutrient supply. Nutrients act not only as primary sources of energy but also as regulators of gene expression, metabolism and growth, through various signalling networks that enable cells to sense and adapt to varying environmental conditions. Canonical nutrient-sensing pathways are presumed to be absent from the causative agent of malaria, Plasmodium, thus raising the question of whether these parasites can sense and cope with fluctuations in host nutrient levels. Here we show that Plasmodium blood-stage parasites actively respond to host dietary calorie alterations through rearrangement of their transcriptome accompanied by substantial adjustment of their multiplication rate. A kinome analysis combined with chemical and genetic approaches identified KIN as a critical regulator that mediates sensing of nutrients and controls a transcriptional response to the host nutritional status. KIN shares homology with SNF1/AMPKα, and yeast complementation studies suggest that it is part of a functionally conserved cellular energy-sensing pathway. Overall, these findings reveal a key parasite nutrient-sensing mechanism that is critical for modulating parasite replication and virulence

    The Multifunctional LigB Adhesin Binds Homeostatic Proteins with Potential Roles in Cutaneous Infection by Pathogenic Leptospira interrogans

    Leptospirosis is a potentially fatal zoonotic disease in humans and animals caused by pathogenic spirochetes, such as Leptospira interrogans. The mode of transmission is commonly limited to the exposure of mucous membrane or damaged skin to water contaminated by leptospires shed in the urine of carriers, such as rats. Infection occurs during seasonal flooding of impoverished tropical urban habitats with large rat populations, but also during recreational activity in open water, suggesting that transmission is very efficient. LigA and LigB are surface-localized proteins in pathogenic Leptospira strains with properties that could facilitate the infection of damaged skin. Their expression is rapidly induced by the increase in osmolarity encountered by leptospires upon transition from water to host. In addition, the immunoglobulin-like repeats of the Lig proteins bind proteins that mediate attachment to host tissue, such as fibronectin, fibrinogen, collagens, laminin, and elastin, some of which are important in cutaneous wound healing and repair. Hemostasis is critical in a fresh injury, where fibrinogen from damaged vasculature mediates coagulation. We show that fibrinogen binding by recombinant LigB inhibits fibrin formation, which could aid leptospiral entry into the circulation, dissemination, and further infection by impairing healing. LigB also binds fibroblast fibronectin and type III collagen, two proteins prevalent in wound repair, thus potentially enhancing leptospiral adhesion to skin openings. LigA or LigB expression by transformation of a nonpathogenic saprophyte, L. biflexa, enhances bacterial adhesion to fibrinogen. Our results suggest that by binding homeostatic proteins found in cutaneous wounds, LigB could facilitate leptospirosis transmission. Both fibronectin and fibrinogen binding have been mapped to an overlapping domain in LigB comprising repeats 9–11, with repeat 11 possibly enhancing binding by a conformational effect. Leptospirosis patient antibodies react with the LigB domain, suggesting applications in diagnosis and vaccines that are currently limited by the strain-specific leptospiral lipopolysaccharide coats

    A critical discussion of the physics of wood–water interactions


    Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies

    Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%–63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with “overprediction” of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation

    An expanded evaluation of protein function prediction methods shows an improvement in accuracy

    Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent