
    Inference of Functional Relations in Predicted Protein Networks with a Machine Learning Approach

    Background: Molecular biology is currently facing the challenging task of functionally characterizing the proteome. The large number of possible protein-protein interactions and complexes, the variety of environmental conditions and cellular states in which these interactions can be reorganized, and the multiple ways in which a protein can influence the function of others require the development of experimental and computational approaches to analyze and predict functional associations between proteins as part of their activity in the interactome. Methodology/Principal Findings: We have studied the possibility of constructing a classifier to combine the output of several protein interaction prediction methods. The AODE (Averaged One-Dependence Estimators) machine learning algorithm is a suitable choice in this case: it provides better results than the individual prediction methods and performs better than the other alternative methods tested in this experimental setup. To illustrate the potential use of this new AODE-based Predictor of Protein InterActions (APPIA) when analyzing high-throughput experimental data, we show how it helps to filter the results of published high-throughput proteomic studies, ranking functionally related pairs in a significant way. Availability: All the predictions of the individual methods and of the combined APPIA predictor, together with the datasets of functional associations used, are available at http://ecid.bioinfo.cnio.es/. Conclusions: We propose a strategy that integrates the main current computational techniques used to predict functional associations into a unified classifier system, specifically focusing on the evaluation of poorly characterized protein pairs. We selected the AODE classifier as the appropriate tool to perform this task. AODE is particularly useful to extract valuable information from large unbalanced and heterogeneous data sets. The combination of the information provided by five interaction prediction methods with some simple sequence features in APPIA is useful in establishing reliability values and helpful to prioritize functional interactions that can be further experimentally characterized.
    This work was funded by the BioSapiens (grant number LSHG-CT-2003-503265) and the Experimental Network for Functional Integration (ENFIN) Networks of Excellence (contract number LSHG-CT-2005-518254), by Consolider BSC (grant number CSD2007-00050) and by the project “Functions for gene sets” from the Spanish Ministry of Education and Science (BIO2007-66855). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies

    Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%–63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with “overprediction” of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.

    Klebsiella pneumoniae Multiresistance Plasmid pMET1: Similarity with the Yersinia pestis Plasmid pCRY and Integrative Conjugative Elements

    Dissemination of antimicrobial resistance genes has become an important public health and biodefense threat. Plasmids are important contributors to the rapid acquisition of antibiotic resistance by pathogenic bacteria. The nucleotide sequence of the Klebsiella pneumoniae multiresistance plasmid pMET1 comprises 41,723 bp and includes Tn1331.2, a transposon that carries the bla(TEM-1) gene and a perfect duplication of a 3-kbp region including the aac(6')-Ib, aadA1, and bla(OXA-9) genes. The replication region of pMET1 has been identified. Replication is independent of DNA polymerase I, and the replication region is highly related to that of the cryptic Yersinia pestis 91001 plasmid pCRY. The potential partition region has the general organization known as the parFG locus. The self-transmissible pMET1 plasmid includes a type IV secretion system consisting of proteins that make up the mating pair formation complex (Mpf) and the DNA transfer (Dtr) system. The Mpf is highly related to those in the plasmid pCRY, the mobilizable high-pathogenicity island from E. coli ECOR31 (HPI(ECOR31)), which has been proposed to be an integrative conjugative element (ICE) progenitor of high-pathogenicity islands in other Enterobacteriaceae including Yersinia species, and ICE(Kp1), an ICE found in a K. pneumoniae strain causing primary liver abscess. The Dtr MobB and MobC proteins are highly related to those of pCRY, but the endonuclease is related to that of plasmid pK245 and has no significant homology with the protein of similar function in pCRY. The region upstream of mobB includes the putative oriT and shares 90% identity with the same region in the HPI(ECOR31). The comparative analyses of pMET1 with pCRY, HPI(ECOR31), and ICE(Kp1) show a very active rate of genetic exchanges between Enterobacteriaceae including Yersinia species, which represents a high public health and biodefense threat due to transfer of multiple resistance genes to pathogenic Yersinia strains.

    Molecular control of HIV-1 postintegration latency: implications for the development of new therapeutic strategies

    The persistence of HIV-1 latent reservoirs represents a major barrier to virus eradication in infected patients under HAART, since interruption of the treatment inevitably leads to a rebound of plasma viremia. Latency is established early after infection, notably (but not only) in resting memory CD4+ T cells, and involves numerous host and viral trans-acting proteins, as well as processes such as transcriptional interference, RNA silencing, epigenetic modifications and chromatin organization. In order to eliminate latent reservoirs, new strategies are envisaged that consist of reactivating HIV-1 transcription in latently infected cells while maintaining HAART in order to prevent de novo infection. The difficulty lies in the fact that a single residual latently infected cell can in theory rekindle the infection. Here, we review our current understanding of the molecular mechanisms involved in the establishment and maintenance of HIV-1 latency and in the transcriptional reactivation from latency. We highlight the potential of new therapeutic strategies based on this understanding of latency. Combinations of various compounds used simultaneously allow for the targeting of transcriptional repression at multiple levels and can facilitate the escape from latency and the clearance of viral reservoirs. We describe the current advantages and limitations of immune T-cell activators, inducers of the NF-κB signaling pathway, and inhibitors of deacetylases and histone- and DNA-methyltransferases, used alone or in combinations. While a solution will not be achieved by tomorrow, the battle against HIV-1 latent reservoirs is well underway.