10 research outputs found

    A direct comparison of protein interaction confidence assignment schemes

    BACKGROUND: Recent technological advances have enabled high-throughput measurements of protein-protein interactions in the cell, producing large protein interaction networks for various species at an ever-growing pace. However, common technologies like yeast two-hybrid may experience high rates of false positive detection. To combat false positive discoveries, a number of different methods have been recently developed that associate confidence scores with protein interactions. Here, we perform a rigorous comparative analysis and performance assessment among these different methods. RESULTS: We measure the extent to which each set of confidence scores correlates with similarity of the interacting proteins in terms of function, expression, pattern of sequence conservation, and homology to interacting proteins in other species. We also employ a new metric, the Signal-to-Noise Ratio of protein complexes embedded in each network, to assess the power of the different methods. Seven confidence assignment schemes, including those of Bader et al., Deane et al., Deng et al., Sharan et al., and Qi et al., are compared in this work. CONCLUSION: Although the performance of each assignment scheme varies depending on the particular metric used for assessment, we observe that Deng et al. yields the best performance overall (in three out of four viable measures). Importantly, we also find that utilizing any of the probability assignment schemes is always more beneficial than assuming all observed interactions to be true or equally likely

    eQED: an efficient method for interpreting eQTL associations using protein networks

    Analysis of expression quantitative trait loci (eQTLs) is an emerging technique in which individuals are genotyped across a panel of genetic markers and, simultaneously, phenotyped using DNA microarrays. Because of the spacing of markers and linkage disequilibrium, each marker may be near many genes, making it difficult to finely map which of these genes are the causal factors responsible for the observed changes in the downstream expression. To address this challenge, we present an efficient method for prioritizing candidate genes at a locus. This approach, called 'eQTL electrical diagrams' (eQED), integrates eQTLs with protein interaction networks by modeling the two data sets as a wiring diagram of current sources and resistors. eQED achieved a 79% accuracy in recovering a reference set of regulator–target pairs in yeast, which is significantly higher than the performance of three competing methods. eQED also annotates 368 protein–protein interactions with their directionality of information flow with an accuracy of approximately 75%.
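
The wiring-diagram idea in the abstract above can be illustrated with a toy circuit. The following is a minimal sketch and not the authors' implementation: it assumes that the locus acts as a unit current source, that the target gene is grounded, that each interaction is a resistor with a given conductance, and that candidates are ranked by the current they receive from the locus. The function names (`solve_linear`, `candidate_currents`), the node names, and the conductance values are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def candidate_currents(edges, source, ground, candidates):
    """Return the current flowing from `source` into each candidate node.

    edges: list of (node_u, node_v, conductance) resistors.
    Assumes every node is connected to `ground`; otherwise the
    system below is singular.
    """
    nodes = sorted({n for u, v, _ in edges for n in (u, v)} - {ground})
    idx = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    # Graph Laplacian with the grounded node removed (Kirchhoff's current law).
    lap = [[0.0] * size for _ in range(size)]
    for u, v, g in edges:
        for a, b in ((u, v), (v, u)):
            if a != ground:
                lap[idx[a]][idx[a]] += g
                if b != ground:
                    lap[idx[a]][idx[b]] -= g
    rhs = [0.0] * size
    rhs[idx[source]] = 1.0  # inject one unit of current at the locus
    volt = dict(zip(nodes, solve_linear(lap, rhs)))
    volt[ground] = 0.0
    # Ohm's law on each resistor joining the source to a candidate.
    currents = {c: 0.0 for c in candidates}
    for u, v, g in edges:
        if u == source and v in currents:
            currents[v] += g * (volt[source] - volt[v])
        elif v == source and u in currents:
            currents[u] += g * (volt[source] - volt[u])
    return currents


# Hypothetical toy network: geneA reaches the target directly,
# geneB only through one intermediate protein.
toy_edges = [
    ("locus", "geneA", 1.0),
    ("locus", "geneB", 1.0),
    ("geneA", "target", 1.0),
    ("geneB", "mid", 1.0),
    ("mid", "target", 1.0),
]
ranking = candidate_currents(toy_edges, "locus", "target", ["geneA", "geneB"])
best = max(ranking, key=ranking.get)
print(best, ranking)
```

In this toy circuit the candidate with the shorter, lower-resistance route to the target carries more current, which is the intuition behind using current as a prioritization score. How the published method sets conductances from eQTL and interaction data is not stated in the abstract and is not modeled here.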

    Network-Based Elucidation of Human Disease Similarities Reveals Common Functional Modules Enriched for Pluripotent Drug Targets

    Current work in elucidating relationships between diseases has largely been based on pre-existing knowledge of disease genes. Consequently, these studies are limited in their discovery of new and unknown disease relationships. We present the first quantitative framework to compare and contrast diseases by an integrated analysis of disease-related mRNA expression data and the human protein interaction network. We identified 4,620 functional modules in the human protein network and provided a quantitative metric to record their responses in 54 diseases leading to 138 significant similarities between diseases. Fourteen of the significant disease correlations also shared common drugs, supporting the hypothesis that similar diseases can be treated by the same drugs, allowing us to make predictions for new uses of existing drugs. Finally, we also identified 59 modules that were dysregulated in at least half of the diseases, representing a common disease-state “signature”. These modules were significantly enriched for genes that are known to be drug targets. Interestingly, drugs known to target these genes/proteins are already known to treat significantly more diseases than drugs targeting other genes/proteins, highlighting the importance of these core modules as prime therapeutic opportunities

    Evolutionarily Conserved Herpesviral Protein Interaction Networks

    Herpesviruses constitute a family of large DNA viruses widely spread in vertebrates and causing a variety of different diseases. They possess dsDNA genomes ranging from 120 to 240 kbp, encoding between 70 and 170 open reading frames. We previously reported the protein interaction networks of two herpesviruses, varicella-zoster virus (VZV) and Kaposi's sarcoma-associated herpesvirus (KSHV). In this study, we systematically tested three additional herpesvirus species, herpes simplex virus 1 (HSV-1), murine cytomegalovirus and Epstein-Barr virus, for protein interactions in order to perform a comparative analysis of all three herpesvirus subfamilies. We identified 735 interactions by genome-wide yeast two-hybrid (Y2H) screens and, together with the interactomes of VZV and KSHV, included a total of 1,007 intraviral protein interactions in the analysis. Whereas a large number of interactions have not been reported previously, we were able to identify a core set of highly conserved protein interactions, such as the interaction between HSV-1 UL33 and the nuclear egress proteins UL31/UL34. Interactions were conserved between orthologous proteins despite generally low sequence similarity, suggesting that function may be more conserved than sequence. By combining interactomes of different species, we were able to systematically address the low coverage of the Y2H system and to extract biologically relevant interactions that were not evident from single species.

    Understanding cellular function through the analysis of protein interaction networks

    A major challenge of post-genomic biology is understanding the complex networks of interacting genes, proteins and small molecules that give rise to biological form and function. Advances in whole-genome approaches are now enabling us to characterize these networks systematically, using procedures such as the two-hybrid assay and protein co-immunoprecipitation to screen for protein-protein interactions (PPI). Large protein networks are now available for many species, such as baker's yeast, worm, fruit fly and the malaria parasite P. falciparum. These data also introduce a number of technical challenges: how to separate true protein-protein interactions from false positives; how to annotate interactions with functional roles; and, ultimately, how to organize large-scale interaction data into models of cellular signaling and machinery. Further, as protein interactions form the backbone of cellular function, they can potentially be used in conjunction with other large-scale data types to gain more insight into the functioning of the cell. In this dissertation, I try to address some of the above questions that arise during the analysis of protein networks. First, I describe a new method to assign confidence scores to protein interactions derived from large-scale studies. Subsequently, I perform a benchmarking analysis to compare its performance with other existing methods. Next, I extend the network comparison algorithm, NetworkBLAST, to compare protein networks across multiple species. In particular, to elucidate cellular machinery on a global scale, I performed a multiple comparison of the protein-protein interaction networks of C. elegans, D. melanogaster and S. cerevisiae. This comparison integrated protein interaction and sequence information to reveal 71 network regions that were conserved across all three species and many exclusive to the metazoans. I then applied this technique to the analysis of the protein network of the malaria pathogen Plasmodium falciparum and showed that its patterns of interaction, like its genome sequence, set it apart from other species. Finally, I integrated the PPI network data with expression quantitative trait loci (eQTL) data in yeast to efficiently interpret them. I present an efficient method, called 'eQTL Electrical Diagrams' (eQED), that integrates eQTLs with protein interaction networks by modeling the two data sets as a wiring diagram of current sources and resistors. eQED achieved a 79% accuracy in recovering a reference set of regulator-target pairs in yeast, which is significantly higher performance than three competing methods. eQED also annotates 368 protein-protein interactions with their directionality of information flow with an accuracy of approximately 75%.