    Protein complex detection with semi-supervised learning in protein interaction networks

    Background: Protein-protein interactions (PPIs) play fundamental roles in nearly all biological processes. The systematic analysis of PPI networks can enable a deeper understanding of cellular organization, processes and function. In this paper, we investigate the problem of protein complex detection from noisy protein interaction data, i.e., finding the subsets of proteins that are closely coupled via protein interactions. However, protein complexes are likely to overlap and the interaction data are very noisy, so analyzing this massive data effectively to detect biologically meaningful protein complexes is a great challenge.
    Results: Most existing approaches apply traditional unsupervised graph clustering methods to this problem. Here, we take a different view, redefining the properties and features of protein complexes and designing a "semi-supervised" method to address the problem. We use a neural network with a "semi-supervised" mechanism to detect the protein complexes. By retraining the neural network model recursively, we find optimized parameters for the model, which allows us to detect the protein complexes successfully. The comparison results show that our algorithm identifies protein complexes that are missed by other methods. We also show that our method achieves better precision and recall for the identified protein complexes than other existing methods. In addition, the proposed framework is easy to extend in the future.
    Conclusions: Using a weighted network to represent the protein interaction network is more appropriate than using a traditional unweighted network. In addition, integrating biological features and topological features to represent protein complexes is more meaningful than using dense subgraphs. Last, the "semi-supervised" learning model is a promising model for detecting protein complexes as more biological and topological features become available.

    Protein kinase CK2: Intricate relationships within regulatory cellular networks

    © 2017 by the authors. Licensee MDPI, Basel, Switzerland. Protein kinase CK2 is a small family of protein kinases that has been implicated in an expanding array of biological processes. While it is widely accepted that CK2 is a regulatory participant in a multitude of fundamental cellular processes, CK2 is often considered to be a constitutively active enzyme, which raises questions about how it can be a regulatory participant in intricately controlled cellular processes. To resolve this apparent paradox, we have performed a systematic analysis of the published literature using text mining as well as mining of proteomic databases together with computational assembly of networks that involve CK2. These analyses reinforce the notion that CK2 is involved in a broad variety of biological processes and also reveal an extensive interplay between CK2 phosphorylation and other post-translational modifications. The interplay between CK2 and other post-translational modifications suggests that CK2 does have intricate roles in orchestrating cellular events. In this respect, phosphorylation of specific substrates by CK2 could be regulated by other post-translational modifications and CK2 could also have roles in modulating other post-translational modifications. Collectively, these observations suggest that the actions of CK2 are precisely coordinated with other constituents of regulatory cellular networks.

    On Mining Biological Signals Using Correlation Networks

    Correlation networks have been used to analyze and model high-throughput biological data, such as gene expression from microarray or RNA-seq assays. Typically in biological network modeling, structures that represent biological functions can be mined from these networks; for example, a cluster of proteins in an interactome can represent a protein complex. In correlation networks built from high-throughput gene expression data, it has often been speculated or even assumed that clusters represent sets of genes that are coregulated. This research aims to validate this concept using network systems biology and data mining, by identifying correlation network clusters via multiple clustering approaches and cross-validating regulatory elements in these clusters via motif-finding software. The results show that the majority (81-100%) of genes in any given cluster share at least one predicted transcription factor binding site. With this in mind, new regulatory relationships can be proposed using known transcription factors and their binding sites by integrating regulatory information with the network model itself.
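
As a rough illustration of the correlation network model described in this abstract, the sketch below links two genes whenever the absolute Pearson correlation of their expression profiles reaches a cutoff. The gene names, the toy profiles and the 0.9 threshold are assumptions made for the example; the abstract does not state which correlation measure or cutoff the authors used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlation_network(expression, threshold=0.9):
    """Return edges (gene_a, gene_b, r) for gene pairs with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, r))
    return edges


# Toy expression profiles: hypothetical genes, four samples each.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises and falls together with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],  # only weakly related to the other two
}
edges = correlation_network(expression, threshold=0.9)
for gene_a, gene_b, r in edges:
    print(f"{gene_a} -- {gene_b}  r={r:.2f}")
```

Clusters would then be mined from the resulting edge list; the clustering and motif-finding steps mentioned in the abstract are outside the scope of this sketch.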

    Network target for screening synergistic drug combinations with application to traditional Chinese medicine

    Background: Multicomponent therapeutics offer bright prospects for the control of complex diseases in a synergistic manner. However, finding ways to screen synergistic combinations from numerous pharmacological agents is still an ongoing challenge.
    Results: In this work, we propose for the first time a "network target"-based paradigm, in place of the traditional "single target"-based paradigm, for virtual screening, and establish an algorithm termed NIMS (Network target-based Identification of Multicomponent Synergy) to prioritize synergistic agent combinations in a high-throughput way. NIMS treats a disease-specific biological network as a therapeutic target and assumes that the relationship among agents can be transferred to network interactions among the molecular-level entities (targets or responsive gene products) of the agents. Two parameters in NIMS, Topology Score and Agent Score, are then created to evaluate the synergistic relationship of each given agent combination. Taking the empirical multicomponent system of traditional Chinese medicine (TCM) as an illustrative case, we applied NIMS to prioritize synergistic agent pairs from 63 agents on a pathological process instanced by angiogenesis. NIMS not only recovered five known synergistic agent pairs but also produced synergistic candidates that were verified experimentally, for example combinations with the herbal ingredient Sinomenine, and it outperformed the meet/min method. The robustness of NIMS was also shown with respect to the background networks, agent genes and topological parameters. Finally, we characterized the potential mechanisms of multicomponent synergy from a network target perspective.
    Conclusions: NIMS is a first-step computational approach towards the identification of synergistic drug combinations at the molecular level. Network target-based approaches may change the current mode of virtual screening and provide a systematic paradigm for facilitating the development of multicomponent therapeutics as well as the modernization of TCM.

    ChIP-seq Defined Genome-Wide Map of TGFβ/SMAD4 Targets: Implications with Clinical Outcome of Ovarian Cancer

    Deregulation of the transforming growth factor-β (TGFβ) signaling pathway in epithelial ovarian cancer has been reported, but the precise mechanism underlying disrupted TGFβ signaling in the disease remains unclear. We performed chromatin immunoprecipitation followed by sequencing (ChIP-seq) to screen genome-wide for TGFβ-induced SMAD4 binding in epithelial ovarian cancer. Following TGFβ stimulation of the A2780 epithelial ovarian cancer cell line, we identified 2,362 SMAD4 binding loci and 318 differentially expressed SMAD4 target genes. Comprehensive examination of SMAD4-bound loci revealed four distinct binding patterns: 1) Basal; 2) Shift; 3) Stimulated Only; 4) Unstimulated Only. TGFβ-stimulated SMAD4-bound loci were primarily classified as either Stimulated Only (74%) or Shift (25%), indicating that TGFβ stimulation alters SMAD4 binding patterns in epithelial ovarian cancer cells. Furthermore, based on gene regulatory network analysis, we determined that the TGFβ-induced, SMAD4-dependent regulatory network was strikingly different in ovarian cancer compared to normal cells. Importantly, the TGFβ/SMAD4 target genes identified in the A2780 epithelial ovarian cancer cell line were predictive of patient survival, based on in silico mining of publicly available patient databases. In conclusion, our data highlight the utility of next-generation sequencing technology to identify genome-wide SMAD4 target genes in epithelial ovarian cancer and link aberrant TGFβ/SMAD signaling to ovarian tumorigenesis. Furthermore, the identified SMAD4 binding loci, combined with gene expression profiling and in silico data mining of patient cohorts, may provide a powerful approach to determine potential gene signatures for biological and future translational research in ovarian and other cancers.

    A Parallel Template for Implementing Filters for Biological Correlation Networks

    High-throughput biological experiments are critical for their role in systems biology: the ability to survey the state of cellular mechanisms on a broad scale opens possibilities for the scientific researcher to understand how multiple components come together, and what goes wrong in disease states. However, the data returned from these experiments is massive and heterogeneous, and requires intuitive and clever computational algorithms for analysis. The correlation network model has been proposed as a tool for modeling and analysis of this high-throughput data; structures within the model identified by graph theory have been found to represent key players in major cellular pathways. Previous work has found that network filtering using graph-theoretic structural concepts can reduce noise and strengthen biological signals in these networks. However, filtering biological networks with such filters is computationally intensive, and the filtered networks remain large. In this research, we develop a parallel template for these network filters to improve runtime, and use this high-performance environment to show that parallelization does not affect network structure or the biological function of that structure.

    Analysis of Protein-Protein Interaction Networks Using High Performance Scalable Tools

    Protein-protein interaction (PPI) research currently generates an extraordinary number of publications and considerable interest among computer scientists and biologists alike because of the underlying potential of the source material that researchers can work with. PPI networks are the networks of protein complexes formed by biochemical events or electrostatic forces serving a biological function [1]. As the analysis of protein networks grows, we gain more information about proteins, genomes and their influence on life. Today, PPI networks are used to study diseases, improve drugs and understand other processes in medicine and health. Although PPI network research is considered extremely important in the field, there is an issue: too few people have enough interdisciplinary knowledge of both biology and computer science, and this limits the rate of progress. Most biologists who are not expert coders need a way of calculating graph values and information that will help them analyze the graphs better without having to manipulate the data themselves. In this research, I test a few ways of achieving results using available frameworks and algorithms, present the results and compare each method's efficacy. My analysis is carried out on very large datasets, where I calculate several centralities and other graph data using different metrics, and I also visualize them in order to gain further insight. I also note the effect of MPI and multithreading on the results obtained, which suggests that building scalable tools will improve the analysis immensely.
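
To make "calculating centralities" concrete, here is a minimal single-threaded sketch of two common measures, degree centrality and closeness centrality, on a tiny undirected interaction graph. The protein names and edges are invented for the example, and the abstract does not say which centralities or frameworks were actually used, so this should be read as an illustration of the kind of graph values involved and not as the author's method.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def closeness_centrality(adjacency, source):
    """Reachable-node count divided by the sum of shortest-path distances."""
    distances = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search from the source node
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


# Hypothetical interactions between four proteins.
ppi_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P3", "P4")]
graph = build_graph(ppi_edges)
degree = degree_centrality(graph)
closeness = {node: closeness_centrality(graph, node) for node in graph}
```

Each closeness value needs its own breadth-first search, so the per-node computations are independent of one another; that independence is what makes this kind of analysis a natural candidate for the MPI and multithreading approaches the abstract mentions.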

    Clique-based data mining for related genes in a biomedical database

    Background: Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely connected subgraphs, which are modeled as cliques, in a biomedical relational graph.
    Results: We constructed a graph whose nodes were gene or disease pages and whose edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes.
    Conclusion: We presented a data mining approach that extracts related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.
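
The clique enumeration step can be sketched with the classic Bron-Kerbosch algorithm with pivoting, shown below on a toy graph. The abstract does not say which enumeration algorithm the authors used, so Bron-Kerbosch is a stand-in here; the page names are hypothetical, and treating hyperlinks as undirected edges is an assumption of the example.

```python
def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def maximal_cliques(adjacency):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield set(clique)  # nothing can extend this clique: it is maximal
            return
        # Pick the pivot that covers the most candidates to prune branches.
        pivot = max(candidates | excluded,
                    key=lambda n: len(adjacency[n] & candidates))
        for node in list(candidates - adjacency[pivot]):
            yield from expand(clique | {node},
                              candidates & adjacency[node],
                              excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(adjacency), set())


# Hypothetical pages: three mutually linked gene pages and one disease page.
links = [
    ("GENE1", "GENE2"),
    ("GENE1", "GENE3"),
    ("GENE2", "GENE3"),
    ("GENE3", "DISEASE_X"),
]
graph = build_graph(links)
modules = sorted(sorted(clique) for clique in maximal_cliques(graph))
for module in modules:
    print(module)
```

On this toy graph the three gene pages form one module and the GENE3/DISEASE_X link forms another. The number of maximal cliques can grow exponentially with graph size, so a database-scale graph would need care with memory and runtime.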