4,404 research outputs found

    Protein-protein interactions: network analysis and applications in drug discovery

    Get PDF
    Physical interactions among proteins constitute the backbone of cellular function, making them an attractive source of therapeutic targets. Although the challenges associated with targeting protein-protein interactions (PPIs) -in particular with small molecules are considerable, a growing number of functional PPI modulators is being reported and clinically evaluated. An essential starting point for PPI inhibitor screening or design projects is the generation of a detailed map of the human interactome and the interactions between human and pathogen proteins. Different routes to produce these biological networks are being combined, including literature curation and computational methods. Experimental approaches to map PPIs mainly rely on the yeast two-hybrid (Y2H) technology, which have recently shown to produce reliable protein networks. However, other genetic and biochemical methods will be essential to increase both coverage and resolution of current protein networks in order to increase their utility towards the identification of novel disease-related proteins and PPIs, and their potential use as therapeutic targets

    Bayesian correlated clustering to integrate multiple datasets

    Get PDF
    Motivation: The integration of multiple datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct – but often complementary – information. We present a Bayesian method for the unsupervised integrative modelling of multiple datasets, which we refer to as MDI (Multiple Dataset Integration). MDI can integrate information from a wide range of different datasets and data types simultaneously (including the ability to model time series data explicitly using Gaussian processes). Each dataset is modelled using a Dirichlet-multinomial allocation (DMA) mixture model, with dependencies between these models captured via parameters that describe the agreement among the datasets. Results: Using a set of 6 artificially constructed time series datasets, we show that MDI is able to integrate a significant number of datasets simultaneously, and that it successfully captures the underlying structural similarity between the datasets. We also analyse a variety of real S. cerevisiae datasets. In the 2-dataset case, we show that MDI’s performance is comparable to the present state of the art. We then move beyond the capabilities of current approaches and integrate gene expression, ChIP-chip and protein-protein interaction data, to identify a set of protein complexes for which genes are co-regulated during the cell cycle. Comparisons to other unsupervised data integration techniques – as well as to non-integrative approaches – demonstrate that MDI is very competitive, while also providing information that would be difficult or impossible to extract using other methods

    Hot-spot analysis for drug discovery targeting protein-protein interactions

    Get PDF
    Introduction: Protein-protein interactions are important for biological processes and pathological situations, and are attractive targets for drug discovery. However, rational drug design targeting protein-protein interactions is still highly challenging. Hot-spot residues are seen as the best option to target such interactions, but their identification requires detailed structural and energetic characterization, which is only available for a tiny fraction of protein interactions. Areas covered: In this review, the authors cover a variety of computational methods that have been reported for the energetic analysis of protein-protein interfaces in search of hot-spots, and the structural modeling of protein-protein complexes by docking. This can help to rationalize the discovery of small-molecule inhibitors of protein-protein interfaces of therapeutic interest. Computational analysis and docking can help to locate the interface, molecular dynamics can be used to find suitable cavities, and hot-spot predictions can focus the search for inhibitors of protein-protein interactions. Expert opinion: A major difficulty for applying rational drug design methods to protein-protein interactions is that in the majority of cases the complex structure is not available. Fortunately, computational docking can complement experimental data. An interesting aspect to explore in the future is the integration of these strategies for targeting PPIs with large-scale mutational analysis.This work has been funded by grants BIO2016-79930-R and SEV-2015-0493 from the Spanish Ministry of Economy, Industry and Competitiveness, and grant EFA086/15 from EU Interreg V POCTEFA. M Rosell is supported by an FPI fellowship from the Severo Ochoa program. The authors are grateful for the support of the the Joint BSC-CRG-IRB Programme in Computational Biology.Peer ReviewedPostprint (author's final draft

    Complex-based analysis of dysregulated cellular processes in cancer

    Full text link
    Background: Differential expression analysis of (individual) genes is often used to study their roles in diseases. However, diseases such as cancer are a result of the combined effect of multiple genes. Gene products such as proteins seldom act in isolation, but instead constitute stable multi-protein complexes performing dedicated functions. Therefore, complexes aggregate the effect of individual genes (proteins) and can be used to gain a better understanding of cancer mechanisms. Here, we observe that complexes show considerable changes in their expression, in turn directed by the concerted action of transcription factors (TFs), across cancer conditions. We seek to gain novel insights into cancer mechanisms through a systematic analysis of complexes and their transcriptional regulation. Results: We integrated large-scale protein-interaction (PPI) and gene-expression datasets to identify complexes that exhibit significant changes in their expression across different conditions in cancer. We devised a log-linear model to relate these changes to the differential regulation of complexes by TFs. The application of our model on two case studies involving pancreatic and familial breast tumour conditions revealed: (i) complexes in core cellular processes, especially those responsible for maintaining genome stability and cell proliferation (e.g. DNA damage repair and cell cycle) show considerable changes in expression; (ii) these changes include decrease and countering increase for different sets of complexes indicative of compensatory mechanisms coming into play in tumours; and (iii) TFs work in cooperative and counteractive ways to regulate these mechanisms. Such aberrant complexes and their regulating TFs play vital roles in the initiation and progression of cancer.Comment: 22 pages, BMC Systems Biolog

    TF2Network : predicting transcription factor regulators and gene regulatory networks in Arabidopsis using publicly available binding site information

    Get PDF
    A gene regulatory network (GRN) is a collection of regulatory interactions between transcription factors (TFs) and their target genes. GRNs control different biological processes and have been instrumental to understand the organization and complexity of gene regulation. Although various experimental methods have been used to map GRNs in Arabidop-sis thaliana, their limited throughput combined with the large number of TFs makes that for many genes our knowledge about regulating TFs is incomplete. We introduce TF2Network, a tool that exploits the vast amount of TF binding site information and enables the delineation of GRNs by detecting potential regulators for a set of co-expressed or functionally related genes. Validation using two experimental benchmarks reveals that TF2Network predicts the correct regulator in 75-92% of the test sets. Furthermore, our tool is robust to noise in the input gene sets, has a low false discovery rate, and shows a better performance to recover correct regulators compared to other plant tools. TF2Network is accessible through a web interface where GRNs are interactively visualized and annotated with various types of experimental functional information. TF2Network was used to perform systematic functional and regulatory gene annotations, identifying new TFs involved in circadian rhythm and stress response

    Towards the Identification of Protein Complexes and Functional Modules by Integrating PPI Network and Gene Expression Data

    Get PDF
    Background: Identification of protein complexes and functional modules from protein-protein interaction (PPI) networks is crucial to understanding the principles of cellular organization and predicting protein functions. In the past few years, many computational methods have been proposed. However, most of them considered the PPI networks as static graphs and overlooked the dynamics inherent within these networks. Moreover, few of them can distinguish between protein complexes and functional modules. Results: In this paper, a new framework is proposed to distinguish between protein complexes and functional modules by integrating gene expression data into protein-protein interaction (PPI) data. A series of time-sequenced subnetworks (TSNs) is constructed according to the time that the interactions were activated. The algorithm TSN-PCD was then developed to identify protein complexes from these TSNs. As protein complexes are significantly related to functional modules, a new algorithm DFM-CIN is proposed to discover functional modules based on the identified complexes. The experimental results show that the combination of temporal gene expression data with PPI data contributes to identifying protein complexes more precisely. A quantitative comparison based on f-measure reveals that our algorithm TSN-PCD outperforms the other previous protein complex discovery algorithms. Furthermore, we evaluate the identified functional modules by using “Biological Process” annotated in GO (Gene Ontology). The validation shows that the identified functional modules are statistically significant in terms of “Biological Process”. More importantly, the relationship between protein complexes and functional modules are studied. Conclusions: The proposed framework based on the integration of PPI data and gene expression data makes it possible to identify protein complexes and functional modules more effectively. Moveover, the proposed new framework and algorithms can distinguish between protein complexes and functional modules. Our findings suggest that functional modules are closely related to protein complexes and a functional module may consist of one or multiple protein complexes. The program is available at http://netlab.csu.edu.cn/bioinfomatics/limin/DFM-CIN/index
    corecore