Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes
Complexes of physically interacting proteins constitute fundamental
functional units responsible for driving biological processes within cells. A
faithful reconstruction of the entire set of complexes is therefore essential
to understand the functional organization of cells. In this review, we discuss
the key contributions of computational methods developed to date
(approximately between 2003 and 2015) for identifying complexes from the
network of interacting proteins (PPI network). We evaluate in depth the
performance of these methods on PPI datasets from yeast, and highlight
challenges faced by these methods, in particular the detection of sparse and small
complexes or sub-complexes and the discerning of overlapping complexes. We describe methods
for integrating diverse information including expression profiles and 3D
structures of proteins with PPI networks to understand the dynamics of complex
formation, for instance, of time-based assembly of complex subunits and
formation of fuzzy complexes from intrinsically disordered proteins. Finally,
we discuss methods for identifying dysfunctional complexes in human diseases,
an application that is proving invaluable to understand disease mechanisms and
to discover novel therapeutic targets. We hope this review aptly commemorates a
decade of research on computational prediction of complexes and constitutes a
valuable reference for further advancements in this exciting area. Comment: 1 Tabl
Mapping genetic interactions in cancer: a road to rational combination therapies.
The discovery of synthetic lethal interactions between poly (ADP-ribose) polymerase (PARP) inhibitors and BRCA genes, which are involved in homologous recombination, led to the approval of PARP inhibition as a monotherapy for patients with BRCA1/2-mutated breast or ovarian cancer. Studies following the initial observation of synthetic lethality demonstrated that the reach of PARP inhibitors is well beyond just BRCA1/2 mutants. Insights into the mechanisms of action of anticancer drugs are fundamental for the development of targeted monotherapies or rational combination treatments that will synergize to promote cancer cell death and overcome mechanisms of resistance. The development of targeted therapeutic agents is premised on mapping the physical and functional dependencies of mutated genes in cancer. An important part of this effort is the systematic screening of genetic interactions in a variety of cancer types. Until recently, genetic-interaction screens have relied either on the pairwise perturbations of two genes or on the perturbation of genes of interest combined with inhibition by commonly used anticancer drugs. Here, we summarize recent advances in mapping genetic interactions using targeted, genome-wide, and high-throughput genetic screens, and we discuss the therapeutic insights obtained through such screens. We further focus on factors that should be considered in order to develop a robust analysis pipeline. Finally, we discuss the integration of functional interaction data with orthogonal methods and suggest that such approaches will increase the reach of genetic-interaction screens for the development of rational combination therapies
Network-based approaches to explore complex biological systems towards network medicine
Network medicine relies on different types of networks: from the molecular level of protein–protein interactions to gene regulatory networks and correlation studies of gene expression. Among network approaches based on the analysis of the topological properties of protein–protein interaction (PPI) networks, we discuss the widespread DIAMOnD (disease module detection) algorithm. Starting from the assumption that PPI networks can be viewed as maps where diseases can be identified with localized perturbation within a specific neighborhood (i.e., disease modules), DIAMOnD performs a systematic analysis of the human PPI network to uncover new disease-associated genes by exploiting the connectivity significance instead of connection density. The past few years have witnessed increasing interest in understanding the molecular mechanisms of post-transcriptional regulation, with a special emphasis on non-coding RNAs, since they are emerging as key regulators of many cellular processes in both physiological and pathological states. Recent findings show that coding genes are not the only targets that microRNAs interact with. In fact, there is a pool of different RNAs, including long non-coding RNAs (lncRNAs), competing with each other to attract microRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The framework of regulatory networks provides a powerful tool to gather new insights into ceRNA regulatory mechanisms. Here, we describe a data-driven model recently developed to explore the lncRNA-associated ceRNA activity in breast invasive carcinoma. On the other hand, a very promising example of the co-expression network is the one implemented by the software SWIM (switch miner), which combines topological properties of correlation networks with gene expression data in order to identify a small pool of genes, called switch genes, critically associated with drastic changes in cell phenotype.
Here, we describe the SWIM tool along with its applications to cancer research and compare its predictions with DIAMOnD disease genes
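The contrast between connectivity significance and connection density can be illustrated with a small sketch. It assumes that connectivity significance is scored as a hypergeometric tail probability: the chance that a node of a given degree has at least as many links to seed (known disease) genes as observed, if its links were placed at random. The toy network, the gene labels and the function names are hypothetical and serve only as illustration; this is not the DIAMOnD or SWIM implementation.

```python
from math import comb


def connectivity_pvalue(n_nodes, n_seeds, degree, seed_links):
    """Hypergeometric tail: P(at least `seed_links` of `degree` links hit a seed)."""
    total = comb(n_nodes, degree)
    upper = min(degree, n_seeds)
    tail = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, degree - i)
        for i in range(seed_links, upper + 1)
    )
    return tail / total


def rank_candidates(adjacency, seeds):
    """Rank non-seed nodes by ascending p-value (most significant first)."""
    n_nodes = len(adjacency)
    scores = {}
    for node, neighbours in adjacency.items():
        if node in seeds:
            continue
        scores[node] = connectivity_pvalue(
            n_nodes, len(seeds), len(neighbours), len(neighbours & seeds)
        )
    return sorted(scores, key=scores.get), scores


# Hypothetical toy network: A and B are seeds; D is a hub with one seed link.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E", "F", "G"},
    "E": {"D", "F"},
    "F": {"D", "E", "G"},
    "G": {"D", "F"},
}
toy_seeds = {"A", "B"}

ranking, pvalues = rank_candidates(toy_network, toy_seeds)
print(ranking[:2], pvalues["C"], pvalues["D"])
```

In this toy case, node C has only two links, but both reach seeds, so it ranks ahead of the better-connected hub D, whose single seed link is unsurprising given its degree.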
Complex-based analysis of dysregulated cellular processes in cancer
Background: Differential expression analysis of (individual) genes is often
used to study their roles in diseases. However, diseases such as cancer are a
result of the combined effect of multiple genes. Gene products such as proteins
seldom act in isolation, but instead constitute stable multi-protein complexes
performing dedicated functions. Therefore, complexes aggregate the effect of
individual genes (proteins) and can be used to gain a better understanding of
cancer mechanisms. Here, we observe that complexes show considerable changes in
their expression, in turn directed by the concerted action of transcription
factors (TFs), across cancer conditions. We seek to gain novel insights into
cancer mechanisms through a systematic analysis of complexes and their
transcriptional regulation.
Results: We integrated large-scale protein-interaction (PPI) and
gene-expression datasets to identify complexes that exhibit significant changes
in their expression across different conditions in cancer. We devised a
log-linear model to relate these changes to the differential regulation of
complexes by TFs. The application of our model on two case studies involving
pancreatic and familial breast tumour conditions revealed: (i) complexes in
core cellular processes, especially those responsible for maintaining genome
stability and cell proliferation (e.g. DNA damage repair and cell cycle) show
considerable changes in expression; (ii) these changes include decrease and
countering increase for different sets of complexes indicative of compensatory
mechanisms coming into play in tumours; and (iii) TFs work in cooperative and
counteractive ways to regulate these mechanisms. Such aberrant complexes and
their regulating TFs play vital roles in the initiation and progression of
cancer. Comment: 22 pages, BMC Systems Biolog
Automated design of bacterial genome sequences
Background:
Organisms have evolved ways of regulating transcription to better adapt to varying environments. Could the current functional genomics data and models support the possibility of engineering a genome with completely rearranged gene organization while the cell maintains its behavior under environmental challenges? How would we proceed to design a full nucleotide sequence for such genomes?
Results:
As a first step towards answering such questions, recent work showed that it is possible to design alternative transcriptomic models showing the same behavior under environmental variations as the wild-type model. A second step would require providing evidence that it is possible to provide a nucleotide sequence for a genome encoding such a transcriptional model. We used computational design techniques to design a rewired global transcriptional regulation of Escherichia coli that still shows a transcriptomic response similar to that of the wild type. Afterwards, we "compiled" the transcriptional networks into nucleotide sequences to obtain the final genome sequence. Our computational evolution procedure ensures that we can maintain the genotype-phenotype mapping during the rewiring of the regulatory network. We found that it is theoretically possible to reorganize the E. coli genome into 86% fewer regulated operons. Such refactored genomes are constituted by operons that contain sets of genes sharing around 60% of their biological functions and, if evolved under highly variable environmental conditions, have regulatory networks that respond more than 20% faster to multiple external perturbations.
Conclusions:
This work provides the first algorithm for producing a genome sequence encoding a rewired transcriptional regulation with wild-type behavior under alternative environments
Network analyses of proteome evolution and diversity
The mapping of biomolecular interactions reveals that the function of most biological components depends on a web of interrelations with other cellular components, stressing the need for a systems-level view of biological functions. In this work, I explore ways in which the integration of network and genomic information from different organizational levels can lead to a better understanding of cellular systems and components. First, studying yeast, I show that the evolutionary properties of target genes constitute the dominant determinant of transcription factor (TF) evolutionary rate and that this evolutionary modularity is limited to activating regulatory relationships. I also show that targets of fast-evolving TFs show greater evolutionary expression changes and are enriched for niche-specific functions and other TFs. This work highlights the importance of trans-regulatory network evolution in species-specific gene expression and network adaptation.
Next, I show that genes either lost or gained across fungal evolution are enriched in TFs and have very different network and genomic properties than universally conserved genes, including, in sharp contrast to other networks, a greater number of transcriptional regulators. Placing genes in the context of their evolutionary life-cycle reveals principles of network integration of gained genes and evidence for the progressive network and functional marginalization of genes as an evolutionary process preceding gene loss.
In the final chapter, I study how alternative splicing (AS)-driven expansion of human proteome diversity leads to system-level complexity through the AS-mediated rewiring of the protein-protein interaction network. By overlaying different network and genomic datasets onto the first large-scale isoform-resolution interactome, I found that differentiating between splice variants is essential to capturing the full extent of the network's functional modularity. I also discovered that AS-mediated rewiring preferentially affects tissue-specific genes and that topologically different patterns of rewiring have distinct functional consequences. Furthermore, I found that most rewiring can be traced to the AS of evolutionarily conserved sequence modules, which promote or block interactions and tend to overlap linear motifs and disrupt known domain-domain interactions.
Together, this work demonstrates that a network-level perspective and genomic data integration are essential to understanding the evolution and functional diversity of proteomes
Identifying disease-associated genes based on artificial intelligence
Identifying disease-gene associations can help improve the understanding of disease mechanisms, which has a variety of applications, such as early diagnosis and drug development. Although experimental techniques, such as linkage analysis and genome-wide association studies (GWAS), have identified a large number of associations, identifying disease genes is still challenging since experimental methods are usually time-consuming and expensive. To solve these issues, computational methods are proposed to predict disease-gene associations.
Based on the characteristics of existing computational algorithms in the literature, we can roughly divide them into three categories: network-based methods, machine learning-based methods, and other methods. No matter what models are used to predict disease genes, the proper integration of multi-level biological data is the key to improving prediction accuracy. This thesis addresses some limitations of the existing computational algorithms, and integrates multi-level data via artificial intelligence techniques. The thesis starts with a comprehensive review of computational methods, databases, and evaluation methods used in predicting disease-gene associations, followed by one network-based method and four machine learning-based methods.
The first chapter introduces the background information, objectives of the studies and structure of the thesis. After that, a comprehensive review is provided in the second chapter to discuss the existing algorithms as well as the databases and evaluation methods used in existing studies. Having the objectives and future directions, the thesis then presents five computational methods for predicting disease-gene associations.
The first method, proposed in Chapter 3, considers the issue of non-disease gene selection. A shortest path-based strategy is used to select reliable non-disease genes from a disease gene network and a differential network. The selected genes are then used by a network-energy model to improve its performance. The second method, proposed in Chapter 4, constructs sample-based networks for case samples and uses them to predict disease genes. This strategy improves the quality of protein-protein interaction (PPI) networks, which further improves the prediction accuracy. Chapter 5 presents a generic model which applies multimodal deep belief nets (DBN) to fuse different types of data. Network embeddings extracted from PPI networks and gene ontology (GO) data are fused with the multimodal DBN to obtain cross-modality representations. Chapter 6 presents another deep learning model which uses a convolutional neural network (CNN) to integrate gene similarities with other types of data. Finally, the fifth method, proposed in Chapter 7, is a nonnegative matrix factorization (NMF)-based method. This method maps diseases and genes onto a lower-dimensional manifold, and the geodesic distances between diseases and genes are used to predict their associations. The method can predict disease genes even if the disease under consideration has no known associated genes.
In summary, this thesis has proposed several artificial intelligence-based computational algorithms to address the typical issues existing in computational algorithms. Experimental results have shown that the proposed methods can improve the accuracy of disease-gene prediction
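The shortest path-based selection of non-disease genes mentioned above can be pictured with a minimal sketch. It assumes that a gene counts as a reliable non-disease gene when its shortest-path distance to every known disease gene is at least some threshold; the abstract does not state the actual criterion, so the threshold, the toy network and the function names are all hypothetical.

```python
from collections import deque


def distance_to_nearest_disease_gene(adjacency, disease_genes):
    """Multi-source BFS: hop distance from each gene to its closest disease gene."""
    dist = {g: 0 for g in disease_genes if g in adjacency}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def select_non_disease_genes(adjacency, disease_genes, min_distance=3):
    """Keep genes at least `min_distance` hops from all disease genes.

    Genes unreachable from any disease gene are treated as infinitely far
    and are therefore kept (an assumption of this sketch).
    """
    dist = distance_to_nearest_disease_gene(adjacency, disease_genes)
    return sorted(
        g for g in adjacency if dist.get(g, float("inf")) >= min_distance
    )


# Hypothetical toy PPI network: a simple chain g1 - g2 - g3 - g4 - g5.
toy_ppi = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
}

negatives = select_non_disease_genes(toy_ppi, {"g1"}, min_distance=3)
print(negatives)
```

A larger threshold gives fewer but more conservative negative examples; the right trade-off would depend on the network and the downstream model.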