Classifying pairs with trees for supervised biological network inference
Networks are ubiquitous in biology, and computational approaches for their
inference have been widely investigated. In particular, supervised machine
learning methods can be used to complete a partially known network by
integrating various measurements. Two main supervised frameworks have been
proposed: the local approach, which trains a separate model for each network
node, and the global approach, which trains a single model over pairs of nodes.
Here, we systematically investigate, theoretically and empirically, the
exploitation of tree-based ensemble methods in the context of these two
approaches for biological network inference. We first formalize the problem of
network inference as classification of pairs, unifying in the process
homogeneous and bipartite graphs and discussing two main sampling schemes. We
then present the global and the local approaches, extending the latter to the
prediction of interactions between two unseen network nodes, and discuss their
specializations to tree-based ensemble methods, highlighting their
interpretability and drawing links with clustering techniques. Extensive
computational experiments carried out on various biological networks show
that these methods are competitive
with existing methods. Comment: 22 pages
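The difference between the two frameworks named in this abstract can be made concrete with a minimal sketch of how the training data would be assembled in each case. The toy node features, the edge list, and the helper names (`has_edge`, `global_training_set`, `local_training_set`) are hypothetical and not taken from the paper. The sketch assumes a homogeneous, undirected network and, as a simplification, lists each pair in one order only. Any classifier, for example a tree-based ensemble, could then be fitted on the resulting matrices.

```python
from itertools import combinations

# Hypothetical toy data: one feature vector per node and a partially known network.
features = {
    "A": [0.1, 1.0],
    "B": [0.4, 0.2],
    "C": [0.9, 0.7],
    "D": [0.3, 0.5],
}
known_edges = {("A", "B"), ("B", "C")}


def has_edge(u, v):
    """Return 1 if the undirected pair (u, v) is a known interaction, else 0."""
    return int((u, v) in known_edges or (v, u) in known_edges)


def global_training_set(nodes):
    """Global approach: a single data set with one sample per pair of nodes.

    Each sample concatenates the two node feature vectors; the label says
    whether the pair interacts.
    """
    X, y = [], []
    for u, v in combinations(nodes, 2):
        X.append(features[u] + features[v])
        y.append(has_edge(u, v))
    return X, y


def local_training_set(target, nodes):
    """Local approach: a separate data set for one target node.

    Every other node is a sample, labelled by whether it interacts with the target.
    """
    X, y = [], []
    for v in nodes:
        if v != target:
            X.append(features[v])
            y.append(has_edge(target, v))
    return X, y


nodes = sorted(features)
X_global, y_global = global_training_set(nodes)       # one model over all pairs
X_local_B, y_local_B = local_training_set("B", nodes)  # one model for node B only
```

In the global setting a single model sees all six pairs; in the local setting one such data set, and one model, would be built per node.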
Network-based approaches for linking metabolism with environment
Genome-wide metabolic maps allow the development of network-based computational approaches for linking an organism with its biochemical habitat.
Exploiting gene deletion fitness effects in yeast to understand the modular architecture of protein complexes under different growth conditions
Rights: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
Background: Understanding how individual genes contribute towards the fitness of an organism is a fundamental problem in biology. Although recent genome-wide screens have generated abundant data on quantitative fitness for single gene knockouts, very few studies have systematically integrated other types of biological information to understand how and why deletion of specific genes gives rise to a particular fitness effect. In this study, we combine quantitative fitness data for single gene knockouts in yeast with large-scale interaction discovery experiments to understand the effect of gene deletion on the modular architecture of protein complexes, under different growth conditions.
Results: Our analysis reveals that genes in complexes show more severe fitness effects upon deletion than other genes but, in contrast to what has been observed in binary protein-protein interaction networks, we find that this is not related to the number of complexes in which they are present. We also find that, in general, the core and attachment components of protein complexes are equally important for the complex machinery to function. However, when quantifying the importance of core and attachments in single complex variations, or isoforms, we observe that this global trend originates from either the core or the attachment components being more important for strain fitness, both being equally important or both being dispensable. Finally, our study reveals that different isoforms of a complex can exhibit distinct fitness patterns across growth conditions.
Conclusion: This study presents a powerful approach to unveil the molecular basis for various complex phenotypic profiles observed in gene deletion experiments. It also highlights some interesting cases of potential functional compensation between protein paralogues and suggests a new piece to fit into the histone-code puzzle.
Published version
Spial: analysis of subtype-specific features in multiple sequence alignments of proteins
Motivation: Spial (Specificity in alignments) is a tool for the comparative analysis of two alignments of evolutionarily related sequences that differ in their function, such as two receptor subtypes. It highlights functionally important residues that are either specific to one of the two alignments or conserved across both alignments. It permits visualization of this information in three complementary ways: by colour-coding alignment positions, by sequence logos and optionally by colour-coding the residues of a protein structure provided by the user. This can aid in the detection of residues that are involved in the subtype-specific interaction with a ligand, other proteins or nucleic acids. Spial may also be used to detect residues that may be post-translationally modified in one of the two sets of sequences. Availability: http://www.mrc-lmb.cam.ac.uk/genomes/spial/; supplementary information is available at http://www.mrc-lmb.cam.ac.uk/genomes/spial/help.html
p53 shapes genome-wide and cell type-specific changes in microRNA expression during the human DNA damage response.
The human DNA damage response (DDR) triggers profound changes in gene expression, whose nature and regulation remain uncertain. Although certain micro-(mi)RNA species including miR34, miR-18, miR-16 and miR-143 have been implicated in the DDR, there is as yet no comprehensive description of genome-wide changes in the expression of miRNAs triggered by DNA breakage in human cells. We have used next-generation sequencing (NGS), combined with rigorous integrative computational analyses, to describe genome-wide changes in the expression of miRNAs during the human DDR. The changes affect 150 of 1523 miRNAs known in miRBase v18 from 4-24 h after the induction of DNA breakage, in cell-type dependent patterns. The regulatory regions of the most-highly regulated miRNA species are enriched in conserved binding sites for p53. Indeed, genome-wide changes in miRNA expression during the DDR are markedly altered in TP53-/- cells compared to otherwise isogenic controls. The expression levels of certain damage-induced, p53-regulated miRNAs in cancer samples correlate with patient survival. Our work reveals genome-wide and cell type-specific alterations in miRNA expression during the human DDR, which are regulated by the tumor suppressor protein p53. These findings provide a genomic resource to identify new molecules and mechanisms involved in the DDR, and to examine their role in tumor suppression and the clinical outcome of cancer patients.
Hierarchy and Feedback in the Evolution of the E. coli Transcription Network
The E. coli transcription network has an essentially feedforward structure,
with, however, abundant feedback at the level of self-regulations. Here, we
investigate how these properties emerged during evolution. An assessment of the
role of gene duplication based on protein domain architecture shows that (i)
transcriptional autoregulators have mostly arisen through duplication, while
(ii) the expected feedback loops stemming from their initial cross-regulation
are strongly selected against. This requires a divergent coevolution of the
transcription factor DNA-binding sites and their respective DNA cis-regulatory
regions. Moreover, we find that the network tends to grow by expansion of the
existing hierarchical layers of computation, rather than by addition of new
layers. We also argue that rewiring of regulatory links due to
mutation/selection of novel transcription factor/DNA binding interactions
appears not to significantly affect the network global hierarchy, and that
horizontally transferred genes are mainly added at the bottom, as new target
nodes. These findings highlight the important evolutionary roles of both
duplication and selective deletion of crosstalks between autoregulators in the
emergence of the hierarchical transcription network of E. coli. Comment: to appear in PNAS
Molecular Principles of Gene Fusion Mediated Rewiring of Protein Interaction Networks in Cancer
Gene fusions are common cancer-causing mutations, but the molecular principles by which fusion protein products affect interaction networks and cause disease are not well understood. Here, we perform an integrative analysis of the structural, interactomic, and regulatory properties of thousands of putative fusion proteins. We demonstrate that genes that form fusions (i.e., parent genes) tend to be highly connected hub genes, whose protein products are enriched in structured and disordered interaction-mediating features. Fusion often results in the loss of these parental features and the depletion of regulatory sites such as post-translational modifications. Fusion products disproportionately connect proteins that did not previously interact in the protein interaction network. In this manner, fusion products can escape cellular regulation and constitutively rewire protein interaction networks. We suggest that the deregulation of central, interaction-prone proteins may represent a widespread mechanism by which fusion proteins alter the topology of cellular signaling pathways and promote cancer.
Combinatorial multivalent interactions drive cooperative assembly of the COPII coat
Protein secretion is initiated at the endoplasmic reticulum by the COPII coat, which self-assembles to form vesicles. Here, we examine the mechanisms by which a cargo-bound inner coat layer recruits and is organized by an outer scaffolding layer to drive local assembly of a stable structure rigid enough to enforce membrane curvature. An intrinsically disordered region in the outer coat protein, Sec31, drives binding with an inner coat layer via multiple distinct interfaces, including a newly defined charge-based interaction. These interfaces combinatorially reinforce each other, suggesting coat oligomerization is driven by the cumulative effects of multivalent interactions. The Sec31 disordered region could be replaced by evolutionarily distant sequences, suggesting plasticity in the binding interfaces. Such a multimodal assembly platform provides an explanation for how cells build a powerful yet transient scaffold to direct vesicle traffic.
- …