
    STRING and STITCH: known and predicted interactions between proteins and chemicals

    Information on protein-protein and protein-chemical interactions is essential for understanding cellular functions. The STRING and STITCH web resources integrate interaction evidence derived from pathways, automatic literature mining, primary experimental data, and genomic context. The resulting interaction networks cover 1.5 million proteins from 373 organisms and 68,000 chemicals

    AssociationViewer: a scalable and integrated software tool for visualization of large-scale variation data in genomic context

    Summary: We present a tool designed for visualization of large-scale genetic and genomic data exemplified by results from genome-wide association studies. This software provides an integrated framework to facilitate the interpretation of SNP association studies in genomic context. Gene annotations can be retrieved from Ensembl, linkage disequilibrium data downloaded from HapMap and custom data imported in BED or WIG format. AssociationViewer integrates functionalities that enable the aggregation or intersection of data tracks. It implements an efficient cache system and allows the display of several, very large-scale genomic datasets

    _M. tuberculosis_ interactome analysis unravels potential pathways to drug resistance

    Drug resistance is a major problem in combating tuberculosis. Lack of understanding of how resistance emerges in bacteria upon drug treatment limits our ability to counter resistance. By analysis of the _Mycobacterium tuberculosis_ interactome network, along with drug-induced expression data from the literature, we show possible pathways for the emergence of drug resistance. We identified sets of high-propensity paths from different drug targets to a curated set of resistance-related proteins. Many top paths were upregulated upon exposure to anti-tubercular drugs. Different targets appear to have different propensities for the four resistance mechanisms. Knowledge of important proteins in such pathways enables identification of appropriate _'co-targets'_, which, when inhibited simultaneously with the intended target, are likely to help in combating drug resistance. RecA, Rv0823c, Rv0892 and DnaE1 were the best examples of co-targets for combating tuberculosis. This approach is also inherently generic and is likely to significantly impact drug discovery.

    Evolution signatures in genome network properties

    Genomes may be organized as networks where protein-protein association plays the role of network links. The resulting networks are far from random, and their topological properties are a consequence of the underlying mechanisms of genome evolution. Considering data on protein-protein association networks from the STRING database, we present experimental evidence that the degree distribution is not scale free, showing an increased probability for high-degree nodes. We also show that the degree distribution approaches a scale-invariant state as the number of genes in the network increases, although real genomes still present finite-size effects. Based on the experimental evidence unveiled by these data analyses, we propose a simulation model for genome evolution, where genes in a network are either acquired de novo using a preferential attachment rule, or duplicated, with a duplication probability that grows linearly with gene degree and decreases with its clustering coefficient. The results show that topological distributions are better described than in previous genome evolution models. This model correctly predicts that, to produce protein-protein association networks with numbers of links and nodes in the observed range, 90% gene duplication and 10% de novo gene acquisition are necessary. If this scenario is true, it implies a universal mechanism for genome evolution.
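The growth rule described in this abstract can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions, not the authors' model: the abstract does not give the functional form of the duplication probability, so the weight `degree * (1 - clustering)` is an assumed stand-in for "grows linearly with degree and decreases with clustering coefficient". The function names, the triangle seed network, the full copying of the parent's links on duplication, and the parameter `m` (links per de novo gene) are likewise hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def clustering(adj, v):
    """Local clustering coefficient of node v in an undirected graph."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def grow_network(n_genes, p_dup=0.9, m=2, seed=0):
    """Grow a gene network by duplication (prob. p_dup) or de novo acquisition."""
    rng = random.Random(seed)
    # Assumed seed network: a triangle of three genes.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    while len(adj) < n_genes:
        nodes = list(adj)
        new = len(adj)
        if rng.random() < p_dup:
            # Assumed duplication weight: linear in degree, reduced by clustering.
            # The small constant keeps the total weight positive.
            weights = [len(adj[v]) * (1.0 - clustering(adj, v)) + 1e-9
                       for v in nodes]
            parent = rng.choices(nodes, weights=weights, k=1)[0]
            # Assumption: the duplicate inherits all of the parent's links.
            targets = set(adj[parent])
        else:
            # De novo gene: preferential attachment to m distinct genes,
            # each chosen with probability proportional to its degree.
            weights = [len(adj[v]) for v in nodes]
            targets = set()
            while len(targets) < m:
                targets.add(rng.choices(nodes, weights=weights, k=1)[0])
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj
```

Calling `grow_network(500)` returns an adjacency dictionary from which a degree distribution can be tallied; varying `p_dup` shows how the duplication fraction changes the number of links for a fixed number of genes.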

    BacillOndex: An Integrated Data Resource for Systems and Synthetic Biology

    BacillOndex is an extension of the Ondex data integration system, providing a semantically annotated, integrated knowledge base for the model Gram-positive bacterium Bacillus subtilis. This application allows a user to mine a variety of B. subtilis data sources, and analyse the resulting integrated dataset, which contains data about genes, gene products and their interactions. The data can be analysed either manually, by browsing using Ondex, or computationally via a Web services interface. We describe the process of creating a BacillOndex instance, and describe the use of the system for the analysis of single nucleotide polymorphisms in B. subtilis Marburg. The Marburg strain is the progenitor of the widely-used laboratory strain B. subtilis 168. We identified 27 SNPs with predictable phenotypic effects, including genetic traits for known phenotypes. We conclude that BacillOndex is a valuable tool for the systems-level investigation of, and hypothesis generation about, this important biotechnology workhorse. Such understanding contributes to our ability to construct synthetic genetic circuits in this organism

    Knime4Bio: a set of custom nodes for the interpretation of next-generation sequencing data with KNIME

    Summary: Analysing large amounts of data generated by next-generation sequencing (NGS) technologies is difficult for researchers or clinicians without computational skills. They are often compelled to delegate this task to computer biologists working with command line utilities. The availability of easy-to-use tools will become essential with the generalization of NGS in research and diagnosis. It will enable investigators to handle much more of the analysis. Here, we describe Knime4Bio, a set of custom nodes for the KNIME (The Konstanz Information Miner) interactive graphical workbench, for the interpretation of large biological datasets. We demonstrate that this tool can be utilized to quickly retrieve previously published scientific findings

    DIMA 2.0—predicted and known domain interactions

    DIMA, the domain interaction map, has evolved from a simple web server for domain phylogenetic profiling into an integrative prediction resource combining both experimental data on domain–domain interactions and predictions from two different algorithms. With this update, DIMA obtains greatly improved coverage at the level of genomes and domains as well as with respect to available prediction approaches. The domain phylogenetic profiling method now uses SIMAP as its backend for exhaustive domain hit coverage: 7038 Pfam domains were profiled over 460 completely sequenced genomes. Domain pair exclusion predictions were produced from 83 969 distinct protein–protein interactions obtained from IntAct, resulting in 21 513 domain pairs with significant domain pair exclusion algorithm scores. Additional predictions applying the same algorithm to predicted protein interactions from STRING yielded 2378 high-confidence pairs. Experimental data come from iPfam (3074 pairs) and 3did (3034 pairs), two databases identifying domain contacts in solved protein structures. Taken together, these two resources yielded 3653 distinct interacting domain pairs. DIMA is available at http://mips.gsf.de/genre/proj/dima

    STITCH: interaction networks of chemicals and proteins

    The knowledge about interactions between proteins and small molecules is essential for the understanding of molecular and cellular functions. However, information on such interactions is widely dispersed across numerous databases and the literature. To facilitate access to this data, STITCH (‘search tool for interactions of chemicals’) integrates information about interactions from metabolic pathways, crystal structures, binding experiments and drug–target relationships. Inferred information from phenotypic effects, text mining and chemical structure similarity is used to predict relations between chemicals. STITCH further allows exploring the network of chemical relations, also in the context of associated binding proteins. Each proposed interaction can be traced back to the original data sources. Our database contains interaction information for over 68 000 different chemicals, including 2200 drugs, and connects them to 1.5 million genes across 373 genomes and their interactions contained in the STRING database. STITCH is available at http://stitch.embl.de