
    Development and analysis of an in vivo-compatible metabolic network of Mycobacterium tuberculosis

    BACKGROUND: During infection, Mycobacterium tuberculosis confronts a generally hostile and nutrient-poor in vivo host environment. Existing models and analyses of M. tuberculosis metabolic networks are able to reproduce experimentally measured cellular growth rates and identify genes required for growth in a range of different in vitro media. However, these models, under in vitro conditions, do not provide an adequate description of the metabolic processes required by the pathogen to infect and persist in a host. RESULTS: To better account for the metabolic activity of M. tuberculosis in the host environment, we developed a set of procedures to systematically modify an existing in vitro metabolic network by enhancing the agreement between calculated and in vivo-measured gene essentiality data. After our modifications, the new in vivo network contained 663 genes, 838 metabolites, and 1,049 reactions and had a significantly higher sensitivity (0.81) in predicted gene essentiality than the in vitro network (0.31). We verified the modifications generated from the purely computational analysis through a review of the literature and found, for example, that, as the analysis suggested, lipids are used as the main source for carbon metabolism and oxygen must be available for the pathogen under in vivo conditions. Moreover, we used the developed in vivo network to predict the effects of double-gene deletions on M. tuberculosis growth in the host environment, explore metabolic adaptations to life in an acidic environment, highlight the importance of different enzymes in the tricarboxylic acid cycle under different limiting nutrient conditions, investigate the effects of inhibiting multiple reactions, and look at the importance of both aerobic and anaerobic cellular respiration during infection. CONCLUSIONS: The network modifications we implemented suggest a distinctive set of metabolic conditions and requirements faced by M. tuberculosis during host infection compared with in vitro growth. Likewise, the double-gene deletion calculations highlight the importance of specific metabolic pathways used by the pathogen in the host environment. The newly constructed network provides a quantitative model to study the metabolism and associated drug targets of M. tuberculosis under in vivo conditions.
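
    The abstract reports sensitivity values (0.81 and 0.31) without defining the metric. The sketch below assumes the standard definition, true positives divided by all experimentally essential genes. The function name and the gene identifiers are hypothetical and are not taken from the paper.

```python
def sensitivity(predicted_essential, measured_essential):
    """Fraction of experimentally essential genes that the model also calls essential.

    Assumes the standard definition TP / (TP + FN); the abstract does not
    spell out the formula, so this is an illustrative assumption.
    """
    measured = set(measured_essential)
    if not measured:
        raise ValueError("no measured essential genes supplied")
    true_positives = len(measured & set(predicted_essential))
    return true_positives / len(measured)


# Hypothetical gene identifiers, for illustration only.
measured = {"geneA", "geneB", "geneC", "geneD"}
predicted = {"geneA", "geneB", "geneC", "geneX"}
print(sensitivity(predicted, measured))  # 0.75
```

    A false positive such as geneX does not lower sensitivity; it would affect specificity or precision instead.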

    SNIT: SNP identification for strain typing

    With ever-increasing numbers of microbial genomes being sequenced, efficient tools are needed to perform strain-level identification of any newly sequenced genome. Here, we present the SNP identification for strain typing (SNIT) pipeline, a fast and accurate software system that compares a newly sequenced bacterial genome with other genomes of the same species to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels). Based on this information, the pipeline analyzes the polymorphic loci present in all input genomes to identify the genome that has the fewest differences with the newly sequenced genome. Similarly, for each of the other genomes, SNIT identifies the input genome with the fewest differences. Results from five bacterial species show that the SNIT pipeline identifies the correct closest neighbor with 75% to 100% accuracy. The SNIT pipeline is available for download at http://www.bhsai.org/snit.htm
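
    The closest-neighbor step described above can be sketched as a count of allele differences over shared polymorphic loci. This is a minimal illustration, not the SNIT implementation. The data layout (a dict from locus position to allele), the function names, and the strain names are all assumptions.

```python
def count_differences(query, reference):
    """Number of loci at which the two genomes carry different alleles.

    Assumes both genomes have a call at every locus in `query`.
    """
    return sum(1 for locus, allele in query.items() if reference[locus] != allele)


def closest_neighbor(query, references):
    """Name of the reference genome with the fewest differences from the query."""
    return min(references, key=lambda name: count_differences(query, references[name]))


# Hypothetical loci and alleles, for illustration only.
query = {101: "A", 250: "G", 777: "T"}
references = {
    "strain1": {101: "A", 250: "C", 777: "T"},  # differs at one locus
    "strain2": {101: "G", 250: "C", 777: "T"},  # differs at two loci
}
print(closest_neighbor(query, references))  # strain1
```

    Ties are resolved in favor of the first reference in insertion order, which is an arbitrary choice made for this sketch.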

    Influence of Protein Abundance on High-Throughput Protein-Protein Interaction Detection

    Experimental protein-protein interaction (PPI) networks are increasingly being exploited in diverse ways for biological discovery. Accordingly, it is vital to discern their underlying natures by identifying and classifying the various types of deterministic (specific) and probabilistic (nonspecific) interactions detected. To this end, we have analyzed PPI networks determined using a range of high-throughput experimental techniques with the aim of systematically quantifying any biases that arise from the varying cellular abundances of the proteins. We confirm that PPI networks determined using affinity purification methods for yeast and Escherichia coli incorporate a correlation between protein degree, or number of interactions, and cellular abundance. The observed correlations are small but statistically significant and occur in both unprocessed (raw) and processed (high-confidence) data sets. In contrast, the yeast two-hybrid system yields networks that contain no such relationship. Although this difference was previously noted on the basis of mRNA abundance, our more extensive analysis based on protein abundance confirms a systematic difference between PPI networks determined from the two technologies. We additionally demonstrate that the centrality-lethality rule, which implies that higher-degree proteins are more likely to be essential, may be misleading, as protein abundance measurements identify essential proteins to be more prevalent than nonessential proteins. In fact, we generally find that when there is a degree/abundance correlation, the degree distributions of nonessential and essential proteins are also disparate. Conversely, when there is no degree/abundance correlation, the degree distributions of nonessential and essential proteins are not different. However, we show that essentiality manifests itself as a biological property in all of the yeast PPI networks investigated here via enrichments of interactions between essential proteins. These findings provide valuable insights into the underlying natures of the various high-throughput technologies utilized to detect PPIs and should lead to more effective strategies for the inference and analysis of high-quality PPI data sets.

    Probing the Extent of Randomness in Protein Interaction Networks

    Protein–protein interaction (PPI) networks are commonly explored for the identification of distinctive biological traits, such as pathways, modules, and functional motifs. In this respect, understanding the underlying network structure is vital to assess the significance of any discovered features. We recently demonstrated that PPI networks show degree-weighted behavior, whereby the probability of interaction between two proteins is generally proportional to the product of their numbers of interacting partners or degrees. It was surmised that degree-weighted behavior is a characteristic of randomness. We expand upon these findings by developing a random, degree-weighted, network model and show that eight PPI networks determined from single high-throughput (HT) experiments have global and local properties that are consistent with this model. The apparent random connectivity in HT PPI networks is counter-intuitive with respect to their observed degree distributions; however, we resolve this discrepancy by introducing a non-network-based model for the evolution of protein degrees or “binding affinities.” This mechanism is based on duplication and random mutation, for which the degree distribution converges to a steady state that is identical to one obtained by averaging over the eight HT PPI networks. The results imply that the degrees and connectivities incorporated in HT PPI networks are characteristic of unbiased interactions between proteins that have varying individual binding affinities. These findings corroborate the observation that curated and high-confidence PPI networks are distinct from HT PPI networks and not consistent with a random connectivity. These results provide an avenue to discern indiscriminate organizations in biological networks and suggest caution in the analysis of curated and high-confidence networks
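
    The degree-weighted behavior described above, where the interaction probability is proportional to the product of the two degrees, can be illustrated with a Chung-Lu-style random graph. This construction is an assumption chosen for illustration; the abstract does not specify how the authors built their model. The normalization by the sum of degrees and the function name are likewise hypothetical.

```python
import random


def degree_weighted_network(degrees, seed=None):
    """Sample a random network in which P(u, v) is proportional to k_u * k_v.

    `degrees` maps each protein to its target degree. The probability is
    normalized by the sum of all degrees and capped at 1 (an assumption of
    this sketch, following the usual Chung-Lu construction).
    """
    rng = random.Random(seed)
    total = sum(degrees.values())
    if total == 0:
        return []
    nodes = sorted(degrees)
    edges = []
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            p = min(1.0, degrees[u] * degrees[v] / total)
            if rng.random() < p:
                edges.append((u, v))
    return edges


# Hypothetical degrees, for illustration only.
example = {"P1": 3, "P2": 2, "P3": 1, "P4": 1, "P5": 1}
print(degree_weighted_network(example, seed=0))
```

    Under this construction the expected degree of each node is close to its target degree as long as no capped probabilities occur, so connectivity is determined by the degrees alone with no further bias between specific pairs.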

    DOVIS: an implementation for high-throughput virtual screening using AutoDock

    BACKGROUND: Molecular-docking-based virtual screening is an important tool in drug discovery that is used to significantly reduce the number of possible chemical compounds to be investigated. In addition to the selection of a sound docking strategy with appropriate scoring functions, another technical challenge is to screen millions of compounds in silico in a reasonable time. To meet this challenge, it is necessary to use high performance computing (HPC) platforms and techniques. However, the development of an integrated HPC system that makes efficient use of its elements is not trivial. RESULTS: We have developed an application termed DOVIS that uses AutoDock (version 3) as the docking engine and runs in parallel on a Linux cluster. DOVIS can efficiently dock large numbers (millions) of small molecules (ligands) to a receptor, screening 500 to 1,000 compounds per processor per day. Furthermore, in DOVIS, the docking session is fully integrated and automated in that the inputs are specified via a graphical user interface, the calculations are fully integrated with a Linux cluster queuing system for parallel processing, and the results can be visualized and queried. CONCLUSION: DOVIS removes most of the complexities and organizational problems associated with large-scale high-throughput virtual screening, and provides a convenient and efficient solution for AutoDock users to run this software on a Linux cluster platform.
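
    The quoted throughput of 500 to 1,000 compounds per processor per day allows a rough estimate of wall-clock time for a screening campaign. The sketch below assumes ideal linear scaling across processors, which the abstract does not claim. The library size and processor count are hypothetical.

```python
def screening_days(num_compounds, num_processors, compounds_per_processor_per_day):
    """Estimated days to dock a library, assuming perfect linear scaling."""
    return num_compounds / (num_processors * compounds_per_processor_per_day)


# Hypothetical campaign: 1,000,000 ligands on 256 processors.
for rate in (500, 1000):
    print(rate, round(screening_days(1_000_000, 256, rate), 1))
# 500 7.8
# 1000 3.9
```

    Real runs would take longer than this estimate wherever queuing delays, input/output, or load imbalance reduce parallel efficiency.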

    A Real-Time Algorithm for Predicting Core Temperature in Humans


    The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation

    BACKGROUND: Automated protein function prediction methods are needed to keep pace with high-throughput sequencing. With the existence of many programs and databases for inferring different protein functions, a pipeline that properly integrates these resources will benefit from the advantages of each method. However, integrated systems usually do not provide mechanisms to generate customized databases to predict particular protein functions. Here, we describe a tool termed PIPA (Pipeline for Protein Annotation) that has these capabilities. RESULTS: PIPA annotates protein functions by combining the results of multiple programs and databases, such as InterPro and the Conserved Domains Database, into common Gene Ontology (GO) terms. The major algorithms implemented in PIPA are: (1) a profile database generation algorithm, which generates customized profile databases to predict particular protein functions, (2) an automated ontology mapping generation algorithm, which maps various classification schemes into GO, and (3) a consensus algorithm to reconcile annotations from the integrated programs and databases. PIPA's profile generation algorithm is employed to construct the enzyme profile database CatFam, which predicts catalytic functions described by Enzyme Commission (EC) numbers. Validation tests show that CatFam yields average recall and precision larger than 95.0%. CatFam is integrated with PIPA. We use an association rule mining algorithm to automatically generate mappings between terms of two ontologies from annotated sample proteins. Incorporating the ontologies' hierarchical topology into the algorithm increases the number of generated mappings. In particular, it generates 40.0% additional mappings from the Clusters of Orthologous Groups (COG) to EC numbers and a six-fold increase in mappings from COG to GO terms. 
The mappings to EC numbers show a very high precision (99.8%) and recall (96.6%), while the mappings to GO terms show moderate precision (80.0%) and low recall (33.0%). Our consensus algorithm for GO annotation is based on the computation and propagation of likelihood scores associated with GO terms. The test results suggest that, for a given recall, the application of the consensus algorithm yields higher precision than when consensus is not used. CONCLUSION: The algorithms implemented in PIPA provide automated genome-wide protein function annotation based on reconciled predictions from multiple resources