26 research outputs found

    Prediction of evolutionarily conserved interologs in Mus musculus

    BACKGROUND: Identification of protein-protein interactions is an important first step toward understanding living systems. High-throughput experimental approaches have accumulated a large amount of information on protein-protein interactions in human and other model organisms. Such interaction information has been successfully transferred to other species in which experimental data are limited. However, the annotation transfer method can yield false-positive interologs when applied to phylogenetically distant organisms, because interactions are not always conserved. RESULTS: To address this issue, we used the phylogenetic profile method to filter false positives in interologs, based on the notion that evolutionarily conserved interactions show similar patterns of occurrence across genomes. The approach was applied to Mus musculus, for which experimentally identified interactions are limited. We first inferred protein-protein interactions in Mus musculus using two approaches: i) identifying mouse orthologs of interacting proteins (interologs) based on experimental protein-protein interaction data from other organisms; and ii) analyzing the frequency of mouse ortholog co-occurrence in predicted operons of bacteria. We then filtered possible false positives from the predicted interactions using the phylogenetic profiles. We found that this filtering significantly increased the frequency of interacting protein pairs coexpressed in the same cells/tissues in the Gene Expression Omnibus (GEO) database, as well as the frequency of interacting protein pairs sharing similar Gene Ontology (GO) terms for biological processes and cellular localizations. The data support the notion that phylogenetic profiles help to reduce the number of false positives in interologs. CONCLUSION: We have developed a protein-protein interaction database for mouse, which contains 41109 interologs. We have also developed a web interface to facilitate use of the database (http://lgsun.grc.nia.nih.gov/mppi/).
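    The filtering step described in this abstract can be illustrated with a minimal sketch. It assumes binary presence/absence profiles and a Pearson correlation cutoff; the abstract does not state the similarity measure or threshold actually used, so `min_corr`, the gene names, and the toy profiles are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-occurrence signal
    return cov / sqrt(var_x * var_y)

def filter_interologs(pairs, profiles, min_corr=0.5):
    """Keep candidate pairs whose profiles correlate at or above min_corr.

    min_corr is an illustrative cutoff, not a value taken from the abstract.
    """
    kept = []
    for a, b in pairs:
        if a in profiles and b in profiles:
            if pearson(profiles[a], profiles[b]) >= min_corr:
                kept.append((a, b))
    return kept

# Toy presence (1) / absence (0) profiles across six hypothetical genomes.
profiles = {
    "geneA": [1, 1, 0, 1, 0, 1],
    "geneB": [1, 1, 0, 1, 0, 1],
    "geneC": [0, 0, 1, 0, 1, 0],
}
candidates = [("geneA", "geneB"), ("geneA", "geneC")]
kept = filter_interologs(candidates, profiles)
```

    In this toy input, geneA and geneB occur in the same genomes and are retained, while geneA and geneC have complementary profiles and are filtered out.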

    iCR: a web tool to identify conserved targets of a regulatory protein across the multiple related prokaryotic species

    Gene regulatory circuits are often shared between closely related organisms. Our web tool iCR (identify Conserved target of a Regulon) makes use of this fact to identify conserved targets of a regulatory protein. iCR is a refined extension of our previous tool PredictRegulon, which predicts, genome-wide, the potential binding sites and target operons of a regulatory protein in a single user-selected genome. Like PredictRegulon, iCR accepts known binding sites of a regulatory protein as an ungapped multiple sequence alignment and reports the potential binding sites. The important differences are that the user can select more than one genome at a time and that the output reports the genes that are common to two or more species. To achieve this, iCR makes use of Cluster of Orthologous Group (COG) indices for the genes. The tool analyses the upstream regions of all user-selected prokaryotic genomes and bases its output on the conservation of target orthologs. iCR also reports the functional class codes, based on COG classification, for the proteins encoded by downstream genes, which helps the user understand the nature of the co-regulated genes on the result page itself. iCR is freely accessible at
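    The cross-genome step (reporting targets common to two or more species via COG indices) can be sketched as below. This is an illustration under stated assumptions, not iCR's implementation: the function name, the input layout, and the genome, gene, and COG labels are hypothetical placeholders.

```python
from collections import defaultdict

def conserved_targets(targets_by_genome, min_genomes=2):
    """Return COG ids predicted as targets in at least min_genomes genomes.

    targets_by_genome maps a genome name to {gene id: COG id} for the genes
    predicted to lie downstream of a binding site in that genome.
    """
    genomes_by_cog = defaultdict(set)
    for genome, gene_to_cog in targets_by_genome.items():
        for cog in gene_to_cog.values():
            genomes_by_cog[cog].add(genome)
    return {
        cog: sorted(genomes)
        for cog, genomes in genomes_by_cog.items()
        if len(genomes) >= min_genomes
    }

# Placeholder predictions for three hypothetical genomes.
predicted = {
    "genome_1": {"g1_0010": "COG_A", "g1_0200": "COG_B"},
    "genome_2": {"g2_0431": "COG_A", "g2_0900": "COG_C"},
    "genome_3": {"g3_0007": "COG_A", "g3_0150": "COG_B"},
}
shared = conserved_targets(predicted)
```

    Grouping by COG id rather than by gene name lets orthologs with different locus tags in different genomes be counted as the same target.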

    Prediction of DtxR regulon: Identification of binding sites and operons controlled by Diphtheria toxin repressor in Corynebacterium diphtheriae

    BACKGROUND: The diphtheria toxin repressor, DtxR, of Corynebacterium diphtheriae has been shown to be an iron-activated transcription regulator that controls the expression not only of diphtheria toxin but also of iron uptake genes. This study aims to identify putative binding sites and operons controlled by DtxR, in order to understand the role of DtxR in the pathophysiology of Corynebacterium diphtheriae. RESULT: The positional Shannon relative entropy method was used to build the DtxR-binding site recognition profile, and the latter was used to identify putative regulatory sites of DtxR within the C. diphtheriae genome. In addition, DtxR-regulated operons were identified by taking into account the predicted DtxR regulatory sites and the genome annotation. A few of the predicted motifs were experimentally validated by electrophoretic mobility shift assay. The analysis identifies motifs upstream of novel iron-regulated genes that code for formamidopyrimidine-DNA glycosylase (FpG), an enzyme involved in DNA repair, and for the starvation-inducible DNA-binding protein (Dps), which is involved in iron storage and oxidative stress defense. In addition, we have found DtxR motifs upstream of the genes that code for sortase, which catalyzes anchoring of host-interacting proteins to the cell wall of pathogenic bacteria, and for proteins of the secretory system, which could be involved in translocation of various iron-regulated virulence factors including diphtheria toxin. CONCLUSIONS: We have used an in silico approach to identify the putative binding sites and genes controlled by DtxR in Corynebacterium diphtheriae. Our analysis shows that DtxR could provide a molecular link between the Fe²⁺-induced Fenton's reaction and protection of DNA from oxidative damage. DtxR-regulated Dps prevents the lethal combination of Fe²⁺ and H₂O₂ and also protects DNA by nonspecific DNA binding. In addition, DtxR could play an important role in host interaction and virulence by regulating the levels of sortase, a potential vaccine candidate, and of proteins of the secretory system.
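    A generic sketch of a position-wise relative entropy profile is given below for readers unfamiliar with the idea. It is not the authors' exact procedure: the pseudocount, the uniform background, the log-odds scoring function, and the five-base toy sites (which are not real DtxR binding sites) are all assumptions made for illustration.

```python
from math import log2

BASES = "ACGT"

def position_frequencies(sites, pseudocount=1.0):
    """Per-position base frequencies from an ungapped alignment of sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be an ungapped alignment of equal length")
    freqs = []
    for i in range(length):
        counts = {b: pseudocount for b in BASES}  # pseudocount avoids log(0)
        for s in sites:
            counts[s[i]] += 1
        total = sum(counts.values())
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs

def relative_entropy(freqs, background):
    """Relative entropy (in bits) of each position against the background."""
    return [sum(p[b] * log2(p[b] / background[b]) for b in BASES) for p in freqs]

def score(sequence, freqs, background):
    """Log-odds score of a candidate site of the same length as the profile."""
    return sum(log2(freqs[i][b] / background[b]) for i, b in enumerate(sequence))

# Hypothetical aligned sites and a uniform background composition.
sites = ["TTAGG", "TTAGC", "TTTGG", "TAAGG"]
background = {b: 0.25 for b in BASES}

freqs = position_frequencies(sites)
info = relative_entropy(freqs, background)
best = score("TTAGG", freqs, background)
worst = score("CCCCC", freqs, background)
```

    Fully conserved columns (such as the fourth position in the toy alignment) contribute the most information, so candidate sites that match them receive higher scores.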

    DOMINE: a comprehensive collection of known and predicted domain-domain interactions

    DOMINE is a comprehensive collection of known and predicted domain–domain interactions (DDIs) compiled from 15 different sources. The updated DOMINE includes 2285 new DDIs inferred from experimentally characterized high-resolution three-dimensional structures, and about 3500 novel predictions by five computational approaches published over the last 3 years. These additions bring the total number of unique DDIs in the updated version to 26 219 among 5140 unique Pfam domains, a 23% increase compared to 20 513 unique DDIs among 4346 unique domains in the previous version. The updated version now contains 6634 known DDIs and features a new classification scheme to assign confidence levels to predicted DDIs. DOMINE will serve as a valuable resource to those studying protein and domain interactions. Most importantly, DOMINE will serve as a reference not only for bench scientists testing for new interactions but also for bioinformaticians seeking to predict novel protein–protein interactions based on the DDIs. The contents of DOMINE are available at http://domine.utdallas.edu

    Acute depletion of Tet1-dependent 5-hydroxymethylcytosine levels impairs LIF/Stat3 signaling and results in loss of embryonic stem cell identity

    The TET family of Fe(II) and 2-oxoglutarate-dependent enzymes (Tet1/2/3) promotes DNA demethylation by converting 5-methylcytosine to 5-hydroxymethylcytosine (5hmC), which they further oxidize into 5-formylcytosine and 5-carboxylcytosine. Tet1 is robustly expressed in mouse embryonic stem cells (mESCs) and has been implicated in mESC maintenance. Here we demonstrate that, unlike genetic deletion, RNAi-mediated depletion of Tet1 in mESCs led to a significant reduction in 5hmC and loss of mESC identity. The differentiation phenotype due to Tet1 depletion correlated positively with the extent of 5hmC loss. Meta-analyses of genomic data sets suggested an interaction between Tet1 and leukemia inhibitory factor (LIF) signaling. LIF signaling is known to promote self-renewal and pluripotency in mESCs, partly by opposing MAPK/ERK-mediated differentiation, and withdrawal of LIF leads to differentiation of mESCs. We discovered that Tet1 depletion impaired LIF-dependent, Stat3-mediated gene activation by affecting Stat3's ability to bind to its target sites on chromatin. Nanog overexpression or inhibition of MAPK/ERK signaling, both known to maintain mESCs in the absence of LIF, rescued Tet1 depletion, further supporting the dependence of LIF/Stat3 signaling on Tet1. These data support the conclusion that analysis of mESCs in the hours/days immediately following efficient Tet1 depletion reveals Tet1's normal physiological role in maintaining the pluripotent state, a role that may be subject to homeostatic compensation in genetic models.

    Inferring genome-wide functional linkages in E. coli by combining improved genome context methods: Comparison with high-throughput experimental data

    Cellular functions are determined by interactions among proteins in the cell. Recognition of these interactions is an important step in understanding biology at the systems level. Here, we report an interaction network of Escherichia coli, obtained by training a Support Vector Machine on the high-quality interactions in the EcoCyc database, with the assumption that periplasmic and cytoplasmic proteins may not interact with each other. The data features included the correlation coefficient between bit-score phylogenetic profiles, the frequency of co-occurrence in predicted operons, and a new measure: the distance between the translational start sites of the genes. The combined genome context methods show a high accuracy of prediction on the test data and predict a total of 78,122 binary interactions. The majority of the interactions identified by high-throughput experimental methods correspond to indirect interactions (interactions through neighbors) in the predicted network. Correlation of the predicted network with gene essentiality data shows that the essential genes in E. coli exhibit a high linking number, whereas the nonessential genes exhibit a low linking number. Furthermore, our predicted protein–protein interaction network shows that the proteins involved in replication, DNA repair, transcription, translation, and cell wall synthesis are highly connected. We therefore believe that our predicted network will serve as a useful resource in understanding prokaryotic biology.

    Comparing transcription rate and mRNA abundance as parameters for biochemical pathway and network analysis

    Cells adapt to extracellular and intracellular signals by dynamically orchestrating the activities of pathways in their biochemical networks. Dynamic control of the gene expression process represents a major mechanism for regulating pathway activity. Gene expression has thus been routinely measured, most frequently at the level of steady-state mRNA abundance using microarray technology. The results are widely used in statistical inference of the structures of the underlying biochemical networks, with the assumption that functionally related genes exhibit similar dynamic profiles. Steady-state mRNA abundance, however, is a composite of two factors: transcription rate and mRNA degradation rate. The question asked here is therefore whether steady-state mRNA abundance or either of the two factors is a more informative measurement target for studying network dynamics. The yeast S. cerevisiae was used as the model organism, and transcription rate was chosen out of the two factors in this study because genome-wide determination of transcription rates has been reported for several physiological processes in this species. Our strategy is to test which one is a better measurement of functional relatedness between genes. The analysis was performed on those S. cerevisiae genes that have bacterial orthologs as identified by reciprocal BLAST analysis, so that the functional relatedness of a gene pair can be measured by the frequency at which their bacterial orthologs co-occur in the same operon in the collection of bacterial genomes. It is found that transcription rate data is generally a better parameter for functional relatedness than steady-state mRNA abundance, suggesting transcriptio
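    The co-operon measure mentioned in this abstract can be sketched as follows. The abstract does not say whether the frequency is a raw count or a normalized fraction, so this sketch returns both the number of genomes in which the two orthologs share an operon and the number of genomes that contain both; the function name, input layout, and toy operon data are hypothetical.

```python
def co_operon_counts(gene_a, gene_b, operons_by_genome):
    """Count genomes where orthologs of gene_a and gene_b share an operon.

    operons_by_genome maps a genome name to a list of operons, each operon
    being a set of ortholog labels. Returns (same_operon, both_present).
    """
    both_present = 0
    same_operon = 0
    for operons in operons_by_genome.values():
        genes = set().union(*operons)  # every ortholog label in this genome
        if gene_a in genes and gene_b in genes:
            both_present += 1
            if any(gene_a in op and gene_b in op for op in operons):
                same_operon += 1
    return same_operon, both_present

# Toy operon predictions for three hypothetical bacterial genomes.
operons_by_genome = {
    "genome_1": [{"x", "y"}, {"z"}],   # x and y share an operon
    "genome_2": [{"x"}, {"y", "z"}],   # both present, different operons
    "genome_3": [{"x", "z"}],          # y has no ortholog here
}
same, both = co_operon_counts("x", "y", operons_by_genome)
fraction = same / both if both else 0.0
```

    Dividing by the number of genomes that contain both orthologs, rather than by all genomes, keeps pairs with a narrow phylogenetic distribution from being penalized; whether the original study normalized this way is not stated.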

    Transcription rate data correlates better with co-operon frequency.

    The relationship between transcription rate correlation (TR), as well as mRNA abundance level correlation (RA), and co-operon frequency is shown for S. cerevisiae gene pairs whose bacterial orthologs co-occur in the same operons. For each column, the top plot shows the fraction of points above 0.6 as co-operon frequency increases; the bottom plot shows the ratio between the number of points above 0.6 and the number below −0.6 with increasing co-operon frequency. The bottom plot ends at the co-operon frequency where the number of points below −0.6 becomes 0 in the TR data, because the co-operon frequency at which this happens is always higher in the RA data. Both plots are given relative to the first time point (fold change). Column A) gives plots for galactose-glucose shift data; column B) gives plots for oxidative stress data; column C) gives plots for osmotic stress data.

    A time-delay effect from transcription rate data to mRNA abundance data.

    A) The correlation matrix between time points for mRNA abundance (RA) and transcription rate (TR) analysis under an osmotic stress condition is shown; B) only the correlations between TR and RA are shown.