14 research outputs found

    Triad pattern algorithm for predicting strong promoter candidates in bacterial genomes

    Abstract. Background: Bacterial promoters, which increase the efficiency of gene expression, differ from other promoters by several characteristics. This difference, not yet widely exploited in bioinformatics, looks promising for the development of relevant computational tools to search for strong promoters in bacterial genomes. Results: We describe a new triad pattern algorithm that predicts strong promoter candidates in annotated bacterial genomes by matching specific patterns for the group I σ70 factors of Escherichia coli RNA polymerase. It detects promoter-specific motifs by consecutively matching three patterns, consisting of an UP-element, required for interaction with the α subunit, and then optimally-separated patterns of -35 and -10 boxes, required for interaction with the σ70 subunit of RNA polymerase. Analysis of 43 bacterial genomes revealed that the frequency of candidate sequences depends on the A+T content of the DNA under examination. The accuracy of in silico prediction was experimentally validated for the genome of a hyperthermophilic bacterium, Thermotoga maritima, by applying a cell-free expression assay using the predicted strong promoters. In this organism, the strong promoters govern genes for translation, energy metabolism, transport, cell movement, and other as-yet unidentified functions. Conclusion: The triad pattern algorithm developed for predicting strong bacterial promoters is well suited for analyzing bacterial genomes with an A+T content of less than 62%. This computational tool opens new prospects for investigating global gene expression, and individual strong promoters in bacteria of medical and/or economic significance.
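
The consecutive three-pattern match described in this abstract can be illustrated with a minimal Python sketch. It is not the published algorithm: the abstract does not give the actual patterns, so the textbook σ70 consensus hexamers (TTGACA, TATAAT), the 16–18 bp spacer, the 20 bp A+T-rich window standing in for the UP-element, and the function name `find_triads` are all assumptions made for illustration.

```python
# Hypothetical, simplified stand-ins for the three patterns (assumptions,
# not the patterns used in the paper).
MINUS_35 = "TTGACA"   # textbook sigma70 -35 consensus
MINUS_10 = "TATAAT"   # textbook sigma70 -10 consensus
SPACER = (16, 18)     # assumed allowed spacer lengths between the boxes (bp)
UP_WINDOW = 20        # assumed length of the window upstream of the -35 box
UP_MIN_AT = 0.75      # assumed minimal A+T fraction for an UP-like element


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def at_fraction(seq):
    """Fraction of A or T bases in seq (0.0 for an empty string)."""
    return sum(c in "AT" for c in seq) / len(seq) if seq else 0.0


def find_triads(seq, max_mismatch=1):
    """Return (start_of_-35, spacer_length, start_of_-10) for each candidate."""
    seq = seq.upper()
    hits = []
    for i in range(UP_WINDOW, len(seq) - len(MINUS_35) + 1):
        # 1) UP-like element: A+T-rich window directly upstream of position i.
        if at_fraction(seq[i - UP_WINDOW:i]) < UP_MIN_AT:
            continue
        # 2) -35 box starting at position i.
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        # 3) -10 box at one of the allowed spacer distances.
        for spacer in range(SPACER[0], SPACER[1] + 1):
            j = i + 6 + spacer
            box10 = seq[j:j + 6]
            if len(box10) == 6 and mismatches(box10, MINUS_10) <= max_mismatch:
                hits.append((i, spacer, j))
    return hits
```

On a synthetic sequence built as 20 A residues, TTGACA, 17 G residues, TATAAT and CCCC, `find_triads` returns `[(20, 17, 43)]`; replacing the A-rich upstream stretch with G residues yields no candidates, which reflects the dependence on an A+T-rich UP-like region assumed here.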

    Phylogeny and structural modeling of the transcription factor CsqR (YihW) from Escherichia coli

    Abstract. CsqR (YihW) is a local transcription factor that controls expression of yih genes involved in degradation of sulfoquinovose in Escherichia coli. We recently showed that expression of the respective gene cassette might be regulated by lactose. Here, we explore the phylogenetic and functional traits of CsqR. Phylogenetic analysis revealed that CsqR had a conserved Met25. Western blot demonstrated that CsqR was synthesized in the bacterial cell as two protein forms, 28.5 (CsqR-l) and 26 kDa (CsqR-s), the latter corresponding to start of translation at Met25. CsqR-s was dramatically activated during growth with sulfoquinovose as a sole carbon source, and displaced CsqR-l in the stationary phase during growth on rich medium. Molecular dynamics simulations revealed two possible states of the CsqR-s structure, with the interdomain linker being represented by either a disordered loop or an α-helix. This helix allowed the hinge-like motion of the N-terminal domain resulting in a switch of CsqR-s between two conformational states, “open” and “compact”. We then modeled the interaction of both CsqR forms with putative effectors sulfoquinovose, sulforhamnose, sulfoquinovosyl glycerol, and lactose, and revealed that they all preferred the same pocket in CsqR-l, while in CsqR-s there were two possible options dependent on the linker structure.

    Identification of Rgg Binding Sites in the Streptococcus pyogenes Chromosome

    Streptococcus pyogenes Rgg is a regulatory protein that controls the transcription of 588 genes in strain NZ131 during the post-exponential phase of growth, including the virulence-associated genes encoding the extracellular SpeB protease, pullulanase A (PulA), and two extracellular nucleases (SdaB and Spd-3). Rgg binds to DNA proximally to the speB promoter (PspeB) to activate transcription; however, it is not known if Rgg binds to the promoters of other genes to influence expression, or if the perturbation of other global regulons accounts for the genome-wide changes in expression associated with the mutant. To address this issue, chromatin immunoprecipitation followed by DNA microarray analysis (ChIP-chip) was used to identify the DNA binding sites of Rgg. Rgg bound to 65 sites in the chromosome. Thirty-five were within noncoding DNA, and 43% of these were adjacent to genes previously identified as regulated by Rgg. Electrophoretic mobility shift assays were used to assess the binding of Rgg to a subset of sites bound in vivo, including the noncoding DNA upstream of speB, the genes encoding PulA, Spd-3, and a transcriptional regulator (SPY49_1113), and prophage-associated genes encoding a putative integrase (SPY49_0746) and a surface antigen (SPY49_0396). Rgg bound to all target DNAs in vitro, consistent with the in vivo results. Finally, analyses with a transcriptional reporter system showed that the DNA bound by Rgg contained an active promoter that was regulated by Rgg. Overall, the results indicate that Rgg binds specifically to multiple sites in the chromosome, including prophage DNA, to influence gene expression

    Dps shares its binding sites with other proteins of bacterial nucleoid and has affinity to REP-elements and promoter islands.

    Intersection of Dps targets (A) and sites unbound by Dps (B) with structural and functional elements of bacterial genome was estimated as described above for repeated sequences and plotted as fold ratio to the expected values. Bent black arrows on the bottom schematically show areas occupied by CS or UR. Gray rectangles and numerals inside indicate the expected number of common base pairs if compared modules are independently distributed along the genome. Gray and colored bent arrows show registered overlap calculated in 1 bp resolution. Numerals in parenthesis indicate the size of compared sets. Genomic locations of REP elements were taken from KEGG DataBase (http://www.genome.jp/kegg/, [63]), and fold ratios obtained for 302 REP-sequences containing 1–3 REP-modules (14–100 bp) were plotted. Analyzed ChIP-chip and ChIP-seq data sets were obtained from [46–51] for cells grown in LB medium (LB), M9 medium with fructose (M9) or MOPS minimal medium with glucose (GMM), harvested at early (EE), middle (ME) or late (LE) exponential phase or upon transition to the steady growth (TS).
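
The fold ratio of observed to expected overlap mentioned in this caption can be sketched as follows. This is an illustrative reading, not the authors' procedure (which the caption only references as "described above"): it assumes half-open `(start, end)` intervals, a 1 bp resolution, and the simple independence expectation of (covered bp in set A × covered bp in set B) / genome length. The function names are hypothetical.

```python
def covered_positions(intervals):
    """Set of base positions covered by half-open (start, end) intervals."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end))
    return covered


def overlap_fold_ratio(set_a, set_b, genome_length):
    """Return (observed bp overlap, expected bp overlap, fold ratio).

    The expected value assumes the two sets are distributed independently
    along a genome of the given length (an assumption of this sketch).
    """
    a = covered_positions(set_a)
    b = covered_positions(set_b)
    observed = len(a & b)
    expected = len(a) * len(b) / genome_length
    fold = observed / expected if expected else float("nan")
    return observed, expected, fold
```

For a toy genome of 1000 bp with one interval (0, 100) in the first set and one interval (50, 150) in the second, the call `overlap_fold_ratio([(0, 100)], [(50, 150)], 1000)` gives an observed overlap of 50 bp against an expected 10 bp, a five-fold enrichment. Building explicit position sets is convenient for a sketch; for genome-scale data an interval-based sweep would use less memory.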

    Large-scale profiles of the Dps targets correlate with the landscape of direct and inverted repeats and the pattern of Fis binding sites.

    A: Distribution of the Dps contact regions along the E. coli MG1655 genome identified by CLC GW in two experiments (the default settings). The areas covered by Dps are combined in 100,000 bp bins and plotted as percentage to the total length of all sites occupied by Dps. B: The same for the contact sites of Fis [49], IHF [51], H-NS [49] and RNAP [50] from the cells grown in conditions similar to those used in our experiments. The plot for Dps shows the distribution of the sites from the combined set (CS) (Page 3, S3 Table). C: The same for direct (5–24 bp separated by 1–15 bp) and inverted (5–18 bp separated by 3–20 bp) repeats collected from the genome of E. coli MG1655 using Unipro UGENE [53].

    Dps-binding sites are enriched with inverted repeats.

    The overlap between sequences of CS (colored box-plots) and UR (gray boxes) with direct or inverted repeats was characterized by the parameter K_ij as described in the text. Black dot on the right panel shows one outlier. Box-plots with statistically significant differences are provided with corresponding p-values. Regions of bound and unbound sets overlapping with repeated sequences of both types are indicated in S3 Table.

    Deletion of the dps gene affects rpoA and rpoB expression.

    A: Profiles of the Dps binding sites obtained in two experiments (indicated) for the genomic region with three operons of ribosomal genes (running window of nine 35 bp bins). Genes are represented by blue horizontal arrows; magenta lines show ribosomal operons. Vertical arrows mark locations of inverted repeats (if longer than 7 bp). B: Band shift assays performed for indicated genomic loci. The regulatory region of the dps gene was used as a positive control for all band shift assays in this study. Fragment from the lacZ coding sequence was used as a reference gene for qRT-PCR experiments. Positioning of primers for amplification (F and R) is indicated in panel A of this figure, Fig 6A (for the dps regulatory region) and in S3 Fig (for rpoA and lacZ). C: Changes in the expression efficiency of selected genes in response to dps deletion. Primers used for reverse transcription and consecutive PCR are designated as RT and PCR, respectively, here and all other figures. Expression levels were estimated based on 3 and 5 biological samples (3–18 technical repeats in each) for rpoA and rpoD, respectively. Error bars show an average deviation. Statistical significance was assessed using Student’s t-test.

    Comparative Genomic Analysis of the Hexuronate Metabolism Genes and Their Regulation in Gammaproteobacteria

    The hexuronate metabolism in Escherichia coli is regulated by two related transcription factors from the FadR subfamily of the GntR family, UxuR and ExuR. UxuR controls the d-glucuronate metabolism, while ExuR represses genes involved in the metabolism of all hexuronates. We use a comparative genomics approach to reconstruct the hexuronate metabolic pathways and transcriptional regulons in gammaproteobacteria. We demonstrate differences in the binding motifs of UxuR and ExuR, identify new candidate members of the UxuR/ExuR regulons, and describe the links between the UxuR/ExuR regulons and the adjacent regulons UidR, KdgR, and YjjM. We provide experimental evidence that two predicted members of the UxuR regulon, yjjM and yjjN, are the subject of complex regulation by this transcription factor in E. coli