548 research outputs found
The Genomic Pattern of tDNA Operon Expression in E. coli
In fast-growing microorganisms, a tRNA concentration profile enriched in major isoacceptors selects for the biased usage of cognate codons. This optimizes translational rate for the least mass invested in the translational apparatus. Such translational streamlining is thought to be growth-regulated, but its genetic basis is poorly understood. First, we found in reanalysis of the E. coli tRNA profile that the degree to which it is translationally streamlined is nearly invariant with growth rate. Then, using least squares multiple regression, we partitioned tRNA isoacceptor pools to predicted tDNA operons from the E. coli K12 genome. Co-expression of tDNAs in operons explains the tRNA profile significantly better than tDNA gene dosage alone. Also, operon expression increases significantly with proximity to the origin of replication, oriC, at all growth rates. Genome location explains about 15% of expression variation in a form, at a given growth rate, that is consistent with replication-dependent gene concentration effects. Yet the change in the tRNA profile with growth rate is less than would be expected from such effects. We estimated per-copy expression rates for all tDNA operons that were consistent with independent estimates for rDNA operons. We also found that tDNA operon location, and the location dependence of expression, were significantly different in the leading and lagging strands. The operonic organization and genomic location of tDNA operons are significant factors influencing their expression. Nonrandom patterns of location and strandedness shown by tDNA operons in E. coli suggest that their genomic architecture may be under selection to satisfy physiological demand for tRNA expression at high growth rates
Simulation and analysis of in vitro DNA evolution
We study theoretically the in vitro evolution of a DNA sequence by binding to
a transcription factor. Using a simple model of protein-DNA binding and
available binding constants for the Mnt protein, we perform large-scale,
realistic simulations of evolution starting from a single DNA sequence. We
identify different parameter regimes characterized by distinct evolutionary
behaviors. For each regime we find analytical estimates which agree well with
simulation results. For small population sizes, the DNA evolutional path is a
random walk on a smooth landscape. While for large population sizes, the
evolution dynamics can be well described by a mean-field theory. We also study
how the details of the DNA-protein interaction affect the evolution.Comment: 11 pages, 11 figures. Submitted to PNA
A systematic model to predict transcriptional regulatory mechanisms based on overrepresentation of transcription factor binding profiles
An important aspect of understanding a biological pathway is to delineate the transcriptional regulatory mechanisms of the genes involved. Two important tasks are often encountered when studying transcription regulation, i.e., (1) the identification of common transcriptional regulators of a set of coexpressed genes; (2) the identification of genes that are regulated by one or several transcription factors. In this study, a systematic and statistical approach was taken to accomplish these tasks by establishing an integrated model considering all of the promoters and characterized transcription factors (TFs) in the genome. A promoter analysis pipeline (PAP) was developed to implement this approach. PAP was tested using coregulated gene clusters collected from the literature. In most test cases, PAP identified the transcription regulators of the input genes accurately. When compared with chromatin immunoprecipitation experiment data, PAP's predictions are consistent with the experimental observations. When PAP was used to analyze one published expression-profiling data set and two novel coregulated gene sets, PAP was able to generate biologically meaningful hypotheses. Therefore, by taking a systematic approach of considering all promoters and characterized TFs in our model, we were able to make more reliable predictions about the regulation of gene expression in mammalian organisms
A modified bacterial one-hybrid system yields improved quantitative models of transcription factor specificity
We examine the use of high-throughput sequencing on binding sites recovered using a bacterial one-hybrid (B1H) system and find that improved models of transcription factor (TF) binding specificity can be obtained compared to standard methods of sequencing a small subset of the selected clones. We can obtain even more accurate binding models using a modified version of B1H selection method with constrained variation (CV-B1H). However, achieving these improved models using CV-B1H data required the development of a new method of analysis—GRaMS (Growth Rate Modeling of Specificity)—that estimates bacterial growth rates as a function of the quality of the recognition sequence. We benchmark these different methods of motif discovery using Zif268, a well-characterized C2H2 zinc-finger TF on both a 28 bp randomized library for the standard B1H method and on 6 bp randomized library for the CV-B1H method for which 45 different experimental conditions were tested: five time points and three different IPTG and 3-AT concentrations. We find that GRaMS analysis is robust to the different experimental parameters whereas other analysis methods give widely varying results depending on the conditions of the experiment. Finally, we demonstrate that the CV-B1H assay can be performed in liquid media, which produces recognition models that are similar in quality to sequences recovered from selection on solid media
The effects of cognitive defusion and thought distraction on emotional discomfort and believability of negative self-referential thoughts
Previous research has shown that rapid vocal repetition of a one-word version of negative self-referential thought reduces the stimulus functions (e.g., emotional discomfort and believability) associated with that thought. The present study compares the effects of that defusion strategy with thought distraction and distraction-based experimental control tasks on a negative self-referential thought. Non-clinical undergraduates were randomly assigned to one of three protocols. The cognitive defusion condition reduced the emotional discomfort and believability of negative self-referential thoughts significantly greater than comparison conditions. Favorable results were also found for the defusion technique with participants with elevated depressive symptoms
Autoregulation of yeast ribosomal proteins discovered by efficient search for feedback regulation
Post-transcriptional autoregulation of gene expression is common in bacteria but many fewer examples are known in eukaryotes. We used the yeast collection of genes fused to GFP as a rapid screen for examples of feedback regulation in ribosomal proteins by overexpressing a non-regulatable version of a gene and observing the effects on the expression of the GFP-fused version. We tested 95 ribosomal protein genes and found a wide continuum of effects, with 30% showing at least a 3-fold reduction in expression. Two genes, RPS22B and RPL1B, showed over a 10-fold repression. In both cases the cis-regulatory segment resides in the 5\u27 UTR of the gene as shown by placing that segment of the mRNA upstream of GFP alone and demonstrating it is sufficient to cause repression of GFP when the protein is over-expressed. Further analyses showed that the intron in the 5\u27 UTR of RPS22B is required for regulation, presumably because the protein inhibits splicing that is necessary for translation. The 5\u27 UTR of RPL1B contains a sequence and structure motif that is conserved in the binding sites of Rpl1 orthologs from bacteria to mammals, and mutations within the motif eliminate repression
Novel Algorithms Reveal Streptococcal Transcriptomes and Clues about Undefined Genes
Bacteria–host interactions are dynamic processes, and understanding transcriptional responses that directly or indirectly regulate the expression of genes involved in initial infection stages would illuminate the molecular events that result in host colonization. We used oligonucleotide microarrays to monitor (in vitro) differential gene expression in group A streptococci during pharyngeal cell adherence, the first overt infection stage. We present neighbor clustering, a new computational method for further analyzing bacterial microarray data that combines two informative characteristics of bacterial genes that share common function or regulation: (1) similar gene expression profiles (i.e., co-expression); and (2) physical proximity of genes on the chromosome. This method identifies statistically significant clusters of co-expressed gene neighbors that potentially share common function or regulation by coupling statistically analyzed gene expression profiles with the chromosomal position of genes. We applied this method to our own data and to those of others, and we show that it identified a greater number of differentially expressed genes, facilitating the reconstruction of more multimeric proteins and complete metabolic pathways than would have been possible without its application. We assessed the biological significance of two identified genes by assaying deletion mutants for adherence in vitro and show that neighbor clustering indeed provides biologically relevant data. Neighbor clustering provides a more comprehensive view of the molecular responses of streptococci during pharyngeal cell adherence
Statistical mechanics of transcription-factor binding site discovery using Hidden Markov Models
Hidden Markov Models (HMMs) are a commonly used tool for inference of
transcription factor (TF) binding sites from DNA sequence data. We exploit the
mathematical equivalence between HMMs for TF binding and the "inverse"
statistical mechanics of hard rods in a one-dimensional disordered potential to
investigate learning in HMMs. We derive analytic expressions for the Fisher
information, a commonly employed measure of confidence in learned parameters,
in the biologically relevant limit where the density of binding sites is low.
We then use techniques from statistical mechanics to derive a scaling principle
relating the specificity (binding energy) of a TF to the minimum amount of
training data necessary to learn it.Comment: 25 pages, 2 figures, 1 table V2 - typos fixed and new references
adde
Recognition models to predict DNA-binding specificities of homeodomain proteins
Motivation: Recognition models for protein-DNA interactions, which allow the prediction of specificity for a DNA-binding domain based only on its sequence or the alteration of specificity through rational design, have long been a goal of computational biology. There has been some progress in constructing useful models, especially for C2H2 zinc finger proteins, but it remains a challenging problem with ample room for improvement. For most families of transcription factors the best available methods utilize k-nearest neighbor (KNN) algorithms to make specificity predictions based on the average of the specificities of the k most similar proteins with defined specificities. Homeodomain (HD) proteins are the second most abundant family of transcription factors, after zinc fingers, in most metazoan genomes, and as a consequence an effective recognition model for this family would facilitate predictive models of many transcriptional regulatory networks within these genomes
A Semi-Supervised Method for Predicting Transcription Factor–Gene Interactions in Escherichia coli
While Escherichia coli has one of the most comprehensive datasets of experimentally verified transcriptional regulatory interactions of any organism, it is still far from complete. This presents a problem when trying to combine gene expression and regulatory interactions to model transcriptional regulatory networks. Using the available regulatory interactions to predict new interactions may lead to better coverage and more accurate models. Here, we develop SEREND (SEmi-supervised REgulatory Network Discoverer), a semi-supervised learning method that uses a curated database of verified transcriptional factor–gene interactions, DNA sequence binding motifs, and a compendium of gene expression data in order to make thousands of new predictions about transcription factor–gene interactions, including whether the transcription factor activates or represses the gene. Using genome-wide binding datasets for several transcription factors, we demonstrate that our semi-supervised classification strategy improves the prediction of targets for a given transcription factor. To further demonstrate the utility of our inferred interactions, we generated a new microarray gene expression dataset for the aerobic to anaerobic shift response in E. coli. We used our inferred interactions with the verified interactions to reconstruct a dynamic regulatory network for this response. The network reconstructed when using our inferred interactions was better able to correctly identify known regulators and suggested additional activators and repressors as having important roles during the aerobic–anaerobic shift interface
- …