118 research outputs found

    A Plausible Role for the Presence of Internal Shine-Dalgarno Sites

    The presence of nucleotide hybridization between the 3′ end of 16S rRNA and mRNA sequence upstream of the start codon is well known in bacteria. In this paper, we detect the presence of such hybridization sites inside the coding regions of E. coli genes, and analyze their proximity to clusters of slow-translating codons. We study this phenomenon in genes of high and low expression separately. Based on our findings, we propose an explanation for the presence of RNA hybridization within the translated regions of bacterial genes.

    PairWise Neighbours database: overlaps and spacers among prokaryote genomes

    Background: Although prokaryotes live in a variety of habitats and differ in metabolic and genomic complexity, they have several genomic architectural features in common. Overlapping genes are a common feature of prokaryote genomes. Overlaps tend to be short because longer overlaps carry a higher risk of deleterious mutations. The spacers between genes also tend to be short because prokaryotes tend to reduce noncoding DNA. However, spacers must be long enough to maintain essential regulatory signals such as the Shine-Dalgarno (SD) sequence, which is responsible for efficient translation. Description: PairWise Neighbours is an interactive and intuitive database for retrieving information about the spacers and overlapping genes among bacterial and archaeal genomes. It contains 1,956,294 gene pairs from 678 fully sequenced prokaryote genomes and is freely available at http://genomes.urv.cat/pwneigh. The database provides information about the overlaps and their conservation across species. Furthermore, it allows broad analysis of the intergenic regions, providing useful information such as the location and strength of the SD sequence. Conclusion: Some experiments and bioinformatic analyses rely on correct annotation of the initiation site. Therefore, a database that studies the overlaps and spacers among prokaryotes appears to be desirable. The PairWise Neighbours database permits reliability analysis of the overlapping structures and study of SD presence and location among adjacent genes, which may help to check the annotation of initiation sites.
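
    The distinction between overlaps and spacers described in this abstract can be illustrated with a minimal sketch. It assumes 1-based inclusive gene coordinates on a single replicon and ignores strand; the `Gene` record and both function names are hypothetical and do not reflect the database's actual schema or code.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record: 1-based, inclusive coordinates."""
    name: str
    start: int
    end: int


def neighbour_relation(upstream: Gene, downstream: Gene) -> Tuple[str, int]:
    """Classify an adjacent pair as an overlap or a spacer, in base pairs.

    Assumes upstream.start <= downstream.start; strand is ignored.
    """
    # Shared bases; min() handles a gene nested inside its neighbour.
    shared = min(upstream.end, downstream.end) - downstream.start + 1
    if shared > 0:
        return ("overlap", shared)
    # Bases strictly between the two genes (0 means they abut).
    return ("spacer", downstream.start - upstream.end - 1)


def pairwise_neighbours(genes: List[Gene]) -> List[Tuple[str, str, str, int]]:
    """Sort genes by start and report each consecutive pair."""
    ordered = sorted(genes, key=lambda g: g.start)
    return [
        (a.name, b.name, *neighbour_relation(a, b))
        for a, b in zip(ordered, ordered[1:])
    ]
```

    For example, genes at 1..100 and 97..200 share 4 bases, while genes at 97..200 and 215..300 are separated by a 14-base spacer.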

    Mathematical modeling of translation initiation for the estimation of its efficiency to computationally design mRNA sequences with desired expression levels in prokaryotes

    Background: Within the emerging field of synthetic biology, engineering paradigms have recently been used to design biological systems with novel functionalities. One of the essential challenges hampering the construction of such systems is the need to precisely optimize protein expression levels for robust operation. However, it is difficult to design mRNA sequences for expression at targeted protein levels, since even a few nucleotide modifications around the start codon may alter translational efficiency and dramatically (up to 250-fold) change protein expression. Previous studies have used ad hoc approaches (e.g., random mutagenesis) to obtain the desired translational efficiencies for mRNA sequences. Hence, the development of a mathematical methodology capable of estimating translational efficiency would greatly facilitate the future design of mRNA sequences aimed at yielding desired protein expression levels. Results: We herein propose a mathematical model that focuses on translation initiation, which is the rate-limiting step in translation. The model uses mRNA-folding dynamics and ribosome-binding dynamics to estimate translational efficiencies solely from mRNA sequence information. We confirmed the feasibility of our model using previously reported expression data on the MS2 coat protein. For further confirmation, we used our model to design 22 luxR mRNA sequences predicted to have diverse translation efficiencies ranging from 10⁻⁵ to 1. The expression levels of these sequences were measured in Escherichia coli and found to be highly correlated (R² = 0.87) with their estimated translational efficiencies. Moreover, we used our computational method to successfully transform a low-expressing DsRed2 mRNA sequence into a high-expressing mRNA sequence by maximizing its translational efficiency through the modification of only eight nucleotides upstream of the start codon. Conclusions: We herein describe a mathematical model that uses mRNA sequence information to estimate translational efficiency. This model could be used to design best-fit mRNA sequences having a desired protein expression level, thereby facilitating protein over-production in biotechnology or the protein expression-level optimization necessary for the construction of robust networks in synthetic biology.

    Predicting Shine–Dalgarno Sequence Locations Exposes Genome Annotation Errors

    In prokaryotes, Shine–Dalgarno (SD) sequences, nucleotides upstream from start codons on messenger RNAs (mRNAs) that are complementary to ribosomal RNA (rRNA), facilitate the initiation of protein synthesis. The location of SD sequences relative to start codons and the stability of the hybridization between the mRNA and the rRNA correlate with the rate of synthesis. Thus, accurate characterization of SD sequences enhances our understanding of how an organism's transcriptome relates to its cellular proteome. We implemented the Individual Nearest Neighbor Hydrogen Bond model for oligo–oligo hybridization and created a new metric, relative spacing (RS), to identify both the location and the hybridization potential of SD sequences by simulating the binding between mRNAs and single-stranded 16S rRNA 3′ tails. In 18 prokaryote genomes, we identified 2,420 genes out of 58,550 where the strongest binding in the translation initiation region included the start codon, deviating from the expected location for the SD sequence of five to ten bases upstream. We designated these as RS+1 genes. Additional analysis uncovered an unusual bias of the start codon in that the majority of the RS+1 genes used GUG, not AUG. Furthermore, of the 624 RS+1 genes whose SD sequence was associated with a free energy release of less than −8.4 kcal/mol (strong RS+1 genes), 384 were within 12 nucleotides upstream of in-frame initiation codons. The most likely explanation for the unexpected location of the SD sequence for these 384 genes is mis-annotation of the start codon. In this way, the new RS metric provides an improved method for gene sequence annotation. The remaining strong RS+1 genes appear to have their SD sequences in an unexpected location that includes the start codon. Thus, our RS metric provides a new way to explore the role of rRNA–mRNA nucleotide hybridization in translation initiation.
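
    The idea of locating an SD sequence relative to the start codon can be sketched in a much simpler form than the one used in this paper. The example below is neither the Individual Nearest Neighbor Hydrogen Bond model nor the RS metric: it swaps in a plain match count against an assumed E. coli-like consensus, `AGGAGG`, and the function name and return format are hypothetical.

```python
SD_CONSENSUS = "AGGAGG"  # assumption: E. coli-like consensus, DNA alphabet


def best_sd_match(upstream: str, motif: str = SD_CONSENSUS):
    """Find the window upstream of a start codon that best matches `motif`.

    `upstream` is read 5'->3' and must end immediately before the start
    codon. Returns (matches, spacing, site), where `spacing` is the number
    of bases between the end of the site and the start codon. Ties keep the
    most upstream window. This is a match count, not a free-energy estimate.
    """
    seq = upstream.upper().replace("U", "T")  # accept RNA or DNA input
    k = len(motif)
    best = (0, None, "")
    for i in range(len(seq) - k + 1):
        site = seq[i:i + k]
        matches = sum(a == b for a, b in zip(site, motif))
        spacing = len(seq) - (i + k)
        if matches > best[0]:
            best = (matches, spacing, site)
    return best
```

    A gene whose best-scoring window sits well outside the expected five to ten bases upstream would be a candidate for closer inspection of its annotated start codon, in the spirit of the analysis above.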

    Highly expressed proteins have an increased frequency of alanine in the second amino acid position

    BACKGROUND: Although the sequence requirements for translation initiation regions have been frequently analysed, usually the highly expressed genes are not treated as a separate dataset. RESULTS: To investigate this, we analysed the mRNA regions downstream of initiation codons in nine bacteria, three archaea and three unicellular eukaryotes, comparing the dataset of highly expressed genes to the dataset of all genes. In addition to the detailed analysis of the nucleotide and codon frequencies we compared the N-termini of highly expressed proteins to the N-termini of all proteins coded in the genome. CONCLUSION: The most conserved pattern was observed at the amino acid level: strong alanine over-representation was observed at the second amino acid position of highly expressed proteins. This pattern is well conserved in all three domains of life.

    Analysis of Free Energy Signals Arising from Nucleotide Hybridization Between rRNA and mRNA Sequences during Translation in Eubacteria

    A decoding algorithm is tested that mechanistically models the progressive alignments that arise as the mRNA moves past the rRNA tail during translation elongation. Each of these alignments provides an opportunity for hybridization between the single-stranded, 3′-terminal nucleotides of the 16S rRNA and the spatially accessible window of mRNA sequence, from which a free energy value can be calculated. Using this algorithm we show that a periodic, energetic pattern of frequency 1/3 is revealed. This periodic signal exists in the majority of coding regions of eubacterial genes, but not in the non-coding regions encoding the 16S and 23S rRNAs. Signal analysis reveals that the population of coding regions of each bacterial species has a mean phase that is correlated in a statistically significant way with species (G + C) content. These results suggest that the periodic signal could function as a synchronization signal for the maintenance of reading frame and that codon usage provides a mechanism for manipulation of signal phase.
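
    The detection of a periodic component of frequency 1/3 can be illustrated with a single-frequency discrete Fourier transform. This is a generic signal-processing sketch, not the paper's decoding algorithm: it assumes a list of per-nucleotide free energy values has already been computed by some other means, and the function name is hypothetical.

```python
import cmath
import math


def component_at_frequency(signal, freq=1 / 3):
    """Return (magnitude, phase) of `signal` at `freq` cycles per position.

    The mean is removed first so a constant offset contributes nothing,
    and the coefficient is normalised by the signal length.
    """
    n = len(signal)
    mean = sum(signal) / n
    coeff = sum(
        (x - mean) * cmath.exp(-2j * math.pi * freq * pos)
        for pos, x in enumerate(signal)
    ) / n
    return abs(coeff), cmath.phase(coeff)
```

    A signal that repeats every three positions yields a clearly non-zero magnitude, while a flat signal yields a magnitude near zero. The returned phase is the quantity one would compare across genes or species in an analysis like the one described above.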

    Investigation of the length distributions of coding and noncoding sequences in relation to gene architecture, function, and expression

    The last 20 years have seen the birth of bioinformatics, defined as the combination of mathematics, biology, and computational approaches. This discipline has led to the era of ontology, extensive databases (including sequences, structures, expression profiles, and genomes), and database cross-referencing (Ouzounis, 2012). Before this discipline, scientists referenced atlas books, such as Margaret Dayhoff’s protein sequence collection (Strasser, 2010), which required long hours of letter counting. Through the development of sequencing technology over the past forty years, a tremendous amount of genomic sequencing data has already been collected. As the volume of such data surges, so do the challenges of data organisation, accessibility and interpretation, with interpretation being the most challenging (Ouzounis, 2012).

    Analysis and Prediction of Translation Rate Based on Sequence and Functional Features of the mRNA

    Protein concentrations depend not only on the mRNA level, but also on the translation rate and the degradation rate. Prediction of an mRNA's translation rate would provide valuable information for in-depth understanding of the translation mechanism and the dynamic proteome. In this study, we developed a new computational model to predict the translation rate, characterized by (1) integrating various sequence-derived and functional features, (2) applying the maximum relevance & minimum redundancy method and incremental feature selection to select features that optimize the prediction model, and (3) being able to classify the translation rate of an mRNA as high or low. The prediction accuracies under rich and starvation conditions were 68.8% and 70.0%, respectively, evaluated by jackknife cross-validation. The following features were found to be correlated with translation rate: codon usage frequency, some gene ontology enrichment scores, number of RNA binding proteins known to bind its mRNA product, coding sequence length, protein abundance and 5′UTR free energy. These findings might provide useful information for understanding the mechanisms of translation and the dynamic proteome. Our translation rate prediction model might become a high-throughput tool for annotating the translation rate of mRNAs on a large scale.

    Analysis of cis-acting expression determinants of the tobacco psbA 5’UTR in vivo

    Chloroplast gene expression is predominantly regulated at the posttranscriptional levels of mRNA stability and translation efficiency. The expression of psbA, an important photosynthesis-related chloroplast gene, has been shown to be regulated via its 5’-untranslated region (UTR). Some cis-acting elements within this 5’UTR and the associated trans-acting factors have been defined in Chlamydomonas. However, no in vivo evidence with respect to the cis-acting elements of the psbA 5’UTR has so far been obtained in higher plants such as tobacco. To address this, we generated a series of mutants of the tobacco psbA 5’UTR by base alterations and sequence deletions, with special regard to the stem-loop structure and the putative target sites for ribosome association and binding of nuclear regulatory factors. In addition, a versatile plastid transformation vector pKCZ with an insertion site in the inverted repeat region of the plastid genome was constructed. In all constructs, the psbA 5’UTR (Wt or modified) was used as the 5’ leader of the reporter gene uidA under control of the same promoter, Prrn, the promoter of the rRNA operon. Through biolistic DNA delivery to tobacco chloroplasts, transplastomic plants were obtained. DNA and RNA analyses of these transplastomic plants demonstrated that the transgenes aadA and uidA had been correctly integrated into the plastome at the insertion site, and transcribed in discrete sizes. Quantitative assays were also performed to determine the proportion of intact transplastome, the uidA mRNA level, Gus activity, and uidA translation efficiency. The main results are the following: 1) The insertion site at the unique MunI site between two tRNA genes (trnR-ACG and trnN-GUU) is functional. Vector pKCZ offers great flexibility for further DNA manipulations and hence is useful for future applications. 2) The stem-loop of the psbA 5’UTR is required for mRNA stabilisation and translation. All mutants related to this region showed a 2~3 fold decrease in mRNA stability and a 1.5~6 fold reduction in translation efficiency. The function of this stem-loop depends on its correct sequence and secondary conformation. 3) The AU-box of the psbA 5’UTR is a crucial translation determinant. Mutations of this element almost abolished translation (up to 175-fold decrease), but did not significantly affect mRNA accumulation. The regulatory role of the AU-box is sequence-dependent and might be affected by its inner secondary structure. 4) The internal AUG codon of the psbA 5’UTR is unable to initiate translation. An introduction of mRNA translatability from this codon failed to direct translation of the reporter uidA gene, overriding the mutation of the AU-box. 5) The 5’end poly(A) sequence does not confer a distinct regulatory signal. The deletion of this element only insignificantly affected mRNA abundance and translation. However, this mutation might slightly disturb the conformation of the stem-loop, resulting in a moderate decrease in translation efficiency (~1.5 fold). 6) The SD (Shine-Dalgarno)-like RBS (ribosome binding site) of the psbA 5’UTR appears to be an indispensable element for translation initiation. Mutation of this element led to dramatically low expression of the uidA gene as seen by Gus staining. 7) The 5’end structural sequence of the rbcL 5’UTR does not convey a high mRNA stabilising effect to the psbA 5’UTR under light-dark cycling conditions. Their distinct roles appear to be involved in darkness adaptation. Furthermore, with respect to the overall regulatory function of the psbA 5’UTR, two models are proposed, i.e. dual RBS-mediated translation initiation, and cpRBPs-mediated mRNA stability and translation. The mechanisms for mRNA stabilisation entailed by the rbcL 5’UTR are also discussed. Direct repeat-mediated transgene loss after chloroplast transformation and other aspects related to the choice of insertion site and plastid promoter are also analysed.