17,017 research outputs found

    Ribosome signatures aid bacterial translation initiation site identification

    Background: While methods for annotation of genes are increasingly reliable, the exact identification of translation initiation sites remains a challenging problem. Since the N-termini of proteins often contain regulatory and targeting information, developing a robust method for start site identification is crucial. Ribosome profiling reads show distinct patterns of read length distributions around translation initiation sites. These patterns are typically lost in standard ribosome profiling analysis pipelines, when reads from footprints are adjusted to determine the specific codon being translated. Results: Utilising these signatures in combination with nucleotide sequence information, we build a model capable of predicting translation initiation sites and demonstrate its high accuracy using N-terminal proteomics. Applying this to prokaryotic translatomes, we re-annotate translation initiation sites and provide evidence of N-terminal truncations and extensions of previously annotated coding sequences. These re-annotations are supported by the presence of structural and sequence-based features next to N-terminal peptide evidence. Finally, our model identifies 61 novel genes previously undiscovered in the Salmonella enterica genome. Conclusions: Signatures within ribosome profiling read length distributions can be used in combination with nucleotide sequence information to provide accurate genome-wide identification of translation initiation sites

    Discovery of noncanonical translation initiation sites through mass spectrometric analysis of protein N termini

    Translation initiation generally occurs at AUG codons in eukaryotes, although it has been shown that non-AUG or non-canonical translation initiation can also occur. However, the evidence for noncanonical translation initiation sites (TISs) is largely indirect and based on ribosome profiling (Ribo-seq) studies. Here, using a strategy specifically designed to enrich N termini of proteins, we demonstrate that many human proteins are translated at noncanonical TISs. The large majority of TISs that mapped to 5' untranslated regions were noncanonical and led to N-terminal extension of annotated proteins or translation of upstream small open reading frames (uORFs). It has been controversial whether the amino acid corresponding to the start codon is incorporated at the TIS or methionine is still incorporated. We found that methionine was incorporated at almost all noncanonical TISs identified in this study. Comparison of the TISs determined through mass spectrometry with ribosome profiling data revealed that about two-thirds of the novel annotations were indeed supported by the available ribosome profiling data. Sequence conservation across species and, in some cases, a higher abundance of noncanonical TISs than canonical ones suggest that the noncanonical TISs can have biological functions. Overall, this study provides evidence of protein translation initiation at noncanonical TISs and argues that further studies are required for elucidation of functional implications of such noncanonical translation initiation

    Long non-coding RNA SOX2OT: Expression signature, splicing patterns, and emerging roles in pluripotency and tumorigenesis

    SOX2 overlapping transcript (SOX2OT) is a long non-coding RNA which harbors one of the major regulators of pluripotency, the SOX2 gene, in its intronic region. The SOX2OT gene maps to the human chromosome 3q26.3 (Chr3q26.3) locus and extends across a highly conserved region of over 700 kb. Little is known about the exact role of SOX2OT; however, recent studies have demonstrated a positive role for it in transcription regulation of the SOX2 gene. Similar to SOX2, SOX2OT is highly expressed in embryonic stem cells and down-regulated upon the induction of differentiation. SOX2OT is dynamically regulated during the embryogenesis of vertebrates, and restricted to the brain in adult mice and humans. Recently, the dysregulation of SOX2OT expression and its concomitant expression with SOX2 have been highlighted in some somatic cancers including esophageal squamous cell carcinoma, lung squamous cell carcinoma, and breast cancer. Interestingly, SOX2OT is differentially spliced into multiple mRNA-like transcripts in stem and cancer cells. In this review, we describe the structural and functional features of SOX2OT, with an emphasis on its expression signature, its splicing patterns and its critical function in the regulation of SOX2 expression during development and tumorigenesis. © 2015 Shahryari, Saghaeian Jazi, Samaei and Mowla

    Balancing noise and plasticity in eukaryotic gene expression

    Coupling the control of expression stochasticity (noise) to the ability of expression change (plasticity) can alter gene function and influence adaptation. A number of factors, such as transcription re-initiation, strong chromatin regulation or genome neighboring organization, underlie this coupling. However, these factors do not necessarily combine in equivalent ways and strengths in all genes. Can we then identify alternative architectures that modulate the linkage of noise and plasticity in distinct ways? Here we first show that strong chromatin regulation, commonly viewed as a source of coupling, can lead to plasticity without noise. The nature of this regulation is relevant too, with plastic but noiseless genes being subjected to general activators whereas plastic and noisy genes experience more specific repression. In contrast, in genes exhibiting poor transcriptional control, it is translational efficiency that separates noise from plasticity, a pattern related to transcript length. This additionally implies that genome neighboring organization, as a modifier, appears effective only in highly plastic genes. In this class, we confirm bidirectional promoters (bipromoters) as a configuration capable of reducing coupling by abating noise but also reveal an important trade-off, since bipromoters also decrease plasticity. This ultimately presents a paradox between intergenic distances and modulation, with short intergenic distances both associated with and dissociated from noise at different plasticity levels. Balancing the coupling among different types of expression variability appears as a potential shaping force of genome regulation and organization. This is reflected in the use of different control strategies at genes with different sets of functional constraints

    What Is the Impact of mRNA 5′ TL Heterogeneity on Translational Start Site Selection and the Mammalian Cellular Phenotype?

    A major determinant in the efficiency of ribosome loading onto mRNAs is the 5′ TL (transcript leader or 5′ UTR). In addition, elements within this region also impact on start site selection, demonstrating that it can modulate the protein readout at both quantitative and qualitative levels. With the increasing wealth of data generated by the mining of the mammalian transcriptome, it has become evident that a gene's 5′ TL is not homogeneous but actually exhibits significant heterogeneity. This arises due to the utilization of alternative promoters, and is further compounded by significant variability with regard to the precise transcriptional start sites of each (not to mention alternative splicing). Consequently, the transcript for a protein coding gene is not a unique mRNA, but in fact a complex quasi-species of variants whose composition may respond to the changing physiological environment of the cell. Here we examine the potential impact of these events with regard to the protein readout

    Discovery of a small arterivirus gene that overlaps the GP5 coding sequence and is important for virus production

    The arterivirus family (order Nidovirales) of single-stranded, positive-sense RNA viruses includes porcine respiratory and reproductive syndrome virus and equine arteritis virus (EAV). Their replicative enzymes are translated from their genomic RNA, while their seven structural proteins are encoded by a set of small, partially overlapping genes in the genomic 3′-proximal region. The latter are expressed via synthesis of a set of subgenomic mRNAs that, in general, are functionally monocistronic (except for a bicistronic mRNA encoding the E and GP2 proteins). ORF5, which encodes the major glycoprotein GP5, has been used extensively for phylogenetic analyses. However, an in-depth computational analysis now reveals the arterivirus-wide conservation of an additional AUG-initiated ORF, here termed ORF5a, that overlaps the 5′ end of ORF5. The pattern of substitutions across sequence alignments indicated that ORF5a is subject to functional constraints at the amino acid level, while an analysis of substitutions at synonymous sites in ORF5 revealed a greatly reduced frequency of substitution in the portion of ORF5 that is overlapped by ORF5a. The 43–64 aa ORF5a protein and GP5 are probably expressed from the same subgenomic mRNA, via a translation initiation mechanism involving leaky ribosomal scanning. Inactivation of ORF5a expression by reverse genetics yielded a severely crippled EAV mutant, which displayed lower titres and a tiny plaque phenotype. These defects, which could be partially complemented in ORF5a-expressing cells, indicate that the novel protein, which may be the eighth structural protein of arteriviruses, is expressed and important for arterivirus infection

    Identification of novel post-transcriptional features in olfactory receptor family mRNAs.

    Olfactory receptor (Olfr) genes comprise the largest gene family in mice. Despite their importance in olfaction, how most Olfr mRNAs are regulated remains unexplored. Using RNA-seq analysis coupled with analysis of pre-existing databases, we found that Olfr mRNAs have several atypical features suggesting that post-transcriptional regulation impacts their expression. First, Olfr mRNAs, as a group, have dramatically higher average AU-content and lower predicted secondary structure than do control mRNAs. Second, Olfr mRNAs have a higher density of AU-rich elements (AREs) in their 3' UTR and upstream open reading frames (uORFs) in their 5' UTR than do control mRNAs. Third, Olfr mRNAs have shorter 3' UTRs with fewer predicted miRNA-binding sites. All of these novel properties correlated with higher Olfr expression. We also identified striking differences in the post-transcriptional features of the mRNAs from the two major classes of Olfr genes, a finding consistent with their independent evolutionary origin. Together, our results suggest that the Olfr gene family has encountered unusual selective forces in neural cells that have driven them to acquire unique post-transcriptional regulatory features. In support of this possibility, we found that while Olfr mRNAs are degraded by a deadenylation-dependent mechanism, they are largely protected from this decay in neural lineage cells

    Gene expression regulation by upstream open reading frames in rare diseases

    Upstream open reading frames (uORFs) constitute a class of cis-acting elements that regulate translation initiation. Mutations or polymorphisms that alter, create or disrupt a uORF have been widely associated with several human disorders, including rare diseases. In this mini-review, we intend to highlight the mechanisms associated with the uORF-mediated translational regulation and describe recent examples of their deregulation in the etiology of human rare diseases. Additionally, we discuss new insights arising from ribosome profiling studies and reporter assays regarding uORF features and their intrinsic role in translational regulation. This type of knowledge is of utmost importance for designing and implementing new or improved diagnostic and/or treatment strategies for uORF-related human disorders. This work was partially supported by Fundação para a Ciência e a Tecnologia (UID/MULTI/04046/2013 to BioISI from FCT/MCTES/PIDDAC). JS and RF are supported by fellowships from Fundação para a Ciência e a Tecnologia (SFRH/BD/106081/2015 and SFRH/BD/114392/2016, respectively).