321 research outputs found

    Computational approach for calculating the probability of eukaryotic translation initiation from ribo-seq data that takes into account leaky scanning

    BACKGROUND: Ribosome profiling (ribo-seq) provides experimental data on the density of elongating or initiating ribosomes at the whole transcriptome level that can be potentially used for estimating absolute levels of translation initiation at individual Translation Initiation Sites (TISs). These absolute levels depend on the mutual organisation of TISs within individual mRNAs. For example, according to the leaky scanning model of translation initiation in eukaryotes, a strong TIS downstream of another strong TIS is unlikely to be productive, since only a few scanning ribosomes would be able to reach the downstream TIS. In order to understand the dependence of translation initiation efficiency on the surrounding nucleotide context, it is important to estimate the strength of TISs independently of their mutual organisation, i.e. to estimate with what probability a ribosome would initiate at a particular TIS. RESULTS: We designed a simple computational approach for estimating the probabilities of ribosomes initiating at individual start codons using ribosome profiling data. The method is based on the widely accepted leaky scanning model of translation initiation in eukaryotes which postulates that scanning ribosomes may skip a start codon if the initiation context is unfavourable and continue on scanning. We tested our approach on three independent ribo-seq datasets obtained in mammalian cultured cells. CONCLUSIONS: Our results suggested that the method successfully discriminates between weak and strong TISs and that the majority of numerous non-AUG TISs reported recently are very weak. Therefore the high frequency of non-AUG TISs observed in ribosome profiling experiments is due to their proximity to mRNA 5′-ends rather than their strength. Detectable translation initiation at non-AUG codons downstream of AUG codons is comparatively infrequent. 
    The leaky scanning method will be useful for the characterization of differences in start codon selection between tissues, developmental stages and in response to stress conditions.
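
The leaky scanning logic in this abstract can be illustrated with a minimal sketch. It is not the authors' published method: the function name `initiation_probabilities`, the `total_flux` parameter, and the default assumption that every scanning ribosome eventually initiates somewhere are all hypothetical choices made for illustration. The idea is that the signal observed at a TIS is divided by the number of ribosomes that were not already captured upstream.

```python
def initiation_probabilities(tis_signals, total_flux=None):
    """Estimate per-TIS initiation probabilities under a simple leaky scanning model.

    tis_signals: initiating-ribosome signal at each TIS, ordered 5' to 3'.
    total_flux:  assumed total number of scanning ribosomes entering the mRNA.
                 Defaults to the sum of all signals, i.e. the (hypothetical)
                 assumption that every ribosome initiates at one of the TISs.
    """
    if total_flux is None:
        total_flux = sum(tis_signals)
    remaining = float(total_flux)  # ribosomes still scanning when a TIS is reached
    probs = []
    for signal in tis_signals:
        if remaining <= 0:
            # No scanning ribosomes are left, so nothing can be estimated here.
            probs.append(0.0)
        else:
            # Fraction of the arriving ribosomes that initiate at this TIS.
            probs.append(min(signal / remaining, 1.0))
        remaining -= signal
    return probs


# Two TISs with very different raw signals but the same estimated strength:
# the downstream site only ever sees the 10 ribosomes that leaked past the first.
print(initiation_probabilities([90, 9], total_flux=100))
```

In this toy case the downstream TIS has a tenfold lower raw signal, yet both sites capture the same fraction of the ribosomes that reach them, which is the distinction between absolute initiation level and TIS strength that the abstract draws.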

    Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences

    In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5′ cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized (both for increased coding capacity and potentially also for novel regulatory mechanisms) remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5′ untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.
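
As a small illustration of "codons differing from AUG by a single nucleotide", the sketch below enumerates all such codons. The function name `near_cognate_codons` is hypothetical and the code is not part of the described methodology; it only shows that the six codons named in the abstract are a subset of the nine possible single-nucleotide variants.

```python
def near_cognate_codons(start="AUG", alphabet="ACGU"):
    """Return every codon that differs from `start` at exactly one position."""
    return sorted(
        start[:i] + base + start[i + 1:]
        for i in range(len(start))   # position to mutate
        for base in alphabet         # replacement nucleotide
        if base != start[i]          # skip the unchanged codon
    )


codons = near_cognate_codons()
print(codons)

# The codons highlighted in the abstract are all single-nucleotide variants of AUG.
highlighted = {"CUG", "UUG", "GUG", "ACG", "AUA", "AUU"}
print(highlighted.issubset(codons))
```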

    PausePred and Rfeet: webtools for inferring ribosome pauses and visualizing footprint density from ribosome profiling data

    The process of translation is characterized by irregularities in the local decoding rates of specific mRNA codons. This includes the occurrences of long pauses that can take place when ribosomes decode certain peptide sequences, encounter strong RNA secondary structures, or decode "hungry" codons. Examples are known where such pausing or stalling is used for regulating protein synthesis. This can be achieved at the level of translation via direct alteration of ribosome progression through mRNA or by altering mRNA stability via NoGo decay. Ribosome pausing has also been implicated in the cotranslational folding of proteins. Ribosome profiling data often are used for inferring the locations of ribosome pauses. However, no dedicated online software is available for this purpose. Here we present PausePred (https://pausepred.ucc.ie/), which can be used to infer ribosome pauses from ribosome profiling (Ribo-seq) data. Peaks of ribosome footprint density are scored based on their magnitude relative to the background density within the surrounding area. The scoring allows the comparison of peaks across the transcriptome or genome. In addition to the score, PausePred reports the coordinates of the pause, the footprint density at the pause site, and the surrounding nucleotide sequence. The pauses can be visualized in the context of Ribo-seq and RNA-seq density plots generated for specific transcripts or genomic regions with the Rfeet tool. PausePred does not require input on the location of protein coding ORFs (although gene annotations can be optionally supplied). As a result, it can be used universally and its output does not depend on ever evolving annotations.
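
The idea of scoring a peak relative to the background density around it can be sketched as follows. This is an assumption-laden illustration, not PausePred's actual scoring formula: the function name `pause_scores`, the window size, the threshold `min_score`, and the choice to exclude the candidate position from its own background are all hypothetical.

```python
def pause_scores(density, window=10, min_score=5.0):
    """Return (position, score) pairs for candidate pause sites.

    density:   list of footprint counts per nucleotide position.
    window:    assumed total width of the surrounding area used as background.
    min_score: assumed fold-over-background threshold for reporting a pause.
    """
    half = window // 2
    hits = []
    for pos, count in enumerate(density):
        lo = max(0, pos - half)
        hi = min(len(density), pos + half + 1)
        # Background = neighbouring positions, excluding the candidate itself.
        neighbours = density[lo:pos] + density[pos + 1:hi]
        if not neighbours:
            continue
        background = sum(neighbours) / len(neighbours)
        if background == 0:
            continue  # no coverage nearby, so a fold change is undefined
        score = count / background
        if score >= min_score:
            hits.append((pos, score))
    return hits


# Flat coverage of 2 reads per position with a single spike of 40 at position 10.
profile = [2] * 10 + [40] + [2] * 10
print(pause_scores(profile))
```

Because the score is a ratio to local background rather than a raw count, peaks in lowly and highly covered regions become comparable, which is the property the abstract attributes to the scoring.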

    GWIPS-viz: 2018 update

    The GWIPS-viz browser (http://gwips.ucc.ie/) is an online genome browser tailored for exploring ribosome profiling (Ribo-seq) data. Since its publication in 2014, GWIPS-viz has added Ribo-seq data for an additional 14 genomes, bringing the current total to 23. The integration of new Ribo-seq data has been automated, thereby increasing the number of available tracks to 1792, a 10-fold increase in the last three years. The increase is particularly substantial for data derived from human sources. Following user requests, we added the functionality to download these tracks in bigWig format. We also incorporated new types of data (e.g. TCP-seq) as well as auxiliary tracks from other sources that help with the interpretation of Ribo-seq data. Improvements in the visualization of the data have been carried out, particularly for bacterial genomes, where the Ribo-seq data are now shown in a strand-specific manner. For higher eukaryotic datasets, we provide characteristics of individual datasets using the RUST program, which include the triplet periodicity, sequencing biases and relative inferred A-site dwell times. This information can be used for assessing the quality of Ribo-seq datasets. To improve the power of the signal, we aggregate Ribo-seq data from several studies into Global aggregate tracks for each genome.

    Computational methods for ribosome profiling data analysis

    Since the introduction of the ribosome profiling technique in 2009, its popularity has greatly increased. It is widely used for the comprehensive assessment of gene expression and for studying the mechanisms of regulation at the translational level. As the number of ribosome profiling datasets being produced continues to grow, so too does the need for reliable software that can provide answers to the biological questions it can address. This review describes the computational methods and tools that have been developed to analyze ribosome profiling data at the different stages of the process. It starts with initial routine processing of raw data and follows with more specific tasks such as the identification of translated open reading frames, differential gene expression analysis, or evaluation of local or global codon decoding rates. The review pinpoints challenges associated with each step and explains the ways in which they are currently addressed. In addition, it provides a comprehensive, albeit incomplete, list of publicly available software applicable to each step, which may be a beneficial starting point for those new to ribosome profiling analysis. The outline of current challenges in ribosome profiling data analysis may inspire computational biologists to search for novel, potentially superior, solutions that will improve and expand the bioinformatician's toolbox for ribosome profiling data analysis.

    The GWIPS-viz browser

    GWIPS-viz is a publicly available browser that provides Genome Wide Information on Protein Synthesis through the visualization of ribosome profiling data. Ribosome profiling (Ribo-seq) is a high-throughput technique which isolates fragments of messenger RNA that are protected by the ribosome. The alignment of the ribosome-protected fragments or footprint sequences to the corresponding reference genome and their visualization using GWIPS-viz allows for unique insights into the genome loci that are expressed as potentially translated RNA. The GWIPS-viz browser hosts both Ribo-seq data and corresponding mRNA-seq data from publicly available studies across a number of genomes, avoiding the need for computational processing on the user side. Since its initial publication in 2014, over 1885 tracks have been produced across 24 genomes. This unit describes the navigation of the GWIPS-viz genome browser, the uploading of custom tracks, and the downloading of the Ribo-seq/mRNA-seq alignment data.

    Life tables for global surveillance of cancer survival (the CONCORD programme): data sources and methods

    We set out to estimate net survival trends for 10 common cancers in 279 cancer registry populations in 67 countries around the world, as part of the CONCORD-2 study. Net survival can be interpreted as the proportion of cancer patients who survive up to a given time, after eliminating the impact of mortality from other causes (background mortality). Background mortality varies widely between populations and over time. It was therefore necessary to construct robust life tables that accurately reflected the background mortality in each of the registry populations. Life tables of all-cause mortality rates by single year of age and sex were constructed by calendar year for each population and, when possible, by racial or ethnic sub-groups. We used three different approaches, based on the type of mortality data available from each registry. With death and population counts, we adopted a flexible multivariable modelling approach. With unsmoothed mortality rates, we used the Ewbank relational method. Where no data were available from the registry or a national statistical office, we used the abridged UN Population Division life tables and interpolated these using the Elandt-Johnson method. We also investigated the impact of using state- and race-specific life tables versus national race-specific life tables on estimates of net survival from four adult cancers in the United States (US).

    Trips-Viz: an environment for the analysis of public and user-generated ribosome profiling data

    Trips-Viz (https://trips.ucc.ie/) is an interactive platform for the analysis and visualization of ribosome profiling (Ribo-Seq) and shotgun RNA sequencing (RNA-seq) data. This includes publicly available and user-generated data, hence Trips-Viz can be classified as a database and as a server. As a database, it provides access to many processed Ribo-Seq and RNA-seq datasets aligned to reference transcriptomes, a collection which has been expanded considerably since its inception. Here, we focus on the server functionality of Trips-Viz, which has also been greatly improved. Trips-Viz now enables visualisation of proteomics data from a large number of processed mass spectrometry datasets. These data can be used to support translation inferred from Ribo-Seq data. Users are now able to upload a custom reference transcriptome as well as data types other than Ribo-Seq/RNA-Seq. Incorporating custom data has been streamlined with RiboGalaxy (https://ribogalaxy.ucc.ie/) integration. The other new functionality is the rapid detection of translated open reading frames (ORFs) through a simple, easy-to-use interface. The analysis of differential expression has also been improved via integration of DESeq2 and Anota2seq, in addition to a number of other improvements to existing Trips-Viz features.

    GWIPS-viz: development of a ribo-seq genome browser

    We describe the development of GWIPS-viz (http://gwips.ucc.ie), an online genome browser for viewing ribosome profiling data. Ribosome profiling (ribo-seq) is a recently developed technique that provides genome-wide information on protein synthesis (GWIPS) in vivo. It is based on the deep sequencing of ribosome-protected messenger RNA (mRNA) fragments, which allows the ribosome density along all mRNA transcripts present in the cell to be quantified. Since its inception, ribo-seq has been carried out in a number of eukaryotic and prokaryotic organisms. Owing to the increasing interest in ribo-seq, there is a pertinent demand for a dedicated ribo-seq genome browser. GWIPS-viz is based on the University of California, Santa Cruz (UCSC) Genome Browser. Ribo-seq tracks, coupled with mRNA-seq tracks, are currently available for several genomes: human, mouse, zebrafish, nematode, yeast, bacteria (Escherichia coli K12, Bacillus subtilis), human cytomegalovirus and bacteriophage lambda. Our objective is to continue incorporating published ribo-seq data sets so that the wider community can readily view ribosome profiling information from multiple studies without the need to carry out computational processing.