Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences

Author: Atkins John F.
Baranov Pavel V.
Firth Andrew E.
Ivanov Ivaylo P.
Michel Audrey M.
Publication venue: Oxford University Press
Publication date: 01/01/2010

In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5′ cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized—both for increased coding capacity and potentially also for novel regulatory mechanisms—remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5′ untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data
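The set of codons "differing from AUG by a single nucleotide" mentioned in the abstract can be enumerated directly. The following is a minimal illustrative sketch, not code from the paper; the function name `near_cognate_codons` is hypothetical.

```python
from itertools import product


def near_cognate_codons(start="AUG", alphabet="ACGU"):
    """Return every codon that differs from `start` at exactly one position."""
    variants = []
    for pos, base in product(range(len(start)), alphabet):
        if base != start[pos]:
            # Substitute a single nucleotide at position `pos`.
            variants.append(start[:pos] + base + start[pos + 1:])
    return variants


codons = near_cognate_codons()
# The six codons singled out in the abstract are a subset of the nine variants.
highlighted = {"CUG", "UUG", "GUG", "ACG", "AUA", "AUU"}
others = sorted(set(codons) - highlighted)
print(codons)
print(others)
```

Three positions with three alternative bases each give nine single-nucleotide variants; the abstract names six of them as the most common non-AUG initiators.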

PausePred and Rfeet: webtools for inferring ribosome pauses and visualizing footprint density from ribosome profiling data

Author: Baranov Pavel V.
Kumari Romika
Michel Audrey M.
Publication venue: Cold Spring Harbor Laboratory
Publication date: 04/03/2022

The process of translation is characterized by irregularities in the local decoding rates of specific mRNA codons. This includes the occurrences of long pauses that can take place when ribosomes decode certain peptide sequences, encounter strong RNA secondary structures, or decode "hungry" codons. Examples are known where such pausing or stalling is used for regulating protein synthesis. This can be achieved at the level of translation via direct alteration of ribosome progression through mRNA or by altering mRNA stability via NoGo decay. Ribosome pausing has also been implicated in the cotranslational folding of proteins. Ribosome profiling data are often used for inferring the locations of ribosome pauses. However, no dedicated online software is available for this purpose. Here we present PausePred (https://pausepred.ucc.ie/), which can be used to infer ribosome pauses from ribosome profiling (Ribo-seq) data. Peaks of ribosome footprint density are scored based on their magnitude relative to the background density within the surrounding area. The scoring allows the comparison of peaks across the transcriptome or genome. In addition to the score, PausePred reports the coordinates of the pause, the footprint density at the pause site, and the surrounding nucleotide sequence. The pauses can be visualized in the context of Ribo-seq and RNA-seq density plots generated for specific transcripts or genomic regions with the Rfeet tool. PausePred does not require input on the location of protein coding ORFs (although gene annotations can be optionally supplied). As a result, it can be used universally and its output does not depend on ever evolving annotations
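The idea of scoring a footprint peak by its magnitude relative to the surrounding background can be sketched as below. This is a simplified illustration under stated assumptions, not PausePred's actual algorithm: the function name `pause_scores`, the `flank` parameter, and the choice of a mean-based background that excludes the scored position are all hypothetical.

```python
def pause_scores(density, flank=5):
    """Score each position as its footprint count divided by the mean count
    of up to `flank` neighbouring positions on each side.

    Assumption: the position itself is excluded from the background, and a
    position with zero background receives a score of 0.0.
    """
    n = len(density)
    scores = []
    for i, count in enumerate(density):
        lo, hi = max(0, i - flank), min(n, i + flank + 1)
        neighbours = density[lo:i] + density[i + 1:hi]
        background = sum(neighbours) / len(neighbours) if neighbours else 0.0
        scores.append(count / background if background > 0 else 0.0)
    return scores


# Toy footprint counts along a transcript, with one strong peak.
scores = pause_scores([2, 2, 2, 2, 2, 20, 2, 2, 2, 2, 2])
peak = max(range(len(scores)), key=scores.__getitem__)
print(peak, scores[peak])
```

Because the score is a ratio to local background rather than a raw count, peaks from transcripts with very different expression levels become comparable, which is the property the abstract highlights.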

GWIPS-viz: 2018 update

Author: Baranov Pavel V.
Kiniry Stephen J.
Michel Audrey M.
Mullan James P.
O'Connor Patrick B. F.
Publication venue: Oxford University Press (OUP)
Publication date: 05/09/2017

The GWIPS-viz browser (http://gwips.ucc.ie/) is an online genome browser tailored for exploring ribosome profiling (Ribo-seq) data. Since its publication in 2014, GWIPS-viz has added Ribo-seq data for an additional 14 genomes, bringing the current total to 23. The integration of new Ribo-seq data has been automated, thereby increasing the number of available tracks to 1792, a 10-fold increase in the last three years. The increase is particularly substantial for data derived from human sources. Following user requests, we added the functionality to download these tracks in bigWig format. We also incorporated new types of data (e.g. TCP-seq) as well as auxiliary tracks from other sources that help with the interpretation of Ribo-seq data. Improvements in the visualization of the data have been carried out, particularly for bacterial genomes, where the Ribo-seq data are now shown in a strand-specific manner. For higher eukaryotic datasets, we provide characteristics of individual datasets using the RUST program, which include the triplet periodicity, sequencing biases and relative inferred A-site dwell times. This information can be used for assessing the quality of Ribo-seq datasets. To improve the power of the signal, we aggregate Ribo-seq data from several studies into Global aggregate tracks for each genome

Computational methods for ribosome profiling data analysis

Author: Baranov Pavel V.
Kiniry Stephen J.
Michel Audrey M.
Publication venue: Wiley
Publication date: 24/11/2019

Since the introduction of the ribosome profiling technique in 2009 its popularity has greatly increased. It is widely used for the comprehensive assessment of gene expression and for studying the mechanisms of regulation at the translational level. As the number of ribosome profiling datasets being produced continues to grow, so too does the need for reliable software that can provide answers to the biological questions it can address. This review describes the computational methods and tools that have been developed to analyze ribosome profiling data at the different stages of the process. It starts with initial routine processing of raw data and follows with more specific tasks such as the identification of translated open reading frames, differential gene expression analysis, or evaluation of local or global codon decoding rates. The review pinpoints challenges associated with each step and explains the ways in which they are currently addressed. In addition it provides a comprehensive, albeit incomplete, list of publicly available software applicable to each step, which may be a beneficial starting point to those unexposed to ribosome profiling analysis. The outline of current challenges in ribosome profiling data analysis may inspire computational biologists to search for novel, potentially superior, solutions that will improve and expand the bioinformatician's toolbox for ribosome profiling data analysis

The GWIPS-viz browser

Author: Baranov Pavel V.
Kiniry Stephen J.
Michel Audrey M.
Publication venue: Wiley
Publication date: 16/05/2018

GWIPS-viz is a publicly available browser that provides Genome Wide Information on Protein Synthesis through the visualization of ribosome profiling data. Ribosome profiling (Ribo-seq) is a high-throughput technique which isolates fragments of messenger RNA that are protected by the ribosome. The alignment of the ribosome-protected fragments or footprint sequences to the corresponding reference genome and their visualization using GWIPS-viz allows for unique insights into the genome loci that are expressed as potentially translated RNA. The GWIPS-viz browser hosts both Ribo-seq data and corresponding mRNA-seq data from publicly available studies across a number of genomes, avoiding the need for computational processing on the user side. Since its initial publication in 2014, over 1885 tracks have been produced across 24 genomes. This unit describes the navigation of the GWIPS-viz genome browser, the uploading of custom tracks, and the downloading of the Ribo-seq/mRNA-seq alignment data

Life tables for global surveillance of cancer survival (the CONCORD programme): data sources and methods

Author: Allemani Claudia
Bannon Finian
Bonaventure Audrey
Carreira Helena
Coleman Michel P
Harewood Rhea
Spika Devon
Woods Laura M
Publication venue: Figshare
Publication date: 01/01/2017

We set out to estimate net survival trends for 10 common cancers in 279 cancer registry populations in 67 countries around the world, as part of the CONCORD-2 study. Net survival can be interpreted as the proportion of cancer patients who survive up to a given time, after eliminating the impact of mortality from other causes (background mortality). Background mortality varies widely between populations and over time. It was therefore necessary to construct robust life tables that accurately reflected the background mortality in each of the registry populations. Life tables of all-cause mortality rates by single year of age and sex were constructed by calendar year for each population and, when possible, by racial or ethnic sub-groups. We used three different approaches, based on the type of mortality data available from each registry. With death and population counts, we adopted a flexible multivariable modelling approach. With unsmoothed mortality rates, we used the Ewbank relational method. Where no data were available from the registry or a national statistical office, we used the abridged UN Population Division life tables and interpolated these using the Elandt-Johnson method. We also investigated the impact of using state- and race-specific life tables versus national race-specific life tables on estimates of net survival from four adult cancers in the United States (US)

Trips-Viz: an environment for the analysis of public and user-generated ribosome profiling data.

Author: Baranov Pavel V
Judge Ciara E
Kiniry Stephen J
Michel Audrey M
Publication venue: Oxford University Press (OUP)
Publication date: 02/07/2021

Trips-Viz (https://trips.ucc.ie/) is an interactive platform for the analysis and visualization of ribosome profiling (Ribo-Seq) and shotgun RNA sequencing (RNA-seq) data. This includes publicly available and user-generated data, hence Trips-Viz can be classified as a database and as a server. As a database, it provides access to many processed Ribo-Seq and RNA-seq datasets aligned to reference transcriptomes, a collection that has been expanded considerably since its inception. Here, we focus on the server functionality of Trips-Viz, which has also been greatly improved. Trips-Viz now enables visualisation of proteomics data from a large number of processed mass spectrometry datasets. It can be used to support translation inferred from Ribo-Seq data. Users are now able to upload a custom reference transcriptome as well as data types other than Ribo-Seq/RNA-Seq. Incorporating custom data has been streamlined with RiboGalaxy (https://ribogalaxy.ucc.ie/) integration. The other new functionality is the rapid detection of translated open reading frames (ORFs) through a simple, easy-to-use interface. The analysis of differential expression has also been improved via integration of DESeq2 and Anota2seq, in addition to a number of other improvements of existing Trips-Viz features

GWIPS-viz: development of a ribo-seq genome browser

Author: Baranov Pavel V.
De Bo Christof
Donohue Claire A.
Fox Gearoid
Heaphy Stephen M.
Higgins Desmond G.
Kiran Anmol M.
Michel Audrey M.
Mullan James P. A.
O'Connor Patrick B. F.
Publication venue: Oxford University Press (OUP)
Publication date: 31/10/2013

We describe the development of GWIPS-viz (http://gwips.ucc.ie), an online genome browser for viewing ribosome profiling data. Ribosome profiling (ribo-seq) is a recently developed technique that provides genome-wide information on protein synthesis (GWIPS) in vivo. It is based on the deep sequencing of ribosome-protected messenger RNA (mRNA) fragments, which allows the ribosome density along all mRNA transcripts present in the cell to be quantified. Since its inception, ribo-seq has been carried out in a number of eukaryotic and prokaryotic organisms. Owing to the increasing interest in ribo-seq, there is a pertinent demand for a dedicated ribo-seq genome browser. GWIPS-viz is based on The University of California Santa Cruz (UCSC) Genome Browser. Ribo-seq tracks, coupled with mRNA-seq tracks, are currently available for several genomes: human, mouse, zebrafish, nematode, yeast, bacteria (Escherichia coli K12, Bacillus subtilis), human cytomegalovirus and bacteriophage lambda. Our objective is to continue incorporating published ribo-seq data sets so that the wider community can readily view ribosome profiling information from multiple studies without the need to carry out computational processing
