TITER: predicting translation initiation sites by deep learning.

Author: Hu Hailin
Jiang Tao
Zeng Jianyang
Zhang Lei
Zhang Sai
Publication venue: eScholarship, University of California
Publication date: 01/07/2017

Motivation: Translation initiation is a key step in the regulation of gene expression. In addition to the annotated translation initiation sites (TISs), translation may also start at multiple alternative TISs (including both AUG and non-AUG codons), which makes it challenging to predict TISs and study the underlying regulatory mechanisms. Meanwhile, the advent of several high-throughput sequencing techniques for profiling initiating ribosomes at single-nucleotide resolution, e.g. GTI-seq and QTI-seq, provides abundant data for systematically studying the general principles of translation initiation and for developing computational methods for TIS identification. Methods: We have developed a deep learning-based framework, named TITER, for accurately predicting TISs on a genome-wide scale based on QTI-seq data. TITER extracts the sequence features of translation initiation from the surrounding sequence contexts of TISs using a hybrid neural network and further integrates the prior preference of TIS codon composition into a unified prediction framework. Results: Extensive tests demonstrated that TITER can greatly outperform the state-of-the-art prediction methods in identifying TISs. In addition, TITER was able to identify important sequence signatures for individual types of TIS codons, including a Kozak-sequence-like motif for the AUG start codon. Furthermore, the TITER prediction score can be related to the strength of translation initiation in various biological scenarios, including the repressive effect of upstream open reading frames on gene expression and the mutational effects influencing translation initiation efficiency. Availability and implementation: TITER is available as open-source software and can be downloaded from https://github.com/zhangsaithu/titer. Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

eScholarship - University of California

Translation initiation site prediction on a genomic scale : beauty in simplicity

Author: Borodovsky
Delcher
Fickett
Hatzigeorgiou
Kozak
Li
Liu
Nishikawa
Pedersen
Salamov
Salzberg
Sven Degroeve
Thomas Abeel
Tiwari
Wang
Yvan Saeys
Yves Van de Peer
Zeng
Zien
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2007

Motivation: The correct identification of translation initiation sites (TIS) remains a challenging problem for automated computational methods. Furthermore, the lion's share of these computational techniques focuses on the identification of TIS in transcript data. However, in the gene prediction context the identification of TIS occurs at the genomic level, which makes the task even harder: many more pseudo-TIS occur at the genome level, so models produce more false positive predictions. Results: In this article, we evaluate the performance of several 'simple' TIS recognition methods at the genomic level, and compare them to state-of-the-art models for TIS prediction in transcript data. We conclude that the simple methods largely outperform the complex ones at the genomic scale, and we propose a new model for TIS recognition at the genome level that combines the strengths of these simple models. The new model obtains a false positive rate of 0.125 at a sensitivity of 0.80 on a well-annotated human chromosome (chromosome 21). Detailed analyses show that the model is useful, both on its own and in a simple gene prediction setting.
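The abstract above reports a false positive rate at a fixed sensitivity. As a rough illustration of how such an operating point can be read off a set of scored candidate sites, here is a minimal sketch. The function name `fpr_at_sensitivity`, the toy scores, and the convention that a site is called a TIS when its score is at or above the threshold are all assumptions for illustration, not details taken from the paper.

```python
def fpr_at_sensitivity(scores, labels, target_sensitivity):
    """Return (threshold, false_positive_rate) at the strictest threshold
    whose sensitivity reaches target_sensitivity.

    scores: classifier scores, higher means "more TIS-like" (assumed convention)
    labels: 1 for a true TIS, 0 for a pseudo-TIS
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    # Walk thresholds from the highest score downwards; a candidate is
    # predicted positive when its score is >= the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        if tp / positives >= target_sensitivity:
            return threshold, fp / negatives
    raise ValueError("target sensitivity must not exceed 1.0")


# Toy example (made-up numbers): five true sites and five pseudo-sites.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
print(fpr_at_sensitivity(toy_scores, toy_labels, 0.80))
```

On a genomic scale the negatives vastly outnumber the positives, so even a small false positive rate can correspond to many spurious predictions.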

Crossref

Ghent University Academic Bibliography

AUG_hairpin: prediction of a downstream secondary structure influencing the recognition of a translation start site

Author: Akinori Sarai
Alex V Kochetov
Andrey Palyanov
AV Kochetov
AV Pisarev
C Touriol
Dmitry Grigorovich
HS Kwon
I Ventoso
IB Rogozin
Igor I Titov
IL Hofacker
IM Meyer
JL Riechmann
JS McCaskill
K Clyde
K Takahashi
K-N Zhao
L Yang
M Ciullo
M Kozak
M Lukaszewicz
M Nguyen
Nikolay A Kolchanov
RJ Jackson
SA Shabalina
SD Baird
SV Sawant
W-L Hwang
Y Kobayashi
Publication venue: BioMed Central
Publication date: 01/08/2007

Background: The translation start site plays an important role in the control of translation efficiency of eukaryotic mRNAs. The recognition of the start AUG codon by eukaryotic ribosomes is considered to depend on its nucleotide context. However, the fraction of eukaryotic mRNAs with the start codon in a suboptimal context is relatively large. It may be expected that mRNA should possess some features providing efficient translation, including the proper recognition of a translation start site. It has been experimentally shown that a downstream hairpin located in certain positions with respect to the start codon can compensate in part for the suboptimal AUG context and also increases translation from non-AUG initiation codons. Prediction of such a compensatory hairpin may be useful in the evaluation of eukaryotic mRNA translation properties. Results: We evaluated interdependency between the start codon context and mRNA secondary structure at the CDS beginning: it was found that a suboptimal start codon context significantly correlated with higher base pairing probabilities at positions 13–17 of the CDS of human and mouse mRNAs. It is likely that the downstream hairpins are used to enhance translation of some mammalian mRNAs in vivo. Thus, we have developed a tool, AUG_hairpin, to predict local stem-loop structures located within the defined region at the beginning of the mRNA coding part. The implemented algorithm is based on the available published experimental data on the CDS-located stem-loop structures influencing the recognition of upstream start codons. Conclusion: An occurrence of a potential secondary structure downstream of a start AUG codon in a suboptimal context (or downstream of a potential non-AUG start codon) may provide researchers with a testable assumption on the presence of an additional regulatory signal influencing mRNA translation initiation rate and the start codon choice. AUG_hairpin, which has a convenient Web interface with adjustable parameters, will make such an evaluation easy and efficient.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Evolution and Mechanics of Translational Control in Plants

Author: Vaughn Justin N.
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2011

The expression of numerous plant mRNAs is attenuated by RNA sequence elements located in the 5' and 3' untranslated regions (UTRs). For example, in plants and many higher eukaryotes, roughly 35% of genes encode mRNAs that contain one or more upstream open reading frames (uORFs) in the 5' UTR. For this dissertation I have analyzed the pattern of conservation of such mRNA sequence elements. In the first set of studies, I have taken a comparative transcriptomics approach to address which RNA sequence elements are conserved between various families of angiosperm plants. Such conservation indicates an element's fundamental importance to plant biology, points to pathways for which it is most vital, and suggests the mechanism by which it acts. Conserved motifs were detected in 3% of genes. These include di-purine repeat motifs, uORF-associated motifs, putative binding sites for PUMILIO-like RNA binding proteins, small RNA targets, and a wide range of other sequence motifs. Due to the scanning process that precedes translation initiation, uORFs are often translated, thereby repressing initiation at an mRNA's main ORF. As one might predict, I found a clear bias against the AUG start codon within the 5' untranslated region (5' UTR) among all plants examined. Further supporting this finding, comparative analysis indicates that, for ~42% of genes, AUGs and their resultant uORFs reduce carrier fitness. Interestingly, for at least 5% of genes, uORFs are not only tolerated, but enriched. The remaining uORFs appear to be neutral. Because of their tangible impact on plant biology, it is critical to differentiate how uORFs affect translation and how, in many cases, their inhibitory effects are neutralized. In pursuit of this aim, I developed a computational model of the initiation process that uses five parameters to account for uORF presence. In vivo translation efficiency data from uORF-containing reporter constructs were used to estimate the model's parameters in wild type Arabidopsis. In addition, the model was applied to identify salient defects associated with a mutation in the subunit h of eukaryotic initiation factor 3 (eIF3h). The model indicates that eIF3h, by supporting re-initiation during uORF elongation, facilitates uORF tolerance.
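To make the notion of a uORF concrete, the following minimal sketch lists AUG-initiated open reading frames that begin upstream of the main start codon and end at an in-frame stop codon. It is only an illustration under stated assumptions: the function name `find_uorfs`, the 0-based coordinates, and the toy transcript are hypothetical, and this is not the five-parameter initiation model described in the dissertation.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, main_start):
    """Return (start, end) pairs for uORFs in a transcript.

    transcript: mRNA or cDNA sequence (U is treated as T)
    main_start: 0-based index of the A in the main ORF's start codon
    end is the index just past the uORF's stop codon.
    """
    seq = transcript.upper().replace("U", "T")
    uorfs = []
    for start in range(main_start):
        # Only AUG start codons located upstream of the main start are considered.
        if seq[start:start + 3] != "ATG":
            continue
        # Read codon by codon, in frame with this AUG, until the first stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


# Toy transcript (invented): a short uORF "AUG AAA UAG" precedes the main AUG at index 16.
print(find_uorfs("GGAUGAAAUAGCCACCAUGGCUUAA", 16))
```

An upstream AUG with no in-frame stop codon in the transcript is skipped here; whether to report such cases is a design choice that depends on the analysis.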

University of Tennessee, Knoxville: Trace

Gcn4p and novel upstream activating sequences regulate targets of the unfolded protein response.

Author: Li Hao
Patil Christopher K
Walter Peter
Publication venue: eScholarship, University of California
Publication date: 01/08/2004

Eukaryotic cells respond to accumulation of unfolded proteins in the endoplasmic reticulum (ER) by activating the unfolded protein response (UPR), a signal transduction pathway that communicates between the ER and the nucleus. In yeast, a large set of UPR target genes has been experimentally determined, but the previously characterized unfolded protein response element (UPRE), an upstream activating sequence (UAS) found in the promoter of the UPR target gene KAR2, cannot account for the transcriptional regulation of most genes in this set. To address this puzzle, we analyzed the promoters of UPR target genes computationally, identifying as candidate UASs short sequences that are statistically overrepresented. We tested the most promising of these candidate UASs for biological activity, and identified two novel UPREs, which are necessary and sufficient for UPR activation of promoters. A genetic screen for activators of the novel motifs revealed that the transcription factor Gcn4p plays an essential and previously unrecognized role in the UPR: Gcn4p and its activator Gcn2p are required for induction of a majority of UPR target genes during ER stress. Both Hac1p and Gcn4p bind target gene promoters to stimulate transcriptional induction. Regulation of Gcn4p levels in response to changing physiological conditions may function as an additional means to modulate the UPR. The discovery of a role for Gcn4p in the yeast UPR reveals an additional level of complexity and demonstrates a surprising conservation of the signaling circuit between yeast and metazoan cells

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

The mRNA-bound proteome of the human malaria parasite Plasmodium falciparum.

Author: Batugedara Gayani
Bunnik Evelien M
Florens Laurence
Le Roch Karine G
Prudhomme Jacques
Saraf Anita
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Background: Gene expression is controlled at multiple levels, including transcription, stability, translation, and degradation. Over the years, it has become apparent that Plasmodium falciparum exerts limited transcriptional control of gene expression, while at least part of Plasmodium's genome is controlled by post-transcriptional mechanisms. To generate insights into the mechanisms that regulate gene expression at the post-transcriptional level, we undertook complementary computational, comparative genomics, and experimental approaches to identify and characterize mRNA-binding proteins (mRBPs) in P. falciparum. Results: Close to 1000 RNA-binding proteins are identified by hidden Markov model searches, of which mRBPs encompass a relatively large proportion of the parasite proteome as compared to other eukaryotes. Several abundant mRNA-binding domains are enriched in apicomplexan parasites, while strong depletion of mRNA-binding domains involved in RNA degradation is observed. Next, we experimentally capture 199 proteins that interact with mRNA during the blood stages, 64 of which with high confidence. These captured mRBPs show a significant overlap with the in silico identified candidate RBPs (p < 0.0001). Among the experimentally validated mRBPs are many known translational regulators active in other stages of the parasite's life cycle, such as DOZI, CITH, PfCELF2, Musashi, and PfAlba1-4. Finally, we also detect several proteins with an RNA-binding domain abundant in Apicomplexans (RAP domain) that is almost exclusively found in apicomplexan parasites. Conclusions: Collectively, our results provide the most complete comparative genomics and experimental analysis of mRBPs in P. falciparum. A better understanding of these regulatory proteins will not only give insight into the intricate parasite life cycle but may also provide targets for novel therapeutic strategies.

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Unsupervised and semi-supervised training methods for eukaryotic gene prediction

Author: Ter-Hovhannisyan Vardges
Publication venue: Georgia Institute of Technology
Publication date: 17/11/2008

This thesis describes new gene finding methods for eukaryotic gene prediction. The current methods for deriving model parameters for gene prediction algorithms are based on curated or experimentally validated sets of genes or gene elements. These training sets often require time and additional expert effort, especially for species that are in the initial stages of genome sequencing. Unsupervised training allows determination of model parameters from anonymous genomic sequence with. The practical applicability of unsupervised training is critical given the ever-growing rate of eukaryotic genome sequencing. Three distinct training procedures are developed for diverse groups of eukaryotic species. GeneMark-ES is developed for species with strong donor and acceptor site signals such as Arabidopsis thaliana, Caenorhabditis elegans and Drosophila melanogaster. The second version of the algorithm, GeneMark-ES-2, introduces an enhanced intron model to better describe the gene structure of fungal species that possess relatively weak donor and acceptor splice sites and a well-conserved branch point signal. GeneMark-LE, a semi-supervised training approach, is designed for eukaryotic species with a small number of introns. The results indicate that the developed unsupervised training methods perform well as compared to other training methods and as estimated from the set of genes supported by EST-to-genome alignments. Analysis of novel genomes reveals interesting biological findings and shows that several candidate under-annotated and over-annotated fungal species are present in the current set of annotated fungal genomes.
Ph.D.
Committee Chair: Mark Borodovsky; Committee Member: Jung H. Choi; Committee Member: King Jordan; Committee Member: Leonid Bunimovich; Committee Member: Yury Chernof

Scholarly Materials And Research @ Georgia Tech

MetWAMer: eukaryotic translation initiation site prediction

Author: A Delcher
A Hatzigeorgiou
A Nadershahi
A Pedersen
A Prats
A Rakotondrafara
A Sachs
A Salamov
A Zien
C Bishop
C Iseli
C Lottaz
C Mathé
D Abramczyk
D Cavener
E Birney
G Crooks
G Gremme
G Li
G Stormo
H Li
H Liu
J Allen
J Crow
L Balvay
L Xing
M de Hoon
M Hirosawa
M Kozak
M Medveczky
M Sparks
M Stanke
M Tech
Michael E Sparks
Q Dong
S Altschul
S Hebsgaard
S Russell
S Salzberg
T Berardini
T Mitchell
T Nishikawa
T Preiss
T Schiex
T Schneider
T Sing
V Brendel
Volker Brendel
Y Saeys
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Translation initiation site (TIS) identification is an important aspect of the gene annotation process, requisite for the accurate delineation of protein sequences from transcript data. We have developed the MetWAMer package for TIS prediction in eukaryotic open reading frames of non-viral origin. MetWAMer can be used as a stand-alone, third-party tool for post-processing gene structure annotations generated by external computational programs and/or pipelines, or directly integrated into gene structure prediction software implementations. Results: MetWAMer currently implements five distinct methods for TIS prediction, the most accurate of which is a routine that combines weighted, signal-based translation initiation site scores and the contrast in coding potential of sequences flanking TISs using a perceptron. Also, our program implements clustering capabilities through use of the k-medoids algorithm, thereby enabling cluster-specific TIS parameter utilization. In practice, our static weight array matrix-based indexing method for parameter set lookup can be used with good results in data sets exhibiting moderate levels of 5'-complete coverage. Conclusion: We demonstrate that improvements in statistically-based models for TIS prediction can be achieved by taking the class of each potential start-methionine into account pending certain testing conditions, and that our perceptron-based model is suitable for the TIS identification task. MetWAMer represents a well-documented, extensible, and freely available software system that can be readily re-trained for differing target applications and/or extended with existing and novel TIS prediction methods, to support further research efforts in this area.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Ribosomes from Trypanosomatids: Unique Structural and Functional Properties

Author: Hoebeke Johan
J. Lapadula Walter
Juri Ayub Maximiliano
Smulski Cristian R.
Publication venue: 'IntechOpen'
Publication date: 01/01/2012

Trypanosomatids are a monophyletic group of protozoa that diverged early from the eukaryotic lineage, constituting valuable model organisms for studying variability in different highly conserved processes including protein synthesis. Moreover, several species of trypanosomatids are causative agents of endemic diseases in the third world. There is considerable evidence suggesting that translation in these organisms shows important differences from that of model organisms such as yeast and mammals. These unique features, which have great potential relevance for both basic and applied research, will be discussed in this chapter.
Affiliation: Juri Ayub, Maximiliano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto Multidisciplinario de Investigaciones Biológicas de San Luis. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Instituto Multidisciplinario de Investigaciones Biológicas de San Luis; Argentina
Affiliation: Lapadula, Walter Jesús. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis. Instituto Multidisciplinario de Investigaciones Biológicas de San Luis. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Instituto Multidisciplinario de Investigaciones Biológicas de San Luis; Argentina
Affiliation: Hoebeke, Johan. Centre National de la Recherche Scientifique; France
Affiliation: Smulski, Cristian Roberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones en Ingeniería Genética y Biología Molecular "Dr. Héctor N. Torres"; Argentina. Centre National de la Recherche Scientifique; Franci

IntechOpen

CONICET Digital