
3,935 research outputs found

TRAPID : an efficient online tool for the functional and comparative analysis of de novo RNA-Seq transcriptomes

Author: Deforce Dieter
Proost Sebastian
Van Bel Michiel
Van de Peer Yves
Van Neste Christophe
Vandepoele Klaas
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Transcriptome analysis through next-generation sequencing technologies allows the generation of detailed gene catalogs for non-model species, at the cost of new challenges with regard to computational requirements and bioinformatics expertise. Here, we present TRAPID, an online tool for the fast and efficient processing of assembled RNA-Seq transcriptome data, developed to mitigate these challenges. TRAPID offers high-throughput open reading frame detection and frameshift correction, and includes a functional, comparative and phylogenetic toolbox that makes use of 175 reference proteomes. Benchmarking and comparison against state-of-the-art transcript analysis tools reveal the efficiency and unique features of the TRAPID system.

Springer - Publisher Connector

UPSpace at the University of Pretoria

From Structure Prediction to Genomic Screens for Novel Non-Coding RNAs

Author: A Ben-Hur
AF Bompfünewerer
AM Khalil
AO Harmanci
AR Gruber
AV Uzilov
AX Wang
B Knudsen
B Lewis
BW Matthews
C Warden
C Workman
D Guarnieri
D Mathews
D Sankoff
DH Mathews
DH Turner
DK Chiu
E Bonnet
E Nudler
E Rivas
E Torarinsson
EP Nawrocki
ES Andersen
F Sleutels
GardnerJPP Daub
H Jia
I Holmes
IL Hofacker
Ivo L. Hofacker
J Felsenstein
J Gorodkin
Jan Gorodkin
JC Ellis
JG Underwood
JH Havgaard
JM Watts
JP McCutcheon
JS Mattick
JS Pedersen
JW Brown
K Doshi
K Okamura
K Reiche
KC Wang
KE Deigan
KM Weeks
L Redrup
M Georges
M Guttman
M Kertesz
M Lindow
M Xie
MB Gerstein
MC Tsai
Michael Levitt
MW Hentze
N Lau
P Anandam
P Clote
P Gardner
P Larsson
P Menzel
P Schattner
PG Hawkins
PN Seibel
PP Gardner
R Nussinov
RA Gupta
RD Dowell
RJ Klein
RM Kuhn
RR Gutell
S Eddy
S Griffiths-Jones
S Siebert
S Washietl
S Will
SE Seemann
SF Altschul
SR Eddy
T Gesell
T Hung
T Lowe
T Nagano
TF Consortium
TJ Macke
UA Ørom
V Kim
V Tripathi
W Deng
W Filipowicz
W Fontana
Y Park
Y Sakakibara
Z Weinberg
Z Yao
Publication venue: Public Library of Science
Publication date: 01/01/2011

Non-coding RNAs (ncRNAs) are receiving increasing attention, not only as an abundant class of genes but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods for folding and predicting RNA structure were developed early, with the aim of assisting functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of them are highly structured. Interestingly, a large part of the structure consists of regular Watson-Crick and GU wobble base pairs. This, together with the growing number of available genomes, has made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence, using RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure-preserving changes of base pairs has been efficient in improving accuracy, and today it constitutes a key component of genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures, in searches for novel ncRNA genes and regulatory RNA structures on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

Directory of Open Access Journals

Copenhagen University Research Information System

Kernel methods in genomics and computational biology

Author: Vert Jean-Philippe
Publication date: 17/10/2005

Support vector machines and kernel methods are increasingly popular in genomics and computational biology, owing to their good performance in real-world applications and a strong modularity that makes them suitable for a wide range of problems, from the classification of tumors to the automatic annotation of proteins. Their ability to work in high dimensions and to process non-vectorial data, and the natural framework they provide for integrating heterogeneous data, are particularly relevant to various problems arising in computational biology. In this chapter we survey some of the most prominent applications published so far, highlighting the particular developments in kernel methods triggered by problems in biology, and mention a few promising research directions likely to expand in the future.

arXiv.org e-Print Archive

Improving the Caenorhabditis elegans Genome Annotation Using Machine Learning

Author: Bernhard Schölkopf
Gunnar Rätsch
Hanh Witte
Jagan Srinivasan
Klaus-R Müller
Ralf-J Sommer
Sören Sonnenburg
The Caenorhabditis elegans sequencing consortium
Uwe Ohler
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

For modern biology, precise genome annotations are of prime importance, as they allow the accurate definition of genic regions. We employ state-of-the-art machine learning methods to assay and improve the accuracy of the genome annotation of the nematode Caenorhabditis elegans. The proposed machine learning system is trained to recognize exons and introns on the unspliced mRNA, utilizing recent advances in support vector machines and label sequence learning. In 87% (coding and untranslated regions) and 95% (coding regions only) of all genes tested in several out-of-sample evaluations, our method correctly identified all exons and introns. Notably, only 37% and 50%, respectively, of the presently unconfirmed genes in the C. elegans genome annotation agree with our predictions; we therefore hypothesize that a sizable fraction of those genes are not correctly annotated. A retrospective evaluation of the Wormbase WS120 annotation [1] of C. elegans reveals that splice form predictions on unconfirmed genes in WS120 are inaccurate in about 18% of the considered cases, while our predictions deviate from the truth in only 10%–13% of cases. We experimentally analyzed 20 controversial genes on which our system and the annotation disagree, confirming the superiority of our predictions. While our method correctly predicted 75% of those cases, the standard annotation was never completely correct. The accuracy of our system is further corroborated by a comparison with two other recently proposed systems that can be used for splice form prediction: SNAP and ExonHunter. We conclude that the genome annotation of C. elegans and other organisms can be greatly enhanced using modern machine learning technology.

CiteSeerX

Directory of Open Access Journals

Caltech Authors

GENOME INFORMATICS

Author: Birol I.
Cant J.
Champ M.
Publication date: 01/09/2010

Cold Spring Harbor Laboratory Institutional Repository