15,359 research outputs found

Motif-Based Protein Sequence Classification Using Neural Networks

Author: Aristidis Likas
Bailey T.L.
Br
Dempster A.P.
Dimitrios I. Fotiadis
Foresse F.D.
Horn F.
Hughey R.
Konstantinos Blekas
Ma Q.
Wu C.H.
Publication venue: 'Mary Ann Liebert Inc'

DeepSig: Deep learning improves signal peptide detection in proteins

Author: Abadi
Alfonso Valencia
Alipanahi
Bach
Bagos
Berks
Castrense Savojardo
Chollet
Fariselli
Indio
Krizhevsky
Käll
LeCun
Martoglio
Montavon
Nugent
Petersen
Pier Luigi Martelli
Piero Fariselli
Reynolds
Rita Casadio
Savojardo
Simonyan
Szegedy
Tsirigos
Viklund
von Heijne
Zhou
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2018

Motivation: The identification of signal peptides in protein sequences is an important step toward protein localization and function characterization. Results: Here, we present DeepSig, an improved approach for signal peptide detection and cleavage-site prediction based on deep learning methods. Comparative benchmarks performed on an updated independent dataset of proteins show that DeepSig is the current best performing method, scoring better than other available state-of-the-art approaches on both signal peptide detection and precise cleavage-site identification. Availability and implementation: DeepSig is available as both a standalone program and a web server at https://deepsig.biocomp.unibo.it. All datasets used in this study can be obtained from the same website.

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio istituzionale della ricerca - Università di Padova

Institutional Research Information System University of Turin

Convolutional LSTM Networks for Subcellular Localization of Proteins

Author: A Graves
A Höglund
A Prlić
C Magnan
G Dahl
HY Xiong
LJP Maaten Van Der
M Schuster
MCF Thomsen
O Emanuelsson
P Baldi
P Lena Di
S Briesemeister
S Henikoff
S Hochreiter
SF Altschul
T Blum
T Goldberg
T Petersen
Y Bengio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used, although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short-term memory (LSTM) model, on the other hand, are designed to handle sequences. In this study we demonstrate that LSTM networks predict the subcellular location of proteins, given only the protein sequence, with high accuracy (0.902), outperforming current state-of-the-art algorithms. We further improve the performance by introducing convolutional filters and experiment with an attention mechanism which lets the LSTM focus on specific parts of the protein. Lastly, we introduce new visualizations of both the convolutional filters and the attention mechanisms and show how they can be used to extract biologically relevant knowledge from the LSTM networks.
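To illustrate how an attention mechanism can reduce a variable-length sequence to a fixed-length summary while "focusing" on particular positions, here is a minimal sketch in plain Python. It is not the authors' model: the function names, the dot-product scoring rule, and the `query` vector (standing in for learned parameters) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-residue vectors into one fixed-length vector.

    hidden_states: list of equal-length float lists, one per residue
                   (hypothetical stand-ins for recurrent-layer outputs).
    query:         hypothetical learned scoring vector of the same length.
    Returns (context, weights): the weighted average of the hidden states
    and the per-position attention weights, which sum to 1.
    """
    # Score each position by its dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    # Weighted average across positions, one output entry per feature dimension.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Sequences of different lengths yield context vectors of the same size.
short_seq = [[0.0, 1.0], [2.0, 0.0], [0.5, 0.5]]
long_seq = short_seq + [[1.0, 1.0], [0.0, 0.0]]
query = [1.0, 0.0]
short_context, short_weights = attention_pool(short_seq, query)
long_context, long_weights = attention_pool(long_seq, query)
```

The returned weights are what an attention visualization of this kind would plot along the sequence: positions with larger weights contribute more to the summary vector.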

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Online Research Database In Technology

Recommended from our members

TITER: predicting translation initiation sites by deep learning.

Author: Hu Hailin
Jiang Tao
Zeng Jianyang
Zhang Lei
Zhang Sai
Publication venue: eScholarship, University of California
Publication date: 01/07/2017

Motivation: Translation initiation is a key step in the regulation of gene expression. In addition to the annotated translation initiation sites (TISs), the translation process may also start at multiple alternative TISs (including both AUG and non-AUG codons), which makes it challenging to predict TISs and study the underlying regulatory mechanisms. Meanwhile, the advent of several high-throughput sequencing techniques for profiling initiating ribosomes at single-nucleotide resolution, e.g. GTI-seq and QTI-seq, provides abundant data for systematically studying the general principles of translation initiation and the development of computational methods for TIS identification. Methods: We have developed a deep learning-based framework, named TITER, for accurately predicting TISs on a genome-wide scale based on QTI-seq data. TITER extracts the sequence features of translation initiation from the surrounding sequence contexts of TISs using a hybrid neural network and further integrates the prior preference of TIS codon composition into a unified prediction framework. Results: Extensive tests demonstrated that TITER can greatly outperform the state-of-the-art prediction methods in identifying TISs. In addition, TITER was able to identify important sequence signatures for individual types of TIS codons, including a Kozak-sequence-like motif for the AUG start codon. Furthermore, the TITER prediction score can be related to the strength of translation initiation in various biological scenarios, including the repressive effect of the upstream open reading frames on gene expression and the mutational effects influencing translation initiation efficiency. Availability and implementation: TITER is available as open-source software and can be downloaded from https://github.com/zhangsaithu/titer. Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
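As a rough illustration of what integrating a codon-composition prior into a prediction score might look like, the sketch below scales a network output by a per-codon prior weight. The multiplicative combination, the function name, and the prior values are assumptions for illustration only; the abstract does not specify TITER's actual formulation, and the numbers are placeholders rather than measured frequencies.

```python
def combine_with_codon_prior(network_score, codon, codon_prior, default_prior=0.0):
    """Scale a model score by a prior weight for the candidate start codon.

    network_score: float in [0, 1], a hypothetical neural-network output.
    codon:         three-letter codon, DNA or RNA alphabet, any case.
    codon_prior:   dict mapping RNA codons (e.g. "AUG") to prior weights.
    default_prior: weight used for codons absent from codon_prior.
    """
    # Normalize to upper-case RNA so "atg" and "AUG" look up the same entry.
    key = codon.upper().replace("T", "U")
    return network_score * codon_prior.get(key, default_prior)


# Placeholder priors, NOT values from the paper or from any dataset.
example_prior = {"AUG": 1.0, "CUG": 0.5}

aug_score = combine_with_codon_prior(0.8, "ATG", example_prior)
cug_score = combine_with_codon_prior(0.8, "cug", example_prior)
unseen_score = combine_with_codon_prior(0.8, "GGG", example_prior)
```

With a combination of this kind, two candidate sites with the same sequence-context score are ranked by how often their codon type initiates translation, and codons with no prior support are suppressed entirely.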

eScholarship - University of California