11,957 research outputs found

DeepSF: deep convolutional neural network for mapping protein sequences to folds

Author: Jie Hou
Badri Adhikari
Jianlin Cheng
Publication date: 03/06/2017

Motivation: Protein fold recognition is an important problem in structural bioinformatics. Almost all traditional fold recognition methods use sequence (homology) comparison to indirectly predict the fold of a target protein based on the fold of a template protein with known structure, which cannot explain the relationship between sequence and fold. Only a few methods have been developed to classify protein sequences directly, and methodological limitations restrict them to a small number of folds, so they are not generally useful in practice. Results: We develop a deep 1D-convolution neural network (DeepSF) to directly classify any protein sequence into one of 1195 known folds, which is useful for both fold recognition and the study of the sequence-structure relationship. Unlike traditional methods based on sequence alignment (comparison), our method automatically extracts fold-related features from a protein sequence of any length and maps it to the fold space. We train and test our method on datasets curated from SCOP1.75, yielding a classification accuracy of 80.4%. On the independent testing dataset curated from SCOP2.06, the classification accuracy is 77.0%. We compare our method with a top profile-profile alignment method, HHSearch, on hard template-based and template-free modeling targets of CASP9-12 in terms of fold recognition accuracy. The accuracy of our method is 14.5%-29.1% higher than HHSearch on template-free modeling targets and 4.5%-16.7% higher on hard template-based modeling targets for the top 1, 5, and 10 predicted folds. The hidden features extracted from sequence by our method are robust against sequence mutation, insertion, deletion and truncation, and can be used for other protein pattern recognition problems such as protein clustering, comparison and ranking. Comment: 28 pages, 13 figures
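The idea that a 1D convolutional network can turn a sequence of any length into a fixed-size representation can be illustrated with a minimal sketch. This is not the DeepSF architecture: the one-hot encoding, the filter width of 3, the eight random untrained filters, and the use of global max pooling are all assumptions chosen for illustration, and the names `one_hot`, `conv1d_relu`, `global_max_pool` and `embed` are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Encode a sequence as a list of 20-dimensional one-hot vectors."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in seq:
        row = [0.0] * len(AMINO_ACIDS)
        if aa in index:  # unknown residues stay all-zero
            row[index[aa]] = 1.0
        rows.append(row)
    return rows


def conv1d_relu(x, filters):
    """Valid 1D convolution along the sequence axis, followed by ReLU.

    x       -- list of per-residue feature vectors
    filters -- list of filters, each a list of `width` weight vectors
    Returns one activation track per filter; track length depends on len(x).
    """
    width = len(filters[0])
    channels = len(x[0])
    tracks = []
    for f in filters:
        track = []
        for start in range(len(x) - width + 1):
            s = sum(f[w][c] * x[start + w][c]
                    for w in range(width)
                    for c in range(channels))
            track.append(max(0.0, s))
        tracks.append(track)
    return tracks


def global_max_pool(tracks):
    """Collapse each variable-length track to a single number."""
    return [max(t) for t in tracks]


def embed(seq, filters):
    """Map a sequence (at least as long as the filter width) to a fixed-size vector."""
    return global_max_pool(conv1d_relu(one_hot(seq), filters))


# Assumed toy setup: 8 random filters of width 3 (untrained weights).
rng = random.Random(0)
filters = [[[rng.uniform(-1, 1) for _ in AMINO_ACIDS] for _ in range(3)]
           for _ in range(8)]

short = embed("MKTAYIAKQR", filters)
long_ = embed("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", filters)
print(len(short), len(long_))  # both vectors have one entry per filter
```

Because pooling runs over the whole sequence axis, both inputs yield a vector with one entry per filter, which a classifier layer could then map to fold labels.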

arXiv.org e-Print Archive

Crossref

University of Missouri, St. Louis

Abundance of intrinsic disorder in SV-IV, a multifunctional androgen-dependent protein secreted from rat seminal vesicle

Author: Ambrosone
Bairoch
Bornberg-Bauer
Bourhis
Cai
Caporale
Cheng
Coeytaux
Csizmók
Doszatányi
Dosztányi
Dunker
Dunker
Dyson
D’Ambrosio
Esposito
Ferron
Gaboriaud
Galzitskaya
Galzitskaya
Galzitskaya
Garbuzynskiy
Harris
Hirose
Ialenti
Kandala
Kyte
Li
Lin
Linding
Linding
Liu
Lupas
MacCallum
McDonald
Metafora
Metafora
Miele
Murzin
Obradovic
Obradovic
Ostrowski
Pan
Prilusky
Quevillon-Cheruel
Radivojac
Ragone
Romero
Romero
Rüping
Shimizu
Shimizu
Sickmeier
Stiuso
Stiuso
Tompa
Tufano
Uversky
Uversky
Vucetic
Ward
Weathers
Weathers
Wolf
Wootton
Wright
Yang
Publication date: 06/12/2007

The potent immunomodulatory, anti-inflammatory and procoagulant properties of the
protein no. 4 secreted from the rat seminal vesicle epithelium (SV-IV) have been
previously found to be modulated by a supramolecular monomer-trimer equilibrium.
More structural details that integrate experimental data into a predictive framework
have recently been reported. Unfortunately, homology modelling and fold-recognition
strategies were not successful in creating a theoretical model of the structural
organization of SV-IV. It was inferred that the global structure of SV-IV is not similar
to any protein of known three-dimensional structure. Reversing the classical approach
to the sequence-structure-function paradigm, in this paper we report on novel
information obtained by comparing physicochemical parameters of SV-IV with two
datasets made of intrinsically unfolded and ideally globular proteins. In addition, we
have analysed the SV-IV sequence by several publicly available disorder-oriented
predictors. Overall, disorder predictions and a re-examination of existing experimental
data strongly suggest that SV-IV needs large plasticity to efficiently interact with the
different targets that characterize its multifaceted biological function and should
therefore be better classified as an intrinsically disordered protein.

CiteSeerX

Crossref

Archivio della Ricerca - Università di Salerno

Nature Precedings

Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

Author: Li Zhen
Sun Siqi
Wang Sheng
Xu Jinbo
Zhang Renyu
Publication venue: Public Library of Science (PLoS)
Publication date: 27/11/2016

Exciting progress has recently been made on protein contact prediction, but the predicted contacts for proteins without many sequence homologs are still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual networks. This deep neural network allows us to model very complex sequence-contact relationships as well as long-range inter-contact correlation. Our method greatly outperforms existing contact prediction methods and leads to much more accurate contact-assisted protein folding. Tested on three datasets of 579 proteins, the average top L long-range prediction accuracy obtained by our method, the representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints can yield correct folds (i.e., TMscore>0.6) for 203 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 proteins, respectively. Further, our contact-assisted models have much better quality than template-based models. Using our predicted contacts as restraints, we can (ab initio) fold 208 of the 398 membrane proteins with TMscore>0.5. By contrast, when the training proteins of our method are used as templates, homology modeling can only do so for 10 of them. One interesting finding is that even though we do not train our prediction models with any membrane proteins, our method works very well on membrane protein prediction. Finally, in the recent blind CAMEO benchmark our method successfully folded 5 test proteins with a novel fold.
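The "top L" and "top L/10" long-range accuracies quoted above can be read as the fraction of the highest-scoring predicted long-range pairs that are true contacts, where L is the sequence length. The sketch below computes that quantity under stated assumptions: the abstract does not define "long-range", so the minimum sequence separation of 24 residues used here is an assumed convention, and the function name `top_lk_long_range_accuracy` and the toy data are hypothetical.

```python
def top_lk_long_range_accuracy(scores, contacts, k=1, min_sep=24):
    """Fraction of the top L/k scored long-range pairs that are true contacts.

    scores   -- L x L nested list of predicted contact scores
    contacts -- L x L nested list of 0/1 true-contact labels
    k        -- evaluate the top L/k pairs (k=1: top L, k=10: top L/10)
    min_sep  -- assumed minimum sequence separation for "long-range"
    """
    L = len(scores)
    # Visit each unordered pair (i, j) once, keeping only long-range pairs.
    pairs = [(scores[i][j], i, j)
             for i in range(L)
             for j in range(i + min_sep, L)]
    if not pairs:
        raise ValueError("sequence too short to contain long-range pairs")
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = pairs[:max(1, L // k)]
    hits = sum(contacts[i][j] for _, i, j in top)
    return hits / len(top)


# Toy example (made-up data): a 30-residue protein, evaluated at top L/10 = 3.
L = 30
scores = [[0.1] * L for _ in range(L)]
contacts = [[0] * L for _ in range(L)]
scores[0][24], scores[1][26], scores[2][28] = 0.9, 0.8, 0.7
contacts[0][24] = 1
contacts[2][28] = 1

acc = top_lk_long_range_accuracy(scores, contacts, k=10)
print(round(acc, 3))  # 2 of the top 3 pairs are true contacts
```

Averaging this value over all proteins in a test set gives figures comparable in form to the 0.47 and 0.77 reported in the abstract, though the exact evaluation protocol of the paper is not specified here.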

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

FigShare

Landsat Satellite Image Segmentation Using the Fuzzy ARTMAP Neural Network

Author: Asfour Yousif R.
Carpenter Gail A.
Grossberg Stephen
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/02/1995

This application illustrates how the fuzzy ARTMAP neural network can be used to monitor environmental changes. A benchmark problem seeks to classify regions of a Landsat image into six soil and crop classes based on images from four spectral sensors. Simulations show that fuzzy ARTMAP outperforms fourteen other neural network and machine learning algorithms. Only the k-Nearest-Neighbor algorithm shows better performance (91% vs. 89%) but without any code compression, while fuzzy ARTMAP achieves a code compression ratio of 6:1. Even with a code compression ratio of 50:1 fuzzy ARTMAP still maintains good performance (83%). This example shows how fuzzy ARTMAP can combine accuracy and code compression in real-world applications. Office of Naval Research (N00014-92-J-401J, N00014-91-J-4100, N00014-92-J-4015); National Science Foundation (IRI 90-00530)

Boston University Institutional Repository (OpenBU)

Automated Protein Structure Classification: A Survey

Author: Hassanzadeh Oktie
Publication date: 01/01/2008

Classification of proteins based on their structure provides a valuable resource for studying protein structure, function and evolutionary relationships. With the rapidly increasing number of known protein structures, manual and semi-automatic classification is becoming ever more difficult and prohibitively slow. Therefore, there is a growing need for automated, accurate and efficient classification methods to generate classification databases or increase the speed and accuracy of semi-automatic techniques. Recognizing this need, several automated classification methods have been developed. In this survey, we review recent developments in this area. We classify different methods based on their characteristics and compare their methodology, accuracy and efficiency. We then present a few open problems and outline future directions. Comment: 14 pages, Technical Report CSRG-589, University of Toronto

arXiv.org e-Print Archive

CiteSeerX