Protein Inter-Residue Distance Prediction Using Residual and Capsule Networks

Author: Dillon Andrew
Publication venue: IRL @ UMSL
Publication date: 16/10/2019

The protein folding problem, also known as protein structure prediction, is the task of building three-dimensional protein models given their one-dimensional amino acid sequence. New methods that were used successfully in the most recent CASP challenge have demonstrated that predicting a protein's inter-residue distances is key to solving this problem. Various deep learning algorithms, including fully convolutional neural networks and residual networks, have been developed to solve the distance prediction problem. In this work, we develop a hybrid method based on residual networks and capsule networks. We demonstrate that our method can predict distances more accurately than the algorithms used in state-of-the-art methods. Using a standard dataset of 3420 training proteins and an independent dataset of 150 test proteins, we show that our method can predict distances 51.06% more accurately than a standard residual network method when the accuracy of all long-range distances is evaluated using mean absolute error. To further validate our results, we demonstrate that three-dimensional models built using the distances predicted by our method are more accurate than models built using the distances predicted by residual networks. Overall, our results highlight, for the first time, the potential of capsule-residual hybrid networks for solving the protein inter-residue distance prediction problem.

University of Missouri, St. Louis
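
The evaluation metric named in the abstract above, mean absolute error over long-range distances, can be sketched as follows. This is a minimal illustration, not the author's evaluation code: the function name `long_range_mae` and the default threshold of 24 residues of sequence separation for "long-range" are assumptions (the abstract does not state a cutoff), and the matrices are toy values.

```python
def long_range_mae(true_dist, pred_dist, min_separation=24):
    """Mean absolute error over residue pairs (i, j) with j - i >= min_separation.

    true_dist and pred_dist are square, symmetric distance matrices given as
    lists of lists. The default separation of 24 is an assumed convention,
    not a value taken from the abstract.
    """
    n = len(true_dist)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + min_separation, n):
            total += abs(true_dist[i][j] - pred_dist[i][j])
            count += 1
    if count == 0:
        raise ValueError("no residue pairs at this sequence separation")
    return total / count


# Toy 4-residue example; a separation of 2 is used so that some pairs qualify.
true_d = [[0, 4, 6, 9],
          [4, 0, 4, 6],
          [6, 4, 0, 4],
          [9, 6, 4, 0]]
pred_d = [[0, 4, 7, 7],
          [4, 0, 4, 6],
          [7, 4, 0, 4],
          [7, 6, 4, 0]]
mae = long_range_mae(true_d, pred_d, min_separation=2)
```

Only the upper triangle is visited, so each residue pair contributes once to the average.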

Protein subcellular localization prediction based on compartment-specific features and structure conservation

Author: Allan Lo
Emily Su
Hua-Sheng Chiu
Jenn-Kang Hwang
Ting-Yi Sung
Wen-Lian Hsu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

BACKGROUND: Protein subcellular localization is crucial for genome annotation, protein function prediction, and drug discovery. Determining subcellular localization experimentally is time-consuming, so computational approaches are highly desirable. Extensive studies of localization prediction have led to the development of several methods, including composition-based and homology-based methods. However, their performance might be significantly degraded if homologous sequences are not detected. Moreover, methods that integrate various features could suffer from low coverage in high-throughput proteomic analyses due to the lack of information to characterize unknown proteins.
RESULTS: We propose a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machine (SVM) model and a structural homology approach. The SVM model comprises a number of binary classifiers, in which biological features derived from Gram-negative bacteria translocation pathways are incorporated. In the structural homology approach, we employ secondary structure alignment for structural similarity comparison and assign the known localization of the top-ranked protein as the predicted localization of a query protein. The hybrid method achieves overall accuracies of 93.7% and 93.2% using ten-fold cross-validation on the benchmark data sets. On the evaluation data sets, our method also attains a prediction accuracy of 84.0%, notably when tested on sequences with a low level of homology to the training data. A three-way data split procedure is also incorporated to prevent overestimation of the predictive performance. In addition, we show that the prediction accuracy should be approximately 85% for non-redundant data sets with sequence identity below 30%.
CONCLUSION: Our results demonstrate that biological features derived from Gram-negative bacteria translocation pathways yield a significant improvement. The biological features are interpretable and can be applied in advanced analyses and experimental designs. Moreover, combining the SVM model with the structural homology approach further improves the overall accuracy, which suggests that structural conservation could be a useful indicator for inferring localization in addition to sequence homology. The proposed method can be used in large-scale analyses of proteomes.

Crossref

Springer - Publisher Connector

PubMed Central
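
The one-versus-one scheme mentioned in the abstract above trains one binary classifier per pair of classes and combines them by voting. The sketch below shows only that voting step, under stated assumptions: the five class names are illustrative and not taken from the abstract, `toy_decide` is a hypothetical stand-in for trained binary SVMs, and the structural homology part of the hybrid method is not modelled.

```python
from collections import Counter
from itertools import combinations

# Illustrative class labels (an assumption; the abstract does not list them).
CLASSES = ["cytoplasm", "inner membrane", "periplasm",
           "outer membrane", "extracellular"]


def one_vs_one_predict(classes, pairwise_decide, x):
    """Run every pairwise binary decision and return the class with most votes.

    pairwise_decide(a, b, x) must return either a or b. Ties are broken by
    the order of `classes`.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    return max(classes, key=lambda c: votes[c])


def toy_decide(a, b, scores):
    """Hypothetical binary classifier: prefers the class with the higher score."""
    return a if scores[a] >= scores[b] else b


# k classes need k * (k - 1) / 2 binary classifiers; for 5 classes that is 10.
n_pairs = len(list(combinations(CLASSES, 2)))

query = {"cytoplasm": 0.1, "inner membrane": 0.3, "periplasm": 0.9,
         "outer membrane": 0.4, "extracellular": 0.2}
predicted = one_vs_one_predict(CLASSES, toy_decide, query)
```

In a real system each pairwise decision would come from a separately trained binary model rather than from a shared score table.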

A Hybrid Monte Carlo Ant Colony Optimization Approach for Protein Structure Prediction in the HP Model

Author: Citrolo Andrea G.
Mauri Giancarlo
Publication venue: Open Publishing Association
Publication date: 01/09/2013

The hydrophobic-polar (HP) model has been widely studied in the field of protein structure prediction (PSP), both for theoretical purposes and as a benchmark for new optimization strategies. In this work we introduce a new heuristic based on Ant Colony Optimization (ACO) and Markov Chain Monte Carlo (MCMC) that we call Hybrid Monte Carlo Ant Colony Optimization (HMCACO). We describe this method and compare results obtained on well-known HP instances in the three-dimensional cubic lattice to those obtained with standard ACO and Simulated Annealing (SA). All methods were implemented using an unconstrained neighborhood and a modified objective function to prevent the creation of overlapping walks. Results show that our methods perform better than the other heuristics in all benchmark instances. Comment: In Proceedings Wivace 2013, arXiv:1309.712

arXiv.org e-Print Archive

Directory of Open Access Journals
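
For readers unfamiliar with the HP model referred to above, the standard objective on the three-dimensional cubic lattice can be sketched as follows: the energy is minus the number of H-H pairs that are lattice neighbours but not consecutive in the chain. This is a minimal sketch of the textbook objective, not the modified objective function used in the paper; the move alphabet and function names are assumptions, and overlapping walks are simply reported as invalid.

```python
# Unit moves on the cubic lattice (the letter encoding is an assumption).
MOVES = {"U": (0, 0, 1), "D": (0, 0, -1), "L": (-1, 0, 0),
         "R": (1, 0, 0), "F": (0, 1, 0), "B": (0, -1, 0)}


def fold(moves):
    """Turn a string of moves into lattice coordinates, starting at the origin."""
    coords = [(0, 0, 0)]
    for m in moves:
        dx, dy, dz = MOVES[m]
        x, y, z = coords[-1]
        coords.append((x + dx, y + dy, z + dz))
    return coords


def hp_energy(sequence, moves):
    """Return minus the number of non-consecutive H-H contacts.

    Returns None when the walk overlaps itself (not a valid conformation).
    """
    coords = fold(moves)
    if len(coords) != len(sequence):
        raise ValueError("need exactly len(sequence) - 1 moves")
    if len(set(coords)) != len(coords):
        return None  # overlapping walk
    index = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y, z) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in MOVES.values():
            j = index.get((x + dx, y + dy, z + dz))
            # j > i + 1 counts each pair once and skips chain neighbours.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return -contacts


# A 4-residue chain folded into a square: the two end H residues touch.
energy = hp_energy("HPPH", "RFL")
```

A search heuristic such as ACO or simulated annealing would call a function like `hp_energy` on candidate move strings and keep the conformations with the lowest energy.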

Predicting protein-protein interactions as a one-class classification problem

Author: Alashwal Hany
Deris Safaai
Othman Razib M.
Publication date: 01/01/2006

Protein-protein interactions are key to understanding protein function, because proteins usually work in the context of other proteins and rarely function alone. Machine learning techniques have been used to predict protein-protein interactions. However, most of these techniques treat the task as a binary classification problem. While it is easy to obtain a dataset of interacting proteins as positive examples, there are no experimentally confirmed non-interacting proteins that could serve as a negative set. Therefore, in this paper we treat the task as a one-class classification problem using a One-Class SVM (OCSVM). Using only positive examples (interacting protein pairs) for training, the OCSVM achieves an accuracy of 80%. These results imply that protein-protein interactions can be predicted with reliable accuracy using a one-class classifier.

Universiti Teknologi Malaysia Institutional Repository

Ais-Psmaca: Towards Proposing an Artificial Immune System for Strengthening Psmaca: An Automated Protein Structure Prediction using Multiple Attractor Cellular Automata

Author: Dr. Inampudi Ramesh Babu
P.Kiran Sree
Publication venue: Global Journals Inc. (US)
Publication date: 21/06/2013

Predicting the structure of proteins from their amino acid sequences has gained remarkable attention in recent years. Although several prediction techniques address this problem, their accuracy in predicting protein structure is close to 75%. An automated procedure for predicting protein structure was developed with MACA (Multiple Attractor Cellular Automata). An Artificial Immune System (AIS-PSMACA), a novel computational intelligence technique, is used to strengthen this system (PSMACA) by making it more adaptable and more parallel. Most existing approaches are sequential, classify the input into four major classes, and are designed for similar sequences. AIS-PSMACA is designed to identify ten classes from sequences that share twilight-zone similarity and identity with the training sequences, with mixed and hybrid variations. The method also predicts three states (helix, strand, and coil) for the secondary structure. Our comprehensive design considers 10 feature selection methods and 4 classifiers to develop MACA-based classifiers that are built for each of the ten classes. Tests of the proposed classifier on twilight-zone and 1-high-similarity benchmark datasets against over three dozen modern competing predictors show that AIS-PSMACA provides the best overall accuracy, ranging between 80% and 89.8% depending on the dataset.

Global Journal of Computer Science and Technology (GJCST)

TITER: predicting translation initiation sites by deep learning.

Author: Hu Hailin
Jiang Tao
Zeng Jianyang
Zhang Lei
Zhang Sai
Publication venue: eScholarship, University of California
Publication date: 01/07/2017

Motivation: Translation initiation is a key step in the regulation of gene expression. In addition to the annotated translation initiation sites (TISs), the translation process may also start at multiple alternative TISs (including both AUG and non-AUG codons), which makes it challenging to predict TISs and study the underlying regulatory mechanisms. Meanwhile, the advent of several high-throughput sequencing techniques for profiling initiating ribosomes at single-nucleotide resolution, e.g. GTI-seq and QTI-seq, provides abundant data for systematically studying the general principles of translation initiation and for developing computational methods for TIS identification. Methods: We have developed a deep learning-based framework, named TITER, for accurately predicting TISs on a genome-wide scale based on QTI-seq data. TITER extracts the sequence features of translation initiation from the surrounding sequence contexts of TISs using a hybrid neural network and further integrates the prior preference of TIS codon composition into a unified prediction framework. Results: Extensive tests demonstrated that TITER can greatly outperform the state-of-the-art prediction methods in identifying TISs. In addition, TITER was able to identify important sequence signatures for individual types of TIS codons, including a Kozak-sequence-like motif for the AUG start codon. Furthermore, the TITER prediction score can be related to the strength of translation initiation in various biological scenarios, including the repressive effect of upstream open reading frames on gene expression and the mutational effects influencing translation initiation efficiency. Availability and implementation: TITER is available as open-source software and can be downloaded from https://github.com/zhangsaithu/titer. Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

eScholarship - University of California