10 research outputs found

Subcellular localization for Gram Positive and Gram Negative Bacterial Proteins using Linear Interpolation Smoothing Model

Author: Dehzangi A.
Lal Sunil P.
Raicar Gaurav
Saini Harsh
Sharma Alokanand
Publication venue: 'Elsevier BV'
Publication date: 07/12/2015

Protein subcellular localization is an important topic in proteomics since it is related to a protein's overall function, helps in the understanding of metabolic pathways, and aids drug design and discovery. In this paper, a basic approximation technique from natural language processing called the linear interpolation smoothing model is applied for predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models, which are used in linear interpolation to determine how likely a sequence is to belong to a particular subcellular location. This technique builds a statistical model based on maximum likelihood. It is able to deal effectively with the high dimensionality that hinders other traditional classifiers such as Support Vector Machines or k-Nearest Neighbours, without sacrificing performance. This approach has been evaluated by predicting subcellular localizations of Gram positive and Gram negative bacterial proteins.
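
The idea of scoring a sequence against per-location probabilistic profiles with linear interpolation can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it interpolates a unigram and a bigram model with a fixed weight `lam`, uses add-one smoothing on the unigram term, and the location labels and training sequences in the usage lines are made up.

```python
import math
from collections import Counter

def train(seqs):
    """Count residues and adjacent residue pairs for one location (hypothetical profile)."""
    uni, bi, ctx = Counter(), Counter(), Counter()
    for s in seqs:
        uni.update(s)
        for a, b in zip(s, s[1:]):
            bi[(a, b)] += 1
            ctx[a] += 1
    return uni, bi, ctx, sum(uni.values())

def log_likelihood(seq, model, lam=0.7, alphabet_size=20):
    """Sum of log interpolated probabilities over adjacent residue pairs."""
    uni, bi, ctx, total = model
    ll = 0.0
    for a, b in zip(seq, seq[1:]):
        # Add-one smoothed unigram estimate, so the mixture is never zero.
        p_uni = (uni[b] + 1) / (total + alphabet_size)
        # Maximum-likelihood bigram estimate; zero if the context was never seen.
        p_bi = bi[(a, b)] / ctx[a] if ctx[a] else 0.0
        # Linear interpolation of the two estimates (lam is an assumed weight).
        ll += math.log(lam * p_bi + (1 - lam) * p_uni)
    return ll

def classify(seq, models):
    """Return the location whose profile gives the highest log-likelihood."""
    return max(models, key=lambda loc: log_likelihood(seq, models[loc]))

# Toy usage with made-up sequences and labels.
models = {
    "cytoplasm": train(["AAAAAA", "AAAGAA"]),
    "membrane": train(["LLLLLL", "LLVLLL"]),
}
label = classify("AAGAA", models)
```

In practice the interpolation weights would be tuned on held-out data, and higher-order dependency models could be mixed in the same way.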

University of the South Pacific Electronic Research Repository

Protein fold recognition using genetic algorithm optimized voting scheme and profile bigram

Author: Dehzangi Abdollah
Imoto S.
Lal Sunil P.
Raicar Gaurav
Saini Harsh
Sharma Alokanand
Publication venue: JSW
Publication date: 01/01/2016

In biology, identifying the tertiary structure of a protein helps determine its functions. A step towards tertiary structure identification is predicting a protein’s fold. Computational methods have been applied to determine a protein’s fold by assembling information from its structural, physicochemical and/or evolutionary properties. It has been shown that evolutionary information helps improve prediction accuracy. In this study, a scheme is proposed that uses the genetic algorithm (GA) to optimize a weighted voting scheme to improve protein fold recognition. This scheme incorporates k-separated bigram transition probabilities for feature extraction, which are based on the Position Specific Scoring Matrix (PSSM). A set of SVM classifiers is used for initial classification, whereupon their predictions are consolidated using the optimized weighted voting scheme. This scheme has been demonstrated on the Ding and Dubchak (DD), Extended Ding and Dubchak (EDD) and Taguchi and Gromiha (TG) benchmark datasets.
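
A weighted vote whose weights are tuned by a genetic algorithm can be sketched roughly as below. Everything here is an assumption made for illustration: the classifier predictions are plain labels rather than SVM outputs, fitness is simple accuracy, and the GA operators (keep the better half, one-point crossover, Gaussian mutation) and all parameter values are hypothetical choices, not those of the paper.

```python
import random
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight (ties go to the first label alphabetically)."""
    score = defaultdict(float)
    for label, w in zip(predictions, weights):
        score[label] += w
    return max(sorted(score), key=score.get)

def accuracy(weights, all_predictions, truths):
    """Fraction of samples for which the weighted vote matches the true label."""
    hits = sum(weighted_vote(p, weights) == t for p, t in zip(all_predictions, truths))
    return hits / len(truths)

def optimise_weights(all_predictions, truths, n_classifiers,
                     pop_size=20, generations=30, seed=0):
    """Tiny GA over weight vectors in [0, 1]; the better half survives unchanged."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_classifiers)] for _ in range(pop_size)]

    def fitness(w):
        return accuracy(w, all_predictions, truths)

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_classifiers) if n_classifiers > 1 else 0
            child = a[:cut] + b[cut:]          # one-point crossover (new list)
            i = rng.randrange(n_classifiers)   # mutate a single weight
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy usage: three hypothetical classifiers, three samples.
all_predictions = [("x", "y", "y"), ("y", "x", "x"), ("x", "x", "y")]
truths = ["x", "y", "x"]
best = optimise_weights(all_predictions, truths, n_classifiers=3)
```

Because the surviving parents are never modified, the best fitness in the population cannot decrease from one generation to the next.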

University of the South Pacific Electronic Research Repository

Opal+: length-specific MoRF prediction in intrinsically disordered protein sequences

Author: Patil Ashwini
Raicar Gaurav
Sharma Alokanand
Sharma Ronesh
Tsunoda Tatsuhiko
Publication venue: WILEY-VCH Verlag GmbH & Co.
Publication date: 15/10/2018

Intrinsically disordered proteins (IDPs) contain long unstructured regions, which play an important role in their function. These intrinsically disordered regions (IDRs) participate in binding events through regions called molecular recognition features (MoRFs). Computational prediction of MoRFs helps identify the potentially functional regions in IDRs. In this study, OPAL+, a novel MoRF predictor, is presented. OPAL+ uses separate models to predict MoRFs of varying lengths along with incorporating the hidden Markov model (HMM) profiles and physicochemical properties of MoRFs and their flanking regions. Together, these features help OPAL+ achieve a marginal performance improvement of 0.4-0.7% over its predecessor for diverse MoRF test sets. This performance improvement comes at the expense of increased run time as a result of the requirement of HMM profiles. OPAL+ is available for download at https://github.com/roneshsharma/OPAL-plus/wiki/OPAL-plus-Download

Crossref

University of the South Pacific Electronic Research Repository

Supplementary material for Opal+: length-specific MoRF prediction in intrinsically disordered protein sequences

Author: Patil A.
Raicar Gaurav
Sharma Alokanand
Sharma Ronesh
Tsunoda Tatsuhiko
Publication venue: 'Wiley'
Publication date: 01/01/2019

University of the South Pacific Electronic Research Repository

Improving protein fold recognition and structural class prediction accuracies using physicochemical properties of amino acids

Author: Dehzangi Abdollah
Lal Sunil P.
Raicar Gaurav
Saini Harsh
Sharma Alokanand
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Predicting the three-dimensional (3-D) structure of a protein is an important task in the field of bioinformatics and biological sciences. However, directly predicting the 3-D structure from the primary structure is hard to achieve. Therefore, predicting the fold or structural class of a protein sequence is generally used as an intermediate step in determining the protein's 3-D structure. For protein fold recognition (PFR) and structural class prediction (SCP), two steps are required – feature extraction step and classification step. Feature extraction techniques generally utilize syntactical-based information, evolutionary-based information and physicochemical-based information to extract features. In this study, we explore the importance of utilizing the physicochemical properties of amino acids for improving PFR and SCP accuracies. For this, we propose a Forward Consecutive Search (FCS) scheme which aims to strategically select physicochemical attributes that will supplement the existing feature extraction techniques for PFR and SCP. An exhaustive search is conducted on all the existing 544 physicochemical attributes using the proposed FCS scheme and a subset of physicochemical attributes is identified. Features extracted from these selected attributes are then combined with existing syntactical-based and evolutionary-based features, to show an improvement in the recognition and prediction performance on benchmark datasets

University of the South Pacific Electronic Research Repository

Genetic algorithm for an optimized weighted voting scheme incorporating k-separated bigram transition probabilities to improve protein fold recognition

Author: Dehzangi A.
Imoto S.
Lal Sunil P.
Lyons J.
Miyano S.
Paliwal K.K.
Raicar Gaurav
Saini Harsh
Sharma Alokanand
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2014

In biology, identifying the tertiary structure of a protein helps determine its functions. A step towards tertiary structure identification is predicting a protein's fold. Computational methods have been applied to determine a protein's fold by assembling information from its structural, physicochemical and/or evolutionary properties. It has been shown that evolutionary data helps improve prediction accuracy. In this study, a scheme is proposed that uses the genetic algorithm (GA) to optimize a weighted voting system to improve protein fold recognition. This scheme incorporates k-separated bigram transition probabilities for feature extraction, which are based on the Position Specific Scoring Matrix (PSSM). A set of SVM classifiers is used for initial classification, whereupon their predictions are consolidated using the optimized weighted voting system. This scheme has been demonstrated on the Ding and Dubchak (DD) benchmark data set.

University of the South Pacific Electronic Research Repository

Probabilistic expression of spatially varied amino acid dimers into general form of Chou's pseudo amino acid composition for protein fold recognition

Author: Dehzangi A.
Imoto S.
Lal Sunil P.
Lyons J.
Miyano S.
Paliwal K.K.
Raicar Gaurav
Saini Harsh
Sharma Alokanand
Publication venue: 'Elsevier BV'
Publication date: 07/09/2015

Background: Identification of the tertiary structure (3D structure) of a protein is a fundamental problem in biology which helps in identifying its functions. Predicting a protein's fold is considered to be an intermediate step for identifying the tertiary structure of a protein. Computational methods have been applied to determine a protein's fold by assembling information from its structural, physicochemical and/or evolutionary properties. Methods: In this study, we propose a scheme in which a feature extraction technique extracts probabilistic expressions of amino acid dimers, which have varying degrees of spatial separation in the primary sequences of proteins, from the Position Specific Scoring Matrix (PSSM). An SVM classifier is used to create a model from the extracted features for fold recognition. Results: The performance of the proposed scheme is evaluated against three benchmark datasets, namely the Ding and Dubchak, Extended Ding and Dubchak, and Taguchi and Gromiha datasets. Conclusions: The proposed scheme performed well in the experiments conducted, providing improvements over previously published results in the literature.

University of the South Pacific Electronic Research Repository

Protein structural class prediction via k-separated bigrams using position specific scoring matrix

Author: Ananthanarayanan Rajeshkannan
Biswas N.
Dehzangi A.
Lal Sunil P.
Lyons J.
Paliwal K.K.
Raicar Gaurav
Saini Harsh
Sharma Alokanand
Publication venue: Fuji Technology Press Co., Ltd.
Publication date: 01/01/2014

Protein structural class prediction (SCP) is an important task in identifying protein tertiary structure and protein functions. In this study, we propose a feature extraction technique to predict secondary structures. The technique utilizes bigram (of adjacent and k-separated amino acids) information derived from the Position Specific Scoring Matrix (PSSM). The technique has shown promising results when evaluated on the benchmark Ding and Dubchak dataset.
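
One plausible way to compute k-separated bigram features from a PSSM is sketched below. The abstract does not give the formula, so this assumes the feature for the residue pair (m, n) is the sum over positions i of P[i][m] * P[i + k][n], where P is the PSSM with one row per sequence position; the function name and the toy matrix are hypothetical.

```python
def k_separated_bigram(pssm, k=1):
    """Flattened width x width matrix of summed products between rows i and i + k.

    pssm: list of rows, one per sequence position (20 columns for a real PSSM).
    k=1 gives adjacent-residue bigrams; larger k skips k - 1 positions.
    """
    length = len(pssm)
    if not 1 <= k < length:
        raise ValueError("k must satisfy 1 <= k < sequence length")
    width = len(pssm[0])
    features = [[0.0] * width for _ in range(width)]
    for i in range(length - k):
        for m in range(width):
            for n in range(width):
                features[m][n] += pssm[i][m] * pssm[i + k][n]
    # Row-major flattening: index m * width + n holds the (m, n) feature.
    return [value for row in features for value in row]

# Toy 3-position, 2-column matrix standing in for a real L x 20 PSSM.
toy = [[1, 0], [0, 1], [1, 0]]
adjacent = k_separated_bigram(toy, k=1)
separated = k_separated_bigram(toy, k=2)
```

For a real PSSM each value of k yields 400 features, and feature vectors for several values of k can be concatenated before classification.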

University of the South Pacific Electronic Research Repository

OPAL: prediction of MoRF regions in intrinsically disordered protein sequence

Author: Alok Sharma
Ashwini Patil
Gaurav Raicar
John Hancock
Ronesh Sharma
Tatsuhiko Tsunoda
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2018

Intrinsically disordered proteins lack stable 3-dimensional structure and play a crucial role in performing various biological functions. Key to their biological function are the molecular recognition features (MoRFs) located within long disordered regions. Computationally identifying these MoRFs from disordered protein sequences is a challenging task. In this study, we present a new MoRF predictor, OPAL, to identify MoRFs in disordered protein sequences. OPAL utilizes two independent sources of information computed using different component predictors. The scores are processed and combined using common averaging method. The first score is computed using a component MoRF predictor which utilizes composition and sequence similarity of MoRF and non-MoRF regions to detect MoRFs. The second score is calculated using half-sphere exposure (HSE), solvent accessible surface area (ASA) and backbone angle information of the disordered protein sequence, using information from the amino acid properties of flanks surrounding the MoRFs to distinguish MoRF and non-MoRF residues. OPAL is evaluated using test sets that were previously used to evaluate MoRF predictors, MoRFpred, MoRFchibi and MoRFchibi-web. The results demonstrate that OPAL outperforms all the available MoRF predictors and is the most accurate predictor available for MoRF prediction

Crossref

University of the South Pacific Electronic Research Repository

OPAL: prediction of MoRF regions in intrinsically disordered protein sequences

Author: Alok Sharma
Ashwini Patil
Gaurav Raicar
John Hancock
Ronesh Sharma
Tatsuhiko Tsunoda
Publication venue: 'Oxford University Press (OUP)'
Publication date

Crossref