
5,074 research outputs found

Predicting RNA-binding sites of proteins using support vector machines and evolutionary information

Author: Cheng Cheng-Wei
Hsu Wen-Lian
Hwang Jenn-Kang
Su Emily Chia-Yu
Sung Ting-Yi
Publication venue: BioMed Central
Publication date: 01/01/2008
Field of study

Crossref

Springer - Publisher Connector

PubMed Central

Identification of protein-RNA interaction sites using the information of spatial adjacent residues

Author: Chen Wei
Cheng Yong-Mei
Pan Quan
Zhang Shao-Wu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Protein-RNA interactions play an important role in a number of fundamental cellular processes such as RNA splicing, transport and translation, protein synthesis and certain RNA-mediated enzymatic processes. More knowledge of protein-RNA recognition can not only help to understand the regulatory mechanism, the site-directed mutagenesis and regulation of RNA–protein complexes in biological systems, but also have a vital effect on rational drug design. Results: Based on the information of spatial adjacent residues, novel feature extraction methods were proposed to predict protein-RNA interaction sites with an SVM-KNN classifier. The total accuracies of the spatial adjacent residue profile feature and the spatial adjacent residues weighted accessibility solvent area feature are 78% and 67.07%, respectively, in a 5-fold cross-validation test, which are 1.4% and 3.79% higher than those of the sequence neighbour residue profile feature and the sequence neighbour residue accessibility solvent area feature. Conclusions: The results indicate that the performance of the feature extraction method using the spatial adjacent information is superior to the sequence neighbour information approach. The performance of the SVM-KNN classifier is a little better than that of SVM. The feature extraction method of spatial adjacent information with SVM-KNN is very effective for identifying protein-RNA interaction sites and may at least play a complementary role to the existing methods.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Exploiting structural and topological information to improve prediction of RNA-protein binding sites

Author: A Bradley
A del Sol
CW Cheng
E Jeong
G Amitai
H Tjong
HR Guy
I Selin
IH Witten
J Allersa
L Wang
M Kumar
M Terribilini
OT Kim
P Baldi
R Spriggs
RP Bahadur
S Altschul
S Kawashima
S Shazman
S Tanaka
Stefan R Maetschke
T Fawcett
T Kamada
W Kabsch
WH Press
Y Chen
Zheng Yuan
Publication venue: BioMed Central
Publication date: 01/01/2009

The breast and ovarian cancer susceptibility gene BRCA1 encodes a multifunctional tumor suppressor protein BRCA1, which is involved in regulating cellular processes such as cell cycle, transcription, DNA repair, DNA damage response and chromatin remodeling. BRCA1 protein, located primarily in cell nuclei, interacts with multiple proteins and various DNA targets. It has been demonstrated that BRCA1 protein binds to damaged DNA and plays a role in the transcriptional regulation of downstream target genes. As a key protein in the repair of DNA double-strand breaks, the BRCA1-DNA binding properties, however, have not been reported in detail

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Recommended from our members

Computer simulations explain mutation-induced effects on the DNA editing by adenine base editors.

Author: Komor Alexis C
Paesani Francesco
Rallapalli Kartik L
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Adenine base editors, which were developed by engineering a transfer RNA adenosine deaminase enzyme (TadA) into a DNA editing enzyme (TadA*), enable precise modification of A:T to G:C base pairs. Here, we use molecular dynamics simulations to uncover the structural and functional roles played by the initial mutations in the onset of the DNA editing activity by TadA*. Atomistic insights reveal that early mutations lead to intricate conformational changes in the structure of TadA*. In particular, the first mutation, Asp108Asn, induces an enhancement in the binding affinity of TadA to DNA. In silico and in vivo reversion analyses verify the importance of this single mutation in imparting functional promiscuity to TadA* and demonstrate that TadA* performs DNA base editing as a monomer rather than a dimer.

eScholarship - University of California

Prediction of protein-protein interaction sites using an ensemble method

Author: Deng Lei
Dong Qiwen
Guan Jihong
Zhou Shuigeng
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Prediction of protein-protein interaction sites is one of the most challenging and intriguing problems in the field of computational biology. Although much progress has been achieved by using various machine learning methods and a variety of available features, the problem is still far from being solved. Results: In this paper, an ensemble method is proposed, which combines a bootstrap resampling technique, SVM-based fusion classifiers and a weighted voting strategy, to overcome the class imbalance problem and effectively utilize a wide variety of features. We evaluate the ensemble classifier using a dataset extracted from 99 polypeptide chains with 10-fold cross-validation, and obtain an AUC score of 0.86, with a sensitivity of 0.76 and a specificity of 0.78, which are better than those of the existing methods. To improve the usefulness of the proposed method, two special ensemble classifiers are designed to handle the cases of missing homologues and structural information respectively, and the performance is still encouraging. The robustness of the ensemble method is also evaluated by effectively classifying interaction sites from surface residues as well as from all residues in proteins. Moreover, we demonstrate the applicability of the proposed method to identify interaction sites from the non-structural proteins (NS) of the influenza A virus, which may be utilized as potential drug target sites. Conclusion: Our experimental results show that the ensemble classifiers are quite effective in predicting protein interaction sites. The Sub-EnClassifiers with resampling technique can alleviate the imbalance problem and the combination of Sub-EnClassifiers with a wide variety of feature groups can significantly improve prediction performance.
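As a rough illustration of the two ensemble ingredients the abstract names, bootstrap resampling against class imbalance and weighted voting, the following Python sketch may help. It is not the authors' implementation: the function names `balanced_bootstrap` and `weighted_vote`, the choice to keep all positives and resample only negatives, and the tie-breaking rule are assumptions made for this example, and the SVM base classifiers are left out entirely.

```python
import random


def balanced_bootstrap(positives, negatives, n_subsets, rng):
    """Build n_subsets balanced training sets (hypothetical helper).

    Assumption: positives (interaction sites) are the minority class, so each
    subset keeps all positives and draws an equally sized sample of negatives
    with replacement.
    """
    if not negatives:
        raise ValueError("need at least one negative sample")
    subsets = []
    for _ in range(n_subsets):
        sampled = [rng.choice(negatives) for _ in positives]
        subsets.append((list(positives), sampled))
    return subsets


def weighted_vote(votes, weights):
    """Combine 0/1 votes from base classifiers using per-classifier weights.

    Returns 1 when the weight behind label 1 is at least half of the total
    weight (ties go to 1, an arbitrary choice for this sketch).
    """
    if len(votes) != len(weights):
        raise ValueError("votes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    support = sum(w for v, w in zip(votes, weights) if v == 1)
    return 1 if 2 * support >= total else 0
```

In such a scheme each balanced subset would train one base classifier, and the weights could, for instance, be each classifier's validation score; both choices are assumptions here rather than details taken from the abstract.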

Crossref

Directory of Open Access Journals

PubMed Central

Cooperative "folding transition" in the sequence space facilitates function-driven evolution of protein families

Author: Aiman Soliman
Anand Padmanabhan
Junjun Yin
Kiumars Soltani
Shaowen Wang
Publication venue
Publication date: 01/01/2018

In the protein sequence space, natural proteins form clusters of families which are characterized by their unique native folds whereas the great majority of random polypeptides are neither clustered nor foldable to unique structures. Since a given polypeptide can be either foldable or unfoldable, a kind of "folding transition" is expected at the boundary of a protein family in the sequence space. By Monte Carlo simulations of a statistical mechanical model of protein sequence alignment that coherently incorporates both short-range and long-range interactions as well as variable-length insertions to reproduce the statistics of the multiple sequence alignment of a given protein family, we demonstrate the existence of such transition between natural-like sequences and random sequences in the sequence subspaces for 15 domain families of various folds. The transition was found to be highly cooperative and two-state-like. Furthermore, enforcing or suppressing consensus residues on a few of the well-conserved sites enhanced or diminished, respectively, the natural-like pattern formation over the entire sequence. In most families, the key sites included ligand binding sites. These results suggest some selective pressure on the key residues, such as ligand binding activity, may cooperatively facilitate the emergence of a protein family during evolution. From a more practical aspect, the present results highlight an essential role of long-range effects in precisely defining protein families, which are absent in conventional sequence models. Comment: 13 pages, 7 figures, 2 tables (a new subsection added

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

FigShare

Prediction of GTP interacting residues, dipeptides and tripeptides in a protein from its evolutionary information

Author: Chauhan Jagat S
Mishra Nitish K
Raghava Gajendra PS
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Guanosine triphosphate (GTP)-binding proteins play an important role in regulation of G-protein. Thus prediction of GTP interacting residues in a protein is one of the major challenges in the field of computational biology. In this study, an attempt has been made to develop a computational method for predicting GTP interacting residues in a protein with high accuracy (Acc), precision (Prec) and recall (Rc). Results: All the models developed in this study have been trained and tested on a non-redundant (40% similarity) dataset using five-fold cross-validation. Firstly, we have developed neural network based models using single sequence and PSSM profile and achieved maximum Matthews Correlation Coefficient (MCC) 0.24 (Acc 61.30%) and 0.39 (Acc 68.88%) respectively. Secondly, we have developed support vector machine (SVM) based models using single sequence and PSSM profile and achieved maximum MCC 0.37 (Prec 0.73, Rc 0.57, Acc 67.98%) and 0.55 (Prec 0.80, Rc 0.73, Acc 77.17%) respectively. In this work, we have introduced a new concept of predicting GTP interacting dipeptides (two consecutive GTP interacting residues) and tripeptides (three consecutive GTP interacting residues) for the first time. We have developed an SVM based model for predicting GTP interacting dipeptides using PSSM profile and achieved MCC 0.64 with precision 0.87, recall 0.74 and accuracy 81.37%. Similarly, an SVM based model has been developed for predicting GTP interacting tripeptides using PSSM profile and achieved MCC 0.70 with precision 0.93, recall 0.73 and accuracy 83.98%. Conclusion: These results show that the PSSM based method performs better than the single sequence based method. The prediction models based on dipeptides or tripeptides are more accurate than the traditional model based on single residues. A web server "GTPBinder" (http://www.imtech.res.in/raghava/gtpbinder/) based on the above models has been developed for predicting GTP interacting residues in a protein.
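The abstract reports accuracy, precision, recall and the Matthews Correlation Coefficient (MCC). For readers unfamiliar with these measures, the Python sketch below computes them from the four confusion-matrix counts using their standard definitions. The function name `binary_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions of this example, not details from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and MCC from confusion counts.

    tp/fp/tn/fn are counts of true positives, false positives, true negatives
    and false negatives. Undefined ratios are reported as 0.0 (an assumption
    made for this sketch).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }
```

Unlike accuracy, MCC uses all four counts, which is why it is often preferred when interacting residues are far rarer than non-interacting ones.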

Crossref

Springer - Publisher Connector

PubMed Central

Computational approaches to predict protein functional families and functional sites.

Author: Abbasian M
Orengo CA
Rauer C
Sen N
Waman VP
Publication venue
Publication date: 01/10/2021

Understanding the mechanisms of protein function is indispensable for many biological applications, such as protein engineering and drug design. However, experimental annotations are sparse, and therefore, theoretical strategies are needed to fill the gap. Here, we present the latest developments in building functional subclassifications of protein superfamilies and using evolutionary conservation to detect functional determinants, for example, catalytic-, binding- and specificity-determining residues important for delineating the functional families. We also briefly review other features exploited for functional site detection and new machine learning strategies for combining multiple features

UCL Discovery

Prediction of FAD interacting residues in a protein from its primary sequence using evolutionary information

Author: E Jeong
Gajendra PS Raghava
H Kaur
IB Kuznetsov
M Kumar
M Saito
N Bhardwaj
Nitish K Mishra
RA Bauer
S Ahmad
SF Altschul
T Joachims
V Sobolev
V Vapnik
W Li
Y Korllberg
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Flavin binding proteins (FBP) play a critical role in several biological functions such as the electron transport system (ETS). These flavoproteins contain very tightly bound, sometimes covalently, flavin adenine dinucleotide (FAD) or flavin mononucleotide (FMN). The interaction between flavin nucleotide and amino acids of a flavoprotein is essential for its functionality. Thus identification of FAD interacting residues in an FBP is an important step for understanding its function and mechanism. Results: In this study, we describe models developed for predicting FAD interacting residues using window patterns of 15, 17 and 19 residues. Support vector machine (SVM) based models have been developed using the binary pattern of the amino acid sequence of a protein and achieved maximum accuracy 69.65% with Matthews Correlation Coefficient (MCC) 0.39 and Area Under Curve (AUC) 0.773. The performance of these models has been improved significantly from 69.65% to 82.86% with MCC 0.66 and AUC 0.904, when evolutionary information is used as input in SVM. The evolutionary information was generated in the form of a position specific score matrix (PSSM) profile by using PSI-BLAST at e-value 0.001. All models were developed on 198 non-redundant FAD binding protein chains containing 5172 FAD interacting residues and evaluated using the fivefold cross-validation technique. Conclusion: This study suggests that evolutionary information of 17 amino acid patterns performs best for FAD interacting residues prediction. We also developed a web server which predicts FAD interacting residues in a protein and is freely available for academics.
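The "binary pattern" input over a fixed window can be pictured as a one-hot encoding of the residues around each position. The Python sketch below shows one common way to build such features. It is only an illustration under stated assumptions: padding the chain ends with a dummy residue `X`, using a 21-symbol alphabet, and the function name `binary_window_features` are choices made here, not details confirmed by the abstract.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "X"  # "X" = padding / unknown residue (assumption)


def binary_window_features(sequence, window=17):
    """One-hot encode a window of residues centred on each position.

    Returns one feature vector of length window * 21 per residue. Positions
    beyond the chain ends are filled with "X" so every residue gets a full
    window (an assumption of this sketch).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = "X" * half + sequence.upper() + "X" * half
    features = []
    for center in range(len(sequence)):
        pattern = padded[center:center + window]
        vector = []
        for residue in pattern:
            index = ALPHABET.find(residue)
            if index == -1:  # treat non-standard letters as "X"
                index = len(ALPHABET) - 1
            one_hot = [0] * len(ALPHABET)
            one_hot[index] = 1
            vector.extend(one_hot)
        features.append(vector)
    return features
```

With a window of 17 this yields 17 × 21 = 357 binary inputs per residue. A PSSM-based variant would replace each one-hot block with the corresponding profile row.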

Crossref

Springer - Publisher Connector

PubMed Central

Identification of Mannose Interacting Residues Using Local Composition

Author: A Garg
A Koch
A Malik
Anna Tramontano
C Shionyu-Mitsuyama
C Taroni
E Jeong
F Larsen
FA Quiocho
Gajendra P. S. Raghava
GP Raghava
H Kaur
H Nassif
Harinder Singh
HR Ansari
IB Kuznetsov
JS Chauhan
K Julenius
L Sompayrac
LH Bouwman
M Kulharia
M Kumar
M Muraki
M Patra
M Rashid
MM Gromiha
MS Sujatha
N Bhardwaj
Nitish Kumar Mishra
NK Mishra
RA Bauer
S Ahmad
S Hakomori
Sandhya Agarwal
SF Altschul
T Joachims
V Sobolev
VSR Rao
Publication venue: Public Library of Science
Publication date: 01/01/2011

BACKGROUND: Mannose binding proteins (MBPs) play a vital role in several biological functions such as defense mechanisms. These proteins bind to mannose on the surface of a wide range of pathogens and help in eliminating these pathogens from our body. Thus, it is important to identify mannose interacting residues (MIRs) in order to understand the mechanism of recognition of pathogens by MBPs. RESULTS: This paper describes modules developed for predicting MIRs in a protein. Support vector machine (SVM) based models have been developed on 120 mannose binding protein chains, where no two chains have more than 25% sequence similarity. SVM models were developed on two types of datasets: 1) a main dataset consisting of 1029 mannose interacting and 1029 non-interacting residues, 2) a realistic dataset consisting of 1029 mannose interacting and 10320 non-interacting residues. In this study, firstly, we developed standard modules using binary and PSSM profiles of patterns and got a maximum MCC of around 0.32. Secondly, we developed SVM modules using the composition profile of patterns and achieved a maximum MCC of around 0.74 with accuracy 86.64% on the main dataset. Thirdly, we developed a model on the realistic dataset and achieved a maximum MCC of 0.62 with accuracy 93.08%. Based on this study, a standalone program and web server have been developed for predicting mannose interacting residues in proteins (http://www.imtech.res.in/raghava/premier/). CONCLUSIONS: Compositional analysis of mannose interacting and non-interacting residues shows that certain types of residues are preferred in mannose interaction. It was also observed that residues around mannose interacting residues have a preference for certain types of residues. Composition of patterns/peptides/segments has been used for predicting MIRs and achieved reasonably high accuracy. It is possible that this novel strategy may be effective to predict other types of interacting residues. This study will be useful in annotating the function of proteins as well as in understanding the role of mannose in the immune system.
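A "composition profile of patterns" can be read as describing each residue by the amino acid composition of the window around it, rather than by the exact order of its neighbours. The Python sketch below illustrates that idea. The window length, the `X` padding at chain ends, the 21-symbol alphabet and the function name `composition_profile` are all assumptions for this example; the abstract does not specify them.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "X"  # "X" = padding symbol (assumption)


def composition_profile(sequence, window=17):
    """Describe each residue by the composition of its surrounding window.

    Returns one 21-value vector per residue: the fraction of the window
    occupied by each of the 20 amino acids plus the padding symbol "X".
    Letters outside the alphabet are simply not counted in this sketch.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = "X" * half + sequence.upper() + "X" * half
    profiles = []
    for center in range(len(sequence)):
        pattern = padded[center:center + window]
        profiles.append([pattern.count(symbol) / window for symbol in ALPHABET])
    return profiles
```

Compared with a one-hot window encoding, this representation has only 21 values per residue regardless of window length, at the cost of discarding positional order within the window.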

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central