1,655 research outputs found

Toward optimal fragment generations for ab initio protein structure assembly

Author: Xu Dong
Zhang Yang
Publication venue: Wiley
Publication date: 16/10/2012

Fragment assembly using structural motifs excised from other solved proteins has been shown to be an efficient method for ab initio protein‐structure prediction. However, how to construct accurate fragments, how to derive optimal restraints from fragments, and what the best fragment length is are basic issues yet to be systematically examined. In this work, we developed a gapless‐threading method to generate position‐specific structure fragments. Distance profiles and torsion angle pairs are then derived from the fragments by statistical consistency analysis, which achieved accuracy comparable to that of machine‐learning‐based methods although the fragments were taken from unrelated proteins. When measured by the accuracies of both the derived distance profiles and the torsion angle pairs, we come to a consistent conclusion that the optimal fragment length for structural assembly is around 10, and at least 100 fragments at each location are needed to achieve optimal structure assembly. The distance profiles and torsion angle pairs derived from the fragments have been successfully used in QUARK for ab initio protein structure assembly and are provided by the QUARK online server at http://zhanglab.ccmb.med.umich.edu/QUARK/ . Proteins 2013. © 2012 Wiley Periodicals, Inc. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/96355/1/24179_ftp.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/96355/2/PROT_24179_sm_SuppInfo.pd

Crossref

PubMed Central

Deep Blue Documents at the University of Michigan

TANGLE: Two-Level Support Vector Regression Approach for Protein Backbone Torsion Angle Prediction from Primary Sequences

Author: A Schlessinger
AG de Brevern
B Rost
B Xue
C Bystroff
C Haynes
C Mooney
C Zhang
C Zheng
Christian Schönbach
D Xie
DT Jones
E Faraggi
G Helles
Geoffrey I. Webb
GN Ramachandran
GP Raghava
H Zhang
Hao Tan
HJ Dyson
HS Kang
J Cheng
J Gao
J Gsponer
J Song
Jiangning Song
JJ Ward
JS Chauhan
K Chen
L Chen
L Kurgan
M Kumar
Mingjun Wang
MJ Mizianty
MJ Rooman
MJ Wood
MK Kalita
MN Nguyen
MV Berjanskii
O Dor
O Zimmermann
P Chen
P Kountouris
P Sliz
PC Chen
R Gaudet
R Karchin
R Kuang
R Verma
S Ahmad
S Liang
S Qiu
S Wu
SF Altschul
T Ishida
T Zhang
Tatsuya Akutsu
V Vapnik
W Kabsch
W Liu
W Zhang
X Miao
X Wang
XY Pan
Y Ofran
YM Huang
Z Markovic-Housley
Z Yuan
Publication venue: Public Library of Science
Publication date: 01/01/2012

Protein backbone torsion angles Phi and Psi are the rotation angles around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angles from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered regions, as well as other global sequence features. When evaluated on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle predictions are 27.8° and 44.6°, respectively, which are 1% and 3% lower, respectively, than those of ANGLOR, one of the state-of-the-art prediction tools. Moreover, the prediction of TANGLE is significantly better than that of a random predictor built on an amino acid-specific basis, with p-values below 1.46e-147 and 7.97e-150, respectively, by the Wilcoxon signed rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/
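The mean absolute error quoted in this abstract can be illustrated with a minimal Python sketch. It assumes that angle differences are wrapped onto the periodic range before averaging, which is a common convention for torsion angles but is not stated in the abstract; the function names `wrap_degrees` and `angular_mae` are hypothetical and are not part of TANGLE.

```python
def wrap_degrees(delta):
    """Map an angle difference onto the interval [-180, 180) degrees."""
    return (delta + 180.0) % 360.0 - 180.0


def angular_mae(predicted, observed):
    """Mean absolute error between two equal-length lists of angles (degrees).

    Assumption: errors are measured along the shorter arc of the circle,
    so that 170 and -170 degrees are 20 degrees apart, not 340.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(wrap_degrees(p - o)) for p, o in zip(predicted, observed))
    return total / len(predicted)


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    print(angular_mae([-60.0, 120.0], [-50.0, 150.0]))
```

Without the wrapping step, a prediction of 170° for an observed angle of -170° would count as a 340° error, which would inflate the MAE for residues near the ±180° boundary.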

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Monash University Research Portal

Analysis of the conformational profiles of fenamates shows route towards novel, higher accuracy, force-fields for pharmaceuticals

Author: Galek P
Price SL
Uzoh OG
Publication date: 16/02/2015

In traditional molecular mechanics force fields, intramolecular non-bonded interactions are modelled as intermolecular interactions, and the form of the torsion potential is based on the conformational profiles of small organic molecules. We investigate how a separate model for the intramolecular forces in pharmaceuticals could be more realistic by analysing the low barrier to rotation of the phenyl ring in the fenamates (substituted N-phenyl-aminobenzoic acids), which results in a wide range of observed angles in the numerous fenamate crystal structures. Although the conformational energy changes by significantly less than 10 kJ mol⁻¹ for a complete rotation of the phenyl ring for fenamic acid, the barrier is only small because of small correlated changes in the other bond and torsion angles. The maxima for conformations where the two aromatic rings approach coplanarity arise from steric repulsion, but the maxima when the two rings are approximately perpendicular arise from a combination of an electronic effect and intramolecular dispersion. Representing the ab initio conformational energy profiles as a cosine series alone is ineffective; however, combining a cos 2ξ term to represent the electronic barrier with an intramolecular atom-atom exp-6 term for all atom pairs separated by three or more bonds (1-4 interactions) provides a very effective representation. Thus we propose a new, physically motivated, generic analytical model of conformational energy, which could be combined with an intermolecular model to form more accurate force-fields for modelling the condensed phases of pharmaceutical-like organic molecules.
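The functional form described in this abstract, a cos 2ξ term plus an atom-atom exp-6 sum, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' parameterisation: the coefficient names `alpha`, `a`, `b`, `c` are hypothetical, a single exp-6 parameter set is used for every atom pair for brevity, and the caller is assumed to supply the pair distances that correspond to the given torsion angle.

```python
import math


def exp6(r, a, b, c):
    """Exp-6 (Buckingham-type) pair energy: repulsion a*exp(-b*r) minus dispersion c/r**6."""
    return a * math.exp(-b * r) - c / r**6


def conformational_energy(xi_deg, alpha, pair_distances, a, b, c):
    """Energy at torsion angle xi (degrees) as a cos(2*xi) term plus an exp-6 sum.

    Assumptions (not taken from the source): one (a, b, c) set for all pairs,
    and pair_distances already evaluated at this value of xi for the atom
    pairs separated by three or more bonds.
    """
    electronic = alpha * math.cos(2.0 * math.radians(xi_deg))
    nonbonded = sum(exp6(r, a, b, c) for r in pair_distances)
    return electronic + nonbonded


if __name__ == "__main__":
    # Placeholder numbers chosen only to exercise the function.
    print(conformational_energy(0.0, 5.0, [3.0, 3.5], 1000.0, 3.0, 50.0))
```

In practice the exp-6 coefficients would differ by atom type, and the distances would be recomputed from the molecular geometry at each ξ.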

UCL Discovery

Mass & secondary structure propensity of amino acids explain their mutability and evolutionary replacements

Author: Bohórquez Hugo J.
Patarroyo Manuel Elkin
Suárez Carlos F.
Publication date: 01/01/2017

Why is an amino acid replacement in a protein accepted during evolution? The answer given by bioinformatics relies on the frequency of change of each amino acid by another one and the propensity of each to remain unchanged. We propose that these replacement rules are recoverable from the secondary structural trends of amino acids. A distance measure between high-resolution Ramachandran distributions reveals that structurally similar residues coincide with those found in substitution matrices such as BLOSUM: Asn ↔ Asp, Phe ↔ Tyr, Lys ↔ Arg, Gln ↔ Glu, Ile ↔ Val, Met → Leu; with Ala, Cys, His, Gly, Ser, Pro, and Thr as structurally idiosyncratic residues. We also found a high average correlation ($\overline{R} = 0.85$) between thirty amino acid mutability scales and the mutational inertia ($I_X$), which measures the energetic cost weighted by the number of observations at the most probable amino acid conformation. These results indicate that amino acid substitutions follow two optimally-efficient principles: (a) amino acid interchangeability privileges secondary structural similarity, and (b) amino acid mutability depends directly on biosynthetic energy cost, and inversely on frequency. These two principles are the underlying rules governing the observed amino acid substitutions. © 2017 The Author(s)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Clustering System and Clustering Support Vector Machine for Local Protein Structure Prediction

Author: Zhong Wei
Publication venue: ScholarWorks @ Georgia State University
Publication date: 01/01/2006

Protein tertiary structure plays a very important role in determining a protein's possible functional sites and chemical interactions with other related proteins. Experimental methods to determine protein structure are time consuming and expensive. As a result, the gap between known protein sequences and known structures has widened substantially due to high-throughput sequencing techniques. These limitations of experimental methods motivate us to develop computational algorithms for protein structure prediction. In this work, a clustering system is used to predict local protein structure. First, recurring sequence clusters are explored with an improved K-means clustering algorithm. Carefully constructed sequence clusters are used to predict local protein structure. After obtaining the sequence clusters and motifs, we study how sequence variation within sequence clusters may influence their structural similarity. Analysis of the relationship between sequence variation and structural similarity shows that sequence clusters with tight sequence variation have high structural similarity and sequence clusters with wide sequence variation have poor structural similarity. Based on this knowledge, the established clustering system is used to predict the tertiary structure of local sequence segments. Test results indicate that the highest-quality clusters can give highly reliable prediction results and high-quality clusters can give reliable prediction results. In order to improve the performance of the clustering system for local protein structure prediction, a novel computational model called Clustering Support Vector Machines (CSVMs) is proposed. In our previous work, the sequence-to-structure relationship was explored with the conventional K-means algorithm. The K-means clustering algorithm may not capture the nonlinear sequence-to-structure relationship effectively.
As a result, we consider using the Support Vector Machine (SVM) to capture the nonlinear sequence-to-structure relationship. However, SVM is not favorable for huge datasets including millions of samples. Therefore, we propose a novel computational model called CSVMs. Taking advantage of both the theory of granular computing and advanced statistical learning methodology, CSVMs are built specifically for each information granule partitioned intelligently by the clustering algorithm. Compared with the clustering system introduced previously, our experimental results show that accuracy for local structure prediction improves noticeably when CSVMs are applied.

CiteSeerX

ScholarWorks @ Georgia State University

In Silico Investigation of Potential Src Kinase Ligands from Traditional Chinese Medicine

Author: A Aleshin
A Ganesan
A Gucalp
AC Wallace
BK Slinker
BR Brooks
C Oneyama
C-C Chang
Calvin Yu-Chian Chen
CI Herold
CY Chen
CYC Chen
D Gianni
DL Wheeler
EA Jamois
EL Mayer
FM Johnson
G Bianchi
G Noronha
G Scheiner-Bobis
HD Brooks
HJ Mackay
HM Kluger
HS Kim
IS Koh
J Liu
J Zhang
JE Cortes
JS Yoon
K Hasegawa
KC Chen
KW Chang
LF Hennequin
M Ferrandi
M Montes
ME Irwin
MF Sun
MG Fury
ML Johnson
MR Sharma
MT Khan
P Ferrari
P Manunta
PC Chang
R Roskoski
Ramón Campos-Olivas
RS Finn
S Thomas
SC Yang
SS Chang
SY Choi
T Miyake
TH Keller
TR Kowar
TT Chang
TY Tsai
W Xu
Weng Ieong Tou
Y Liu
Publication venue: Public Library of Science
Publication date: 01/01/2012

Src kinase is an attractive target for drug development based on its established relationship with cancer and possible link to hypertension. The suitability of traditional Chinese medicine (TCM) compounds as potential drug ligands for further biological evaluation was investigated using structure-based, ligand-based, and molecular dynamics (MD) analysis. Isopraeroside IV, 9alpha-hydroxyfraxinellone-9-O-beta-D-glucoside (9HFG) and aurantiamide were the top three TCM candidates identified from docking. Hydrogen bonds and hydrophobic interactions were the primary forces governing docking stability. Their stability with Src kinase under a dynamic state was further validated through MD and torsion angle analysis. Complexes formed by TCM candidates have lower total energy estimates than the control Saracatinib. Four quantitative structure-activity relationship (QSAR) in silico verifications consistently suggested that the TCM candidates have bioactive properties. Docking conformations of 9HFG and aurantiamide in the Src kinase ATP binding site suggest potential inhibitor-like characteristics, including competitive binding at the ATP binding site (Lys295) and stabilization of the catalytic cleft integrity. The TCM candidates have significantly lower ligand internal energies and are estimated to form more stable complexes with Src kinase than Saracatinib. Structure-based and ligand-based analyses support the drug-like potential of 9HFG and aurantiamide, and binding mechanisms reveal the tendency of these two candidates to compete for the ATP binding site.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Generative Tertiary Structure-based RNA Design

Author: Gao Zhangyang
Li Stan Z.
Tan Cheng
Publication date: 25/01/2023

Learning from 3D biological macromolecules with artificial intelligence technologies is an emerging area. Computational protein design, known as the inverse of protein structure prediction, aims to generate protein sequences that will fold into a defined structure. Analogous to protein design, RNA design is also an important topic in synthetic biology, which aims to generate RNA sequences for given structures. However, existing RNA design methods mainly focus on the secondary structure, ignoring the informative tertiary structure, which is commonly used in protein design. To explore the complex coupling between RNA sequence and 3D structure, we introduce an RNA tertiary structure modeling method to efficiently capture useful information from the 3D structure of RNA. For a fair comparison, we collect abundant RNA data and split the data according to tertiary structures. With the standard dataset, we conduct a benchmark by employing structure-based protein design approaches with our RNA tertiary structure modeling method. We believe our work will stimulate the future development of tertiary structure-based RNA design and bridge the gap between RNA 3D structures and sequences.

arXiv.org e-Print Archive

Learning to Evolve Structural Ensembles of Unfolded and Disordered Proteins Using Experimental Solution Data

Author: Forman-Kay Julie D
Haghighatlari Mojtaba
Head-Gordon Teresa
Li Jie
Liu Zi-Hao
Namini Ashley
Teixeira Joao Miguel Correia
Zhang Oufan
Publication date: 24/07/2022

We have developed a Generative Recurrent Neural Network (GRNN) that learns the probability of the next residue torsions $X_{i+1} = [\phi_{i+1}, \psi_{i+1}, \omega_{i+1}, \chi_{i+1}]$ from the previous residue in the sequence $X_i$ to generate new IDP conformations. In addition, we couple the GRNN with a Bayesian model, X-EISD, in a reinforcement learning step that biases the probability distributions of torsions to take advantage of experimental data types such as J-couplings, NOEs and PREs. We show that updating the generative model parameters according to the reward feedback, on the basis of the agreement between structures and data, improves upon existing approaches that simply reweight static structural pools for disordered proteins. Instead, the GRNN "DynamICE" model learns to physically change the conformations of the underlying pool to those that better agree with experiment.

arXiv.org e-Print Archive

Soliton concepts and the protein structure

Author: A. G. Murzin
Andrei Krokhotin
Antti J. Niemi
C. Levinthal
L. D. Faddeev
P. G. Kevrekidis
Xubiao Peng
Publication venue: American Physical Society (APS)
Publication date: 18/09/2011

Structural classification shows that the number of different protein folds is surprisingly small. It also appears that proteins are built in a modular fashion, from a relatively small number of components. Here we propose to identify the modular building blocks of proteins with the dark soliton solution of a generalized discrete nonlinear Schrödinger equation. For this we show that practically all protein loops can be obtained simply by scaling the size and by joining together a number of copies of the soliton, one after another. The soliton has only two loop-specific parameters, and we identify their possible values in the Protein Data Bank. We show that with a collection of 200 sets of parameters, each determining a soliton profile that describes a different short loop, we cover over 90% of all proteins with experimental accuracy. We also present two examples that describe how the loop library can be employed both to model and to analyze the structure of folded proteins.

arXiv.org e-Print Archive

Crossref