Search CORE

9 research outputs found

Maximum Cliques in Protein Structure Comparison

Author: A. Andreeva
A. Caprara
A. Godzik
D. Strickland
E. Tomita
I. Lerman
J. Konc
J. Martin
J.F. Gibrat
M. Fredman
M. Sierk
P.R.J. Östergård
R. Andonov
R. Karp
Publication venue
Publication date: 01/01/2009
Field of study

Computing the similarity between two protein structures is a crucial task in molecular biology, and has been extensively investigated. Many protein structure comparison methods can be modeled as maximum clique problems in specific k-partite graphs, referred here as alignment graphs. In this paper, we propose a new protein structure comparison method based on internal distances (DAST) which is posed as a maximum clique problem in an alignment graph. We also design an algorithm (ACF) for solving such maximum clique problems. ACF is first applied in the context of VAST, a software largely used in the National Center for Biotechnology Information, and then in the context of DAST. The obtained results on real protein alignment instances show that our algorithm is more than 37000 times faster than the original VAST clique solver which is based on Bron & Kerbosch algorithm. We furthermore compare ACF with one of the fastest clique finder, recently conceived by Ostergard. On a popular benchmark (the Skolnick set) we observe that ACF is about 20 times faster in average than the Ostergard's algorithm

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Towards Reliable Automatic Protein Structure Alignment

Author: A. Caprara
A. Zemla
A.G. Murzin
A.S. Konagurthu
C.A. Rohl
C.B. Do
G. Lancia
H.M. Berman
I.N. Shindyalov
J. Shi
J. Xu
J.F. Gibrat
K. Mizuguchi
L. Kinch
L. Xie
M. Comin
M. Levitt
M. Moakher
M. Sadowski
N.M. Daniels
N.N. Alexandrov
S. Henikoff
S. Subbiah
S.B. Needleman
S.B. Pandit
S.R. Eddy
W. Pirovano
Y. Yang
Y. Ye
Y. Zhang
Y. Zhang
Y. Zhang
Publication venue
Publication date: 01/01/2013
Field of study

A variety of methods have been proposed for structure similarity calculation, which are called structure alignment or superposition. One major shortcoming in current structure alignment algorithms is in their inherent design, which is based on local structure similarity. In this work, we propose a method to incorporate global information in obtaining optimal alignments and superpositions. Our method, when applied to optimizing the TM-score and the GDT score, produces significantly better results than current state-of-the-art protein structure alignment tools. Specifically, if the highest TM-score found by TMalign is lower than (0.6) and the highest TM-score found by one of the tested methods is higher than (0.5), there is a probability of (42%) that TMalign failed to find TM-scores higher than (0.5), while the same probability is reduced to (2%) if our method is used. This could significantly improve the accuracy of fold detection if the cutoff TM-score of (0.5) is used. In addition, existing structure alignment algorithms focus on structure similarity alone and simply ignore other important similarities, such as sequence similarity. Our approach has the capacity to incorporate multiple similarities into the scoring function. Results show that sequence similarity aids in finding high quality protein structure alignments that are more consistent with eye-examined alignments in HOMSTRAD. Even when structure similarity itself fails to find alignments with any consistency with eye-examined alignments, our method remains capable of finding alignments highly similar to, or even identical to, eye-examined alignments.Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013

arXiv.org e-Print Archive

Crossref

Capturing Long-Term Dependencies for Protein Secondary Structure Prediction

Author: A.A. Salamov
B. Rost
J.F. Gibrat
S. Hua
S.K. Riis
T.M. Yi
V. Francesco Di
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004
Field of study

Abstract. Bidirectional recurrent neural network(BRNN) is a noncausal system that captures both upstream and downstream information for protein secondary structure prediction. Due to the problem of vanishing gradients, the BRNN can not learn remote information efficiently. To limit this problem, we propose segmented memory recurrent neural network(SMRNN) and use SMRNNs to replace the standard RNNs in BRNN. The resulting architecture is called bidirectional segmented-memory recurrent neural network(BSMRNN). Our experiment with BSMRNN for protein secondary structure prediction on the RS126 set indicates improvement in the prediction accuracy.

CiteSeerX

Crossref