Towards Reliable Automatic Protein Structure Alignment

A. Caprara; A. Zemla; A.G. Murzin; A.S. Konagurthu; C.A. Rohl; C.B. Do; G. Lancia; H.M. Berman; I.N. Shindyalov; J. Shi; J. Xu; J.F. Gibrat; K. Mizuguchi; L. Kinch; L. Xie; M. Comin; M. Levitt; M. Moakher; M. Sadowski; N.M. Daniels; N.N. Alexandrov; S. Henikoff; S. Subbiah; S.B. Needleman; S.B. Pandit; S.R. Eddy; W. Pirovano; Y. Yang; Y. Ye; Y. Zhang; Y. Zhang; Y. Zhang

Publication date: 1 January 2013

Abstract

A variety of methods have been proposed for computing structure similarity, a task known as structure alignment or superposition. One major shortcoming of current structure alignment algorithms lies in their inherent design, which is based on local structure similarity. In this work, we propose a method that incorporates global information in obtaining optimal alignments and superpositions. Our method, when applied to optimizing the TM-score and the GDT score, produces significantly better results than current state-of-the-art protein structure alignment tools. Specifically, if the highest TM-score found by TMalign is lower than 0.6 and the highest TM-score found by one of the tested methods is higher than 0.5, there is a probability of 42% that TMalign failed to find TM-scores higher than 0.5, while the same probability is reduced to 2% if our method is used. This could significantly improve the accuracy of fold detection if the cutoff TM-score of 0.5 is used. In addition, existing structure alignment algorithms focus on structure similarity alone and ignore other important similarities, such as sequence similarity. Our approach has the capacity to incorporate multiple similarities into the scoring function. Results show that sequence similarity aids in finding high-quality protein structure alignments that are more consistent with eye-examined alignments in HOMSTRAD. Even when structure similarity alone fails to find alignments with any consistency with eye-examined alignments, our method remains capable of finding alignments highly similar to, or even identical to, eye-examined alignments.

Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)
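For readers unfamiliar with the two scores named in the abstract, the following is a minimal sketch of how the TM-score and a GDT-style score are commonly evaluated for one fixed superposition, given the distances between aligned residue pairs. It is not the search method proposed in the paper, which the abstract does not describe. The function names (`tm_d0`, `tm_score`, `gdt_ts`) are hypothetical, the 0.5 Å floor on `d0` for short chains is an assumption mirroring common implementations, and the 1, 2, 4 and 8 Å cutoffs are the usual GDT_TS convention rather than something stated in this record.

```python
def tm_d0(l_target: int) -> float:
    """Length-dependent distance scale d0 (angstroms) used by the TM-score."""
    # Assumption: floor d0 at 0.5 for short chains, where the cube-root
    # formula would be undefined or negative.
    if l_target <= 15:
        return 0.5
    return max(0.5, 1.24 * (l_target - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, l_target: int) -> float:
    """TM-score of one fixed superposition.

    distances: distances (angstroms) between aligned residue pairs.
    l_target:  length of the protein used for normalization.
    """
    d0 = tm_d0(l_target)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_target


def gdt_ts(distances, l_target: int, cutoffs=(1.0, 2.0, 4.0, 8.0)) -> float:
    """GDT_TS-style score of one fixed superposition, as a fraction in [0, 1]."""
    fractions = [
        sum(1 for d in distances if d <= cutoff) / l_target
        for cutoff in cutoffs
    ]
    return sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Hypothetical distances for a 4-residue toy alignment.
    toy = [0.5, 1.5, 3.0, 9.0]
    print("TM-score:", tm_score(toy, l_target=4))
    print("GDT_TS:  ", gdt_ts(toy, l_target=4))
```

Both scores are defined as a maximum over all superpositions and alignments, so a function like this only evaluates one candidate. Alignment tools differ in how they search for the candidate that maximizes it, which is where the abstract reports TMalign falling short of the 0.5 fold-detection cutoff.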

Crossref

info:doi/10.1007%2F978-3-642-4...