Search CORE

25,980 research outputs found

PyMod: sequence similarity searches, multiple sequence-structure alignments, and homology modeling within PyMOL

Author: Bossa Francesco
Bramucci Emanuele
Paiardini Alessandro
Pascarella Stefano
Publication venue: BioMed Central
Publication date: 01/01/2012
Field of study

Background: In recent years, an exponential growing number of tools for protein sequence analysis, editing and modeling tasks have been put at the disposal of the scientific community. Despite the vast majority of these tools have been released as open source software, their deep learning curves often discourages even the most experienced users. Results: A simple and intuitive interface, PyMod, between the popular molecular graphics system PyMOL and several other tools (i.e., [PSI-] BLAST, ClustalW, MUSCLE, CEalign and MODELLER) has been developed, to show how the integration of the individual steps required for homology modeling and sequence/structure analysis within the PyMOL framework can hugely simplify these tasks. Sequence similarity searches, multiple sequence and structural alignments generation and editing, and even the possibility to merge sequence and structure alignments have been implemented in PyMod, with the aim of creating a simple, yet powerful tool for sequence and structure analysis and building of homology models. Conclusions: PyMod represents a new tool for the analysis and the manipulation of protein sequences and structures. The ease of use, integration with many sequence retrieving and alignment tools and PyMOL, one of the most used molecular visualization system, are the key features of this tool. Source code, installation instructions, video tutorials and a user's guide are freely available at the URL http://schubert.bio.uniroma1.it/pymod/index.htm

Springer - Publisher Connector

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

Algorithm engineering for optimal alignment of protein structure distance matrices

Author: A. Andreeva
A. Caprara
A. Marin
A. Schrijver
C. Berbalk
D. Wu
D.A. Pelta
E. Althaus
G. Mayr
Gunnar W. Klau
H. Hasegawa
H.P. Lenhof
I. Wohlers
Inken Wohlers
L. Holm
N. Malod-Dognin
P. Di Lena
R. Andonov
R. Kolodny
R.H. Lathrop
Rumen Andonov
T. Havel
T. Kawabata
W. Xie
W.R. Taylor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011
Field of study

Protein structural alignment is an important problem in computational biology. In this paper, we present first successes on provably optimal pairwise alignment of protein inter-residue distance matrices, using the popular Dali scoring function. We introduce the structural alignment problem formally, which enables us to express a variety of scoring functions used in previous work as special cases in a unified framework. Further, we propose the first mathematical model for computing optimal structural alignments based on dense inter-residue distance matrices. We therefore reformulate the problem as a special graph problem and give a tight integer linear programming model. We then present algorithm engineering techniques to handle the huge integer linear programs of real-life distance matrix alignment problems. Applying these techniques, we can compute provably optimal Dali alignments for the very first time

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

Crossref

CWI's Institutional Repository

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Protein sectors: statistical coupling analysis versus conservation

Author: Colwell Lucy J.
Leibler Stanislas
Tesileanu Tiberiu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 16/12/2014
Field of study

Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed "sectors". The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation.Comment: 36 pages, 17 figure

arXiv.org e-Print Archive

Public Library of Science (PLOS)

City University of New York

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Who Watches the Watchmen? An Appraisal of Benchmarks for Multiple Sequence Alignment

Author: A Löytynoja
A Löytynoja
B Sipos
BG Hall
BG Hall
BP Blackburne
C Chothia
C Dessimoz
C Kemena
C Kemena
C Notredame
CB Do
CL Strope
DA Dalquen
DA Morrison
DH Mathews
ER Mardis
G Blackshields
G Jordan
G Landan
GP Raghava
I Walle Van
J Kim
J Stoye
JD Thompson
JD Thompson
JD Thompson
JD Thompson
JD Thompson
JD Thompson
JH Havgaard
JP Huelsenbeck
K Mizuguchi
LA Stebbings
M Anisimova
M Pop
MR Aniba
P Gardner
RA Cartwright
RB Russell
RC Edgar
RC Edgar
SA Berger
SF Altschul
T Golubchik
T Koestler
T Lassmann
T Lassmann
T Lassmann
W Fletcher
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/11/2012
Field of study

Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique in bioinformatics used to infer related residues among biological sequences. Thus alignment accuracy is crucial to a vast range of analyses, often in ways difficult to assess in those analyses. To compare the performance of different aligners and help detect systematic errors in alignments, a number of benchmarking strategies have been pursued. Here we present an overview of the main strategies--based on simulation, consistency, protein structure, and phylogeny--and discuss their different advantages and associated risks. We outline a set of desirable characteristics for effective benchmarking, and evaluate each strategy in light of them. We conclude that there is currently no universally applicable means of benchmarking MSA, and that developers and users of alignment tools should base their choice of benchmark depending on the context of application--with a keen awareness of the assumptions underlying each benchmarking strategy.Comment: Revie

arXiv.org e-Print Archive

Crossref

UCL Discovery

Convergent dynamics in the protease enzymatic superfamily

Author: Carloni Paolo
Carnevale Vincenzo
Micheletti Cristian
Raugei Simone
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2006
Field of study

Proteases regulate various aspects of the life cycle in all organisms by cleaving specific peptide bonds. Their action is so central for biochemical processes that at least 2% of any known genome encodes for proteolytic enzymes. Here we show that selected proteases pairs, despite differences in oligomeric state, catalytic residues and fold, share a common structural organization of functionally relevant regions which are further shown to undergo similar concerted movements. The structural and dynamical similarities found pervasively across evolutionarily distant clans point to common mechanisms for peptide hydrolysis.Comment: 13 pages, 6 figure

arXiv.org e-Print Archive

Sissa Digital Library

Towards Reliable Automatic Protein Structure Alignment

Author: A. Caprara
A. Zemla
A.G. Murzin
A.S. Konagurthu
C.A. Rohl
C.B. Do
G. Lancia
H.M. Berman
I.N. Shindyalov
J. Shi
J. Xu
J.F. Gibrat
K. Mizuguchi
L. Kinch
L. Xie
M. Comin
M. Levitt
M. Moakher
M. Sadowski
N.M. Daniels
N.N. Alexandrov
S. Henikoff
S. Subbiah
S.B. Needleman
S.B. Pandit
S.R. Eddy
W. Pirovano
Y. Yang
Y. Ye
Y. Zhang
Y. Zhang
Y. Zhang
Publication venue
Publication date: 01/01/2013
Field of study

A variety of methods have been proposed for structure similarity calculation, which are called structure alignment or superposition. One major shortcoming in current structure alignment algorithms is in their inherent design, which is based on local structure similarity. In this work, we propose a method to incorporate global information in obtaining optimal alignments and superpositions. Our method, when applied to optimizing the TM-score and the GDT score, produces significantly better results than current state-of-the-art protein structure alignment tools. Specifically, if the highest TM-score found by TMalign is lower than (0.6) and the highest TM-score found by one of the tested methods is higher than (0.5), there is a probability of (42%) that TMalign failed to find TM-scores higher than (0.5), while the same probability is reduced to (2%) if our method is used. This could significantly improve the accuracy of fold detection if the cutoff TM-score of (0.5) is used. In addition, existing structure alignment algorithms focus on structure similarity alone and simply ignore other important similarities, such as sequence similarity. Our approach has the capacity to incorporate multiple similarities into the scoring function. Results show that sequence similarity aids in finding high quality protein structure alignments that are more consistent with eye-examined alignments in HOMSTRAD. Even when structure similarity itself fails to find alignments with any consistency with eye-examined alignments, our method remains capable of finding alignments highly similar to, or even identical to, eye-examined alignments.Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013

arXiv.org e-Print Archive

Crossref