
    From Sequence to Structure And Back Again: An Alignment Tale

    Heringa, J. [Promotor

    Integrating protein secondary structure prediction and multiple sequence alignment

    Modern protein secondary structure prediction methods are based on exploiting evolutionary information contained in multiple sequence alignments. Critical steps in the secondary structure prediction process are (i) the selection of a set of sequences that are homologous to a given query sequence, (ii) the choice of the multiple sequence alignment method, and (iii) the choice of the secondary structure prediction method. Because of the close relationship between these three steps and their critical influence on the prediction results, secondary structure prediction has received increased attention from the bioinformatics community over the last few years. In this treatise, we discuss recent developments in computational methods for protein secondary structure prediction and multiple sequence alignment, focus on the integration of these methods, and provide some recommendations for state-of-the-art secondary structure prediction in practice. © 2004 Bentham Science Publishers Ltd

    The influence of gapped positions in multiple sequence alignments on secondary structure prediction.

    All currently leading protein secondary structure prediction methods use a multiple protein sequence alignment to predict the secondary structure of the top sequence. In most of these methods, alignment positions showing a gap in the top sequence are deleted prior to prediction, which shrinks the alignment and loses position-specific information. In this paper we investigate the effect of this removal of information on secondary structure prediction accuracy. To this end, we have designed SymSSP, an algorithm that post-processes the predicted secondary structure of all sequences in a multiple sequence alignment by (i) making use of the alignment's evolutionary information and (ii) re-introducing most of the information that would otherwise be lost. The post-processed information is then given to a new dynamic programming routine that produces an optimally segmented consensus secondary structure for each of the multiple alignment sequences. We have tested our method on the state-of-the-art secondary structure prediction methods PHD, PROFsec, SSPro2 and JNET using the HOMSTRAD database of reference alignments. Our consensus-deriving dynamic programming strategy is consistently better at improving the segmentation quality of the predictions than the commonly used majority voting technique. In addition, we have applied several weighting schemes from the literature to our novel consensus-deriving dynamic programming routine. Finally, we have investigated the level of noise introduced by prediction errors into the consensus and show that predictions of the edges of helices and strands are wrong half of the time for all four tested prediction methods. © 2004 Elsevier Ltd. All rights reserved
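As a rough illustration of the two baseline operations the abstract above refers to, the sketch below deletes alignment columns that have a gap in the top sequence and derives a consensus by column-wise majority voting. It is not SymSSP and not its dynamic programming routine. The function names, the three-state alphabet `HEC`, the `-` gap character, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter

GAP = "-"  # assumed gap character


def drop_query_gap_columns(alignment):
    """Remove every column in which the top (first) sequence has a gap."""
    query = alignment[0]
    keep = [i for i, ch in enumerate(query) if ch != GAP]
    return ["".join(row[i] for i in keep) for row in alignment]


def majority_vote(aligned_predictions, states="HEC"):
    """Column-wise majority vote over aligned secondary structure strings.

    Only characters listed in `states` vote. A column with no votes yields
    a gap. Ties go to the state listed first in `states`, which is an
    arbitrary choice made for this sketch.
    """
    consensus = []
    for column in zip(*aligned_predictions):
        counts = Counter(ch for ch in column if ch in states)
        if not counts:
            consensus.append(GAP)
            continue
        # max() returns the first maximal item, so earlier states win ties.
        consensus.append(max(states, key=lambda s: counts[s]))
    return "".join(consensus)


if __name__ == "__main__":
    alignment = ["MK-LV", "MKALV", "MR-LI"]
    print(drop_query_gap_columns(alignment))

    predictions = ["HH-EE", "HHHEC", "HC-EE"]
    print(majority_vote(predictions))
```

In the toy alignment, the third column is dropped because the top sequence has a gap there, which is exactly the position-specific information the abstract describes as lost.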

    CONTRAlign: discriminative training for protein sequence alignment

    In comparative structural biology studies, analyzing or predicting protein three-dimensional structure often begins with identifying patterns of amino acid substitution via protein sequence alignment. While the evolutionary information obtained from alignments can provide insights into protein structure, constructing accurate alignments may be difficult when proteins share significant structural similarity but little sequence similarity. Indeed, for modern alignment tools, alignment quality drops rapidly when the sequences compared have lower than 25% identity, the "twilight zone" of protein alignment [1].
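To make the 25% identity figure concrete, here is a minimal sketch of one common way to compute pairwise percent identity from two aligned sequences. The denominator is a convention, and conventions differ: this example counts only columns where neither sequence has a gap, which is an assumption and not something the abstract specifies. The function name and the `-` gap character are likewise hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    pid = percent_identity("MKT-LV", "MRTALI")
    print(pid)          # 3 matches over 5 ungapped columns
    print(pid < 25.0)   # True would place the pair below the 25% mark
```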

    Practical Multiple Sequence Alignment

    Multiple sequence alignment as a means of comparing DNA, RNA or amino acid sequences is an essential precondition for various analyses, including structure prediction, modeling binding sites, phylogeny or function prediction. This range of applications implies a demand for versatile, flexible and specialized methods to compute accurate alignments. This chapter summarizes the key algorithmic insights gained in recent years, both to facilitate an easy understanding of the current multiple sequence alignment literature and to enable readers to use and apply current tools in their own everyday research

    Protein Multiple Sequence Alignment
