    How reliably can we predict the reliability of protein structure predictions?

    Background: Comparative methods have been the standard techniques for in silico protein structure prediction. The prediction is based on a multiple alignment that contains both reference sequences with known structures and the sequence whose unknown structure is predicted. Intensive research has gone into improving the quality of multiple alignments, since misaligned parts of the multiple alignment yield misleading predictions. However, sometimes all methods fail to predict the correct alignment, because the evolutionary signal is too weak to find the homologous parts due to the large number of mutations that separate the sequences. Results: Stochastic sequence alignment methods define a posterior distribution of possible multiple alignments. They can highlight the most likely alignment and, beyond that, they can give posterior probabilities for each alignment column. We carried out a comprehensive study on the HOMSTRAD database of structural alignments, predicting secondary structures in four different ways. We showed that alignment posterior probabilities correlate with the reliability of secondary structure predictions, though the strength of the correlation differs between protocols. The correspondence between the reliability of secondary structure predictions and alignment posterior probabilities is closest to the identity function when the secondary structure posterior probabilities are calculated from the posterior distribution of multiple alignments. The largest deviation from the identity function was obtained when predicting secondary structures from a single optimal pairwise alignment. We also showed that alignment posterior probabilities correlate with the 3D distances between the Cα atoms of aligned amino acids in superimposed tertiary structures. Conclusion: Alignment posterior probabilities can be used to detect errors in comparative models a priori, at the sequence alignment level.

    Posterior probabilities of correctly predicting secondary structure types with stochastic pairwise alignment methods as a function of alignment posterior probabilities

    The black diagonal shows the identity function. The statistics have been generated on 12 families from the HOMSTRAD database; see Table 1.
    Copyright information: Taken from "How reliably can we predict the reliability of protein structure predictions?", http://www.biomedcentral.com/1471-2105/9/137. BMC Bioinformatics 2008;9:137. Published online 3 Mar 2008. PMCID: PMC2324098.
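    A curve of this kind can be sketched by binning alignment columns by their posterior probability and comparing the observed fraction of correct predictions in each bin with the identity function. The snippet below is a minimal illustration, not the authors' code: the input layout (pairs of posterior and correctness flag), the bin count, and the function names `reliability_curve` and `mean_deviation_from_identity` are all assumptions made for the example.

```python
def reliability_curve(columns, n_bins=10):
    """Bin (alignment_posterior, prediction_correct) pairs by posterior.

    Returns (mean_posterior, observed_accuracy, count) for each non-empty bin.
    The input layout is hypothetical, chosen only for this sketch.
    """
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for posterior, correct in columns:
        # A posterior of exactly 1.0 is placed in the last bin.
        b = min(int(posterior * n_bins), n_bins - 1)
        sums[b] += posterior
        hits[b] += bool(correct)
        counts[b] += 1
    return [
        (sums[b] / counts[b], hits[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b]
    ]


def mean_deviation_from_identity(curve):
    """Count-weighted mean of |observed accuracy - mean posterior|."""
    total = sum(count for _, _, count in curve)
    return sum(abs(acc - p) * count for p, acc, count in curve) / total
```

    A deviation near zero means the curve lies close to the diagonal, so the posterior probability could be read directly as the chance that a prediction is correct.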

    3D distances between the aligned amino acids as a function of multiple alignment posterior probabilities

    The 3D distances were calculated from the HOMSTRAD pdb files containing the superimposed structures of sequence families. Multiple alignments are MPD estimations, based on MCMC samples, for the 12 selected families described in Table 1. Boxes show the average distances, lines show the range between the low and high quartiles.
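    The box-and-line summary described in this caption (average distance plus the range between the lower and upper quartiles, per posterior-probability bin) could be computed along the following lines. This is a sketch under stated assumptions: the input is taken to be a list of (posterior, distance) pairs, the bin count and the name `distance_summary_by_bin` are hypothetical, and the quartile convention (`method="inclusive"`) is one choice among several, not necessarily the one used for the figure.

```python
import statistics


def distance_summary_by_bin(pairs, n_bins=5):
    """Summarise distances per posterior bin as mean and quartile range.

    pairs: iterable of (alignment_posterior, distance) tuples; the distance
    is in whatever unit the coordinate files use (hypothetical input layout).
    """
    bins = [[] for _ in range(n_bins)]
    for posterior, distance in pairs:
        # A posterior of exactly 1.0 is placed in the last bin.
        b = min(int(posterior * n_bins), n_bins - 1)
        bins[b].append(distance)

    summary = {}
    for b, distances in enumerate(bins):
        if len(distances) < 2:
            continue  # skip bins too small for a quartile estimate
        q1, _median, q3 = statistics.quantiles(
            distances, n=4, method="inclusive"
        )
        summary[b] = {"mean": statistics.fmean(distances), "q1": q1, "q3": q3}
    return summary
```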

    Sensitivity of secondary structure predictions as a function of alignment posterior probabilities

    Sensitivity is defined as TP/(TP + FN), where TP stands for the true positive estimations and FN stands for the false negative estimations. The Viterbi alignments were obtained for all possible homologous pairs in the HOMSTRAD database; the MPD alignments were estimated for the 12 selected families in Table 1.
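    As a worked illustration of the sensitivity definition, the sketch below computes TP/(TP + FN) for one secondary structure class within each posterior-probability bin. The record layout (posterior, true label, predicted label), the class labels, and the function names are assumptions made for the example, not details taken from the paper.

```python
import math


def sensitivity(tp, fn):
    """Sensitivity = TP / (TP + FN); NaN when there are no actual positives."""
    total = tp + fn
    return tp / total if total else math.nan


def sensitivity_by_bin(records, cls, n_bins=10):
    """Per-bin sensitivity for one secondary structure class.

    records: iterable of (alignment_posterior, true_label, predicted_label);
    this layout is hypothetical and chosen only for this sketch.
    """
    tp = [0] * n_bins
    fn = [0] * n_bins
    for posterior, true_label, predicted_label in records:
        if true_label != cls:
            continue  # only actual positives count toward TP or FN
        # A posterior of exactly 1.0 is placed in the last bin.
        b = min(int(posterior * n_bins), n_bins - 1)
        if predicted_label == cls:
            tp[b] += 1
        else:
            fn[b] += 1
    return [sensitivity(t, f) for t, f in zip(tp, fn)]
```

    For example, three true positives and one false negative give a sensitivity of 3/(3 + 1) = 0.75.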

    3D distances between the aligned amino acids as a function of pairwise alignment posterior probabilities

    The 3D distances were calculated from the HOMSTRAD pdb files containing the superimposed structures of sequence families. Pairwise alignments were obtained by the Viterbi algorithm on the entire HOMSTRAD database (black) as well as on the 12 selected families described in Table 1 (light green). Boxes show the average distances, lines show the range between the low and high quartiles.

    Maximum Posterior Decoding estimations for the multiple sequence alignment of the subtilase family in the HOMSTRAD database

    The two estimations are based on samples from two Markov chains with different starting points. The similarity between the two independent estimations shows good convergence and mixing of the Markov chain.

    Number of true positive and false positive predictions as a function of the alignment posterior probabilities
