Results of a comparison of the new SVM-based method with the sequence-based prediction method based on the ‘specificity-conferring code’ by Stachelhaus

Abstract

<p><b>Copyright information:</b></p><p>Taken from "Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs)"</p><p>Nucleic Acids Research 2005;33(18):5799-5808.</p><p>Published online 12 Oct 2005</p><p>PMCID:PMC1253831.</p><p>© The Author 2005. Published by Oxford University Press. All rights reserved</p>

The sequence-based method follows Stachelhaus et al. and Challis et al.; for simplicity it is referred to here as the ‘Stachelhaus method’. Of the 1230 adenylation domains (automatically extracted from the June 2005 version of UniProt), 70% (858) obtained consistent predictions from both predictors (white sectors). For most of these consistent predictions (666, or 54% of the total) the Stachelhaus prediction was based on an exact match with a known ‘specificity-conferring code’; the others had at least a 70% match. For 29 sequences (2.4%) neither predictor could assign a specificity (no match ≥70%, diagonal hatches). A further 217 sequences (18%) could be classified only by the SVMs and not by the Stachelhaus method (light gray sector), and 18 A domains (1.5%) could be classified by the Stachelhaus method but not by the SVMs (cross-hatched); two of these have rare specificities. The Stachelhaus predictions for the rest are mainly based on 70% matches to known specificity ‘codes’. For 108 sequences (8.8%) the predictions were inconsistent, but 38 of them (3% of the total, gray sector) had matches to rare amino acids that were not used for training the SVMs. The remaining 70 incompatible predictions were mainly based on ≤80% identity matches with known ‘specificity-conferring codes’ (black sector).
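The sector sizes quoted in the caption can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the counts are taken from the caption, while the dictionary labels and the helper name `percent` are illustrative choices that do not appear in the original article.

```python
# Counts per chart sector as stated in the caption.
# Labels are illustrative shorthand, not the authors' terminology.
TOTAL = 1230

sectors = {
    "consistent (both predictors)": 858,
    "no prediction by either": 29,
    "SVM only": 217,
    "Stachelhaus only": 18,
    "inconsistent": 108,
}

# Sub-counts mentioned inside two of the sectors.
exact_code_match = 666      # subset of the consistent predictions
rare_amino_acids = 38       # subset of the inconsistent predictions
remaining_inconsistent = sectors["inconsistent"] - rare_amino_acids


def percent(count, total=TOTAL):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100.0 * count / total, 1)


# The five top-level sectors should account for every domain.
assert sum(sectors.values()) == TOTAL

for label, count in sectors.items():
    print(f"{label}: {count} ({percent(count)}%)")
print(f"exact code match: {exact_code_match} ({percent(exact_code_match)}%)")
print(f"remaining inconsistent: {remaining_inconsistent}")
```

Under these assumptions the five top-level sectors sum to the 1230 domains, and the computed shares agree with the caption's percentages after rounding.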