168 research outputs found

    SiteSeek: Post-translational modification analysis using adaptive locality-effective kernel methods and new profiles

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Post-translational modifications have a substantial influence on the structure and functions of protein. Post-translational phosphorylation is one of the most common modification that occur in intracellular proteins. Accurate prediction of protein phosphorylation sites is of great importance for the understanding of diverse cellular signalling processes in both the human body and in animals. In this study, we propose a new machine learning based protein phosphorylation site predictor, SiteSeek. SiteSeek is trained using a novel compact evolutionary and hydrophobicity profile to detect possible protein phosphorylation sites for a target sequence. The newly proposed method proves to be more accurate and exhibits a much stable predictive performance than currently existing phosphorylation site predictors.</p> <p>Results</p> <p>The performance of the proposed model was compared to nine existing different machine learning models and four widely known phosphorylation site predictors with the newly proposed PS-Benchmark_1 dataset to contrast their accuracy, sensitivity, specificity and correlation coefficient. SiteSeek showed better predictive performance with 86.6% accuracy, 83.8% sensitivity, 92.5% specificity and 0.77 correlation-coefficient on the four main kinase families (CDK, CK2, PKA, and PKC).</p> <p>Conclusion</p> <p>Our newly proposed methods used in SiteSeek were shown to be useful for the identification of protein phosphorylation sites as it performed much better than widely known predictors on the newly built PS-Benchmark_1 dataset.</p

    PAC learning using Nadaraya-Watson estimator based on orthonormal systems

    Get PDF
    Regression or function classes of Euclidean type with compact support and certain smoothness properties are shown to be PAC learnable by the Nadaraya-Watson estimator based on complete orthonormal systems. While requiring more smoothness properties than typical PAC formulations, this estimator is computationally efficient, easy to implement, and known to perform well in a number of practical applications. The sample sizes necessary for PAC learning of regressions or functions under sup norm cost are derived for a general orthonormal system. The result covers the widely used estimators based on Haar wavelets, trignometric functions, and Daubechies wavelets

    Estimation of Non-sharp Support Boundaries

    No full text
    Let X1, ..., Xn be independent identically distributed observations from an unknown probability density f(·), such that its support G = supp f is a subset of the unit square in 2. We consider the problem of estimating G from the sample X1, ..., Xn, under the assumption that the boundary of G is a function of smoothness [gamma] and that the values of density f decrease to 0 as the power [alpha] of the distance from the boundary. We show that a certain piecewise-polynomial estimator of G has optimal rate of convergence (namely, the rate n-[gamma]/(([alpha] + 1)[gamma] + 1)) within this class of densities.

    Book reviews

    No full text
    • …
    corecore