36 research outputs found

    Extraction process of the 188-dimensional (188D) feature vectors (FV).

    Sequences are input and processed by analyzing amino acid composition, distribution, and protein physicochemical properties; FV1–FV188 are output as the feature vector.
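The caption does not spell out how each of the 188 dimensions is computed, so the following is only a minimal sketch of the first kind of feature it names, amino acid composition, plus one composition feature over a physicochemical grouping. The three-way hydrophobicity grouping and all function names are assumptions made for illustration, not the authors' definitions.

```python
from collections import Counter

# The 20 standard amino acids, in alphabetical order of their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical three-way hydrophobicity grouping, used only for illustration.
HYDROPHOBICITY_GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}


def amino_acid_composition(sequence):
    """Return the frequency of each of the 20 standard residues."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def group_composition(sequence, groups=HYDROPHOBICITY_GROUPS):
    """Return the fraction of residues falling into each property group."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(groups)
    return [sum(counts[aa] for aa in members) / total
            for members in groups.values()]


def partial_feature_vector(sequence):
    """Concatenate 20 composition values and 3 group-composition values."""
    return amino_acid_composition(sequence) + group_composition(sequence)


features = partial_feature_vector("ACDA")
```

Here `features` holds 23 values rather than 188; the remaining dimensions would come from the distribution and other physicochemical descriptors that the caption mentions but does not define.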

    Influential factors for success rate of 1st and 2nd hierarchical layers.


    Success rates achieved by three classifiers at different sequence identity thresholds.

    The two graphs show results on two datasets: (a) SCOP version 1.75 and (b) SCOP version 1.75A. The similar success rates demonstrate the robustness of our model. As the sequence identity threshold increases, the redundancy filter becomes less stringent and the success rate rises. The graphs also show that our ensemble classifier outperforms the other two classifiers.

    Hierarchical Classification of Protein Folds Using a Novel Ensemble Classifier

    The analysis of biological information from protein sequences is important for the study of cellular functions and interactions, and protein fold recognition plays a key role in the prediction of protein structures. Unfortunately, predicting protein fold patterns is challenging because of the existence of compound protein structures. Here, we processed the latest release of the Structural Classification of Proteins (SCOP, version 1.75) database and applied novel techniques to increase the accuracy of protein fold classification. The proposed techniques combine ensemble classification with a hierarchical framework. In the first layer, similar or redundant sequences are deleted in two manners, and a set of base classifiers, fused by various selection strategies, divides the input into seven classes. In the second layer, an analogous ensemble method is adopted to predict all protein folds. To our knowledge, this is the first time all protein folds have been detected hierarchically. Our experimental results demonstrate the efficiency and effectiveness of the proposed method, which achieved a success rate of 74.21%, much higher than the results obtained with previous methods (ranging from 45.6% to 70.5%). When applied to the second layer of classification, the prediction accuracy ranged between 23.13% and 46.05%. Although this value is not remarkably high, it is encouraging compared with the relatively low counts of proteins from most fold recognition programs. The web server Hierarchical Protein Fold Prediction (HPFP) is available at http://datamining.xmu.edu.cn/software/hpfp.
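The two-layer flow described in the abstract can be sketched as follows. This is an illustrative outline only: it assumes the second layer holds one fold predictor per structural class, which the abstract does not state explicitly, and the toy classifiers, thresholds, and fold labels are hypothetical placeholders rather than the HPFP models.

```python
def hierarchical_predict(features, class_classifier, fold_classifiers):
    """Predict a structural class first, then a fold within that class."""
    structural_class = class_classifier(features)
    # Assumption: each class has its own second-layer fold predictor.
    fold_classifier = fold_classifiers[structural_class]
    return structural_class, fold_classifier(features)


# Toy stand-ins for trained ensembles (hypothetical rules and labels).
def toy_class_classifier(features):
    return "all-alpha" if features[0] > 0.5 else "all-beta"


toy_fold_classifiers = {
    "all-alpha": lambda f: "alpha-fold-1" if f[1] > 0.5 else "alpha-fold-2",
    "all-beta": lambda f: "beta-fold-1" if f[1] > 0.5 else "beta-fold-2",
}

prediction = hierarchical_predict(
    [0.9, 0.1], toy_class_classifier, toy_fold_classifiers
)
```

One consequence of this design is that a first-layer mistake cannot be corrected in the second layer, which may partly explain why fold-level accuracy is lower than class-level accuracy.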

    Comparison of success rate among several studies.

    Our work outperforms all previous works with an accuracy of 74.21%.

    Performance of different classifiers on protein fold recognition (sequences at 35% identity).


    The architecture of our ensemble classifier.

    The training dataset is classified by all base classifiers. After K-Means clustering and circulating combination, the best ensemble result is achieved.
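The caption gives no details of the K-Means step or of "circulating combination", so the sketch below substitutes a simpler, clearly different stand-in: an exhaustive search over subsets of base classifiers, each fused by majority vote, keeping the subset with the highest accuracy on labelled data. All names and the toy predictions are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Fuse per-classifier label lists into one list by majority vote."""
    # Ties are resolved in favour of the label seen first among the voters.
    return [Counter(labels).most_common(1)[0][0]
            for labels in zip(*predictions)]


def accuracy(predicted, truth):
    """Fraction of positions where the predicted label matches the truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def best_ensemble(base_predictions, truth):
    """Try every non-empty subset of classifiers; return the best one."""
    best_names, best_score = (), -1.0
    names = sorted(base_predictions)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            voted = majority_vote([base_predictions[n] for n in subset])
            score = accuracy(voted, truth)
            if score > best_score:  # keep the earliest subset on ties
                best_names, best_score = subset, score
    return best_names, best_score


# Hypothetical predictions of three base classifiers on four samples.
truth = ["a", "b", "c", "d"]
base_predictions = {
    "clf1": ["a", "b", "c", "x"],
    "clf2": ["a", "b", "x", "d"],
    "clf3": ["a", "x", "c", "d"],
}
selected, score = best_ensemble(base_predictions, truth)
```

In this toy case each classifier is right on three of four samples, but their errors do not overlap, so the three-member vote recovers every label. Exhaustive search grows exponentially with the number of base classifiers, which is one reason a clustering step to group similar classifiers first could be useful.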

    Additional file 7: of High-throughput sequencing of RNAs isolated by cross-linking immunoprecipitation (HITS-CLIP) reveals Argonaute-associated microRNAs and targets in Schistosoma japonicum

    Optimized transfection conditions for miRNA transfection into HEK293T cells, and sequence alignment of the examined target site and the corresponding mutated site. (PDF 264 kb)