36 research outputs found

Extraction process of the 188-dimensional (188D) feature vectors (FV).

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

Sequences are input and processed by analyzing amino acid composition, distribution, and protein physicochemical properties; FV1–FV188 are output as feature vectors.
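As a rough illustration of the composition part of such a feature vector, the sketch below computes the fraction of each of the 20 standard amino acids in a sequence. It is a minimal example under stated assumptions, not the authors' extraction code: the function name `amino_acid_composition`, the residue ordering, and the sample sequence are all hypothetical, and the distribution and physicochemical features that make up the rest of the 188 dimensions are not reproduced here.

```python
from collections import Counter
from typing import List

# Assumption: the 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard residue, in AMINO_ACIDS order."""
    seq = sequence.upper()
    # Ignore any character that is not one of the 20 standard residues.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, used only to exercise the function.
features = amino_acid_composition("MKVLAAGIV")
```

The result is a list of 20 fractions that sum to 1 for any sequence containing at least one standard residue.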

FigShare

Influential factors for success rate of 1st and 2nd hierarchical layers.

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

Influential factors for success rate of 1st and 2nd hierarchical layers.

FigShare

Success rate achieved by three classifiers with different sequence identity.

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

The two graphs show the results on two datasets: (a) SCOP version 1.75 and (b) SCOP version 1.75A. Their similar success rates demonstrate the robustness of our model. As the sequence identity threshold increases, the filtering becomes less stringent and the success rate rises. The graphs also show that our ensemble classifier outperforms the other two classifiers.

FigShare

Hierarchical Classification of Protein Folds Using a Novel Ensemble Classifier

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)
Publication date: 20/02/2013

The analysis of biological information from protein sequences is important for the study of cellular functions and interactions, and protein fold recognition plays a key role in the prediction of protein structures. Unfortunately, the prediction of protein fold patterns is challenging due to the existence of compound protein structures. Here, we processed the latest release of the Structural Classification of Proteins (SCOP, version 1.75) database and exploited novel techniques to increase the accuracy of protein fold classification. The techniques proposed in this paper include ensemble classification and a hierarchical framework. In the first layer, similar or redundant sequences are deleted in two manners, and a set of base classifiers, fused by various selection strategies, divides the input into seven classes. In the second layer, an analogous ensemble method is adopted to predict all protein folds. To our knowledge, this is the first time all protein folds have been detected hierarchically. Compared with prior studies, our experimental results demonstrate the efficiency and effectiveness of the proposed method, which achieved a success rate of 74.21%, much higher than the results obtained with previous methods (ranging from 45.6% to 70.5%). In the second layer of classification, the prediction accuracy ranged from 23.13% to 46.05%. This value, although not remarkably high, is encouraging given the relatively low counts of proteins in most folds handled by fold recognition programs. The web server Hierarchical Protein Fold Prediction (HPFP) is available at http://datamining.xmu.edu.cn/software/hpfp.
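The two-layer idea in the abstract can be sketched as follows: a first ensemble picks one of the structural classes, and a second ensemble, specific to that class, picks the fold. This is only an illustrative sketch. Plain majority voting stands in for the selection strategies the paper uses to fuse its base classifiers, and the names `majority_vote` and `predict_hierarchical`, the toy classifiers, and the labels are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted most often."""
    return Counter(predictions).most_common(1)[0][0]


def predict_hierarchical(features, class_voters, fold_voters_by_class):
    """Predict (structural class, fold) in two layers.

    class_voters: callables mapping features -> class label (layer 1).
    fold_voters_by_class: dict mapping class label -> list of callables
        mapping features -> fold label (layer 2).
    """
    # Layer 1: the base classifiers vote on the structural class.
    structural_class = majority_vote([clf(features) for clf in class_voters])
    # Layer 2: only the classifiers for the chosen class vote on the fold.
    fold_voters = fold_voters_by_class[structural_class]
    fold = majority_vote([clf(features) for clf in fold_voters])
    return structural_class, fold


# Hypothetical stand-ins for trained base classifiers.
class_voters = [lambda f: "alpha", lambda f: "alpha", lambda f: "beta"]
fold_voters_by_class = {
    "alpha": [lambda f: "a.1", lambda f: "a.1", lambda f: "a.4"],
    "beta": [lambda f: "b.1"],
}

result = predict_hierarchical([0.1, 0.2], class_voters, fold_voters_by_class)
```

With these toy voters, `result` is `("alpha", "a.1")`. An error in the first layer cannot be corrected in the second, which is one reason the second-layer accuracy is bounded by the first.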

Directory of Open Access Journals

PubMed Central

FigShare

Algorithm 1. Circulating Combination of EFSS.

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

Algorithm 1. Circulating Combination of EFSS.

FigShare

Comparison of success rate among several studies.

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

Our work outperforms all previous works with an accuracy of 74.21%.

FigShare

Performance on different classifiers on protein fold recognition (sequence at 35% identity).

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

Performance on different classifiers on protein fold recognition (sequence at 35% identity).

FigShare

The architecture of our ensemble classifier.

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

The training dataset is classified by all base classifiers. After K-Means clustering and circulating combination, the best ensemble result is achieved.

FigShare

Additional file 7: of High-throughput sequencing of RNAs isolated by cross-linking immunoprecipitation (HITS-CLIP) reveals Argonaute-associated microRNAs and targets in Schistosoma japonicum

Author: Jing Zhao (21160)
Qingfeng Zhang (89642)
Rong Luo (187167)
Weiqing Pan (29351)
Xindong Xu (253352)
Ying Zou (46281)

Optimized transfection conditions for miRNA transfection into HEK293T cells, and sequence alignment of the examined target site and the corresponding mutated site. (PDF 264 kb)

FigShare

Loadings of most informative features* on principle component factors.

Author: Caihuan Ke (245901)
Chen Lin (95910)
Ji Qin (310208)
Quan Zou (157931)
Xiangrong Liu (380124)
Yi Jiang (40197)
Ying Zou (46281)

*Only the first three are shown.

FigShare
