
Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)
Publication date: 19/10/2015

Type 2 diabetes, a complex metabolic disease influenced by genetic and environmental factors, has become a worldwide problem. Previously published results focused on genetic components through genome-wide association studies, which explain the disease only to some extent. Recently, two research groups published metagenome-wide association study (MGWAS) results that identified meta-biomarkers related to type 2 diabetes. However, a key problem in analyzing genomic data is how to deal with the ultra-high dimensionality of the features. From a statistical viewpoint, it is challenging to filter the true factors out of high-dimensional data. Various methods and techniques have been proposed for this issue, but they achieve only limited prediction performance and poor interpretability. A new statistical procedure with higher performance and clear interpretability is therefore appealing for analyzing high-dimensional data. To address this problem, we apply a statistical variable selection procedure called iterative sure independence screening (ISIS) to gene profiles obtained from metagenome sequencing. In the Chinese and European cohorts, 48 and 24 meta-markers, respectively, were selected as predictors, with AUC (area under the curve) values of 0.97 and 0.99, a better performance than other model selection methods. These results demonstrate the power and utility of data mining technologies for identifying diagnostic and predictive markers in large-scale, ultra-high-dimensional genomic datasets.
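As a rough illustration of the screening idea named in the abstract, the sketch below ranks features by the absolute value of their marginal correlation with the response and keeps the top d. It covers only a single, non-iterative screening pass; the iterative refitting and the SCAD penalty used in the paper are not reproduced. The function names (`pearson`, `sis_screen`), the toy data, and the choice of d are assumptions made for this example, not taken from the paper.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def sis_screen(columns, y, d):
    """Return the sorted indices of the d features with the largest |marginal correlation|."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Hypothetical toy data: 50 observations, 200 features, only the first two matter.
rng = random.Random(0)
n, p = 50, 200
columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [3 * columns[0][i] + 3 * columns[1][i] + rng.gauss(0, 0.5) for i in range(n)]

selected = sis_screen(columns, y, d=20)
print(selected)
```

An iterative variant would, roughly, fit a model on the selected features and repeat the screening against what that model leaves unexplained, so that features that matter only jointly get a chance to enter.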

Directory of Open Access Journals

The Francis Crick Institute

Averaged AUC obtained from SVM classifier combined with three variable selection methods.

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

SVM classifier estimated as a function of sample size in a 50 × 10-fold cross-validation setting. We show the accuracy of the 60-gene signature from ensemble feature selection and of the 48-gene signature from ISIS-SCAD on the Chinese dataset. For the European dataset, the accuracy of ensemble feature selection is computed on 60 genes and the accuracy of ISIS-SCAD on 24 genes.

The Francis Crick Institute

Results of simulated example II: accuracy of ISIS in including the true model {X1,X2,X3,X4}.

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

Accuracy of ISIS for different settings of correlation ρ and dimensionality p under the joint contribution scenario. 100 data sets of 50 observations each were simulated, and 20 variables were selected for computing the accuracy.

The Francis Crick Institute

Data.

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

Chinese and European gut microbiota datasets of type 2 diabetes (T2D) used in our work. ‘sd’ means standard deviation; BMI means body mass index.

The Francis Crick Institute

Results of simulated example I: accuracy of ISIS in including the true model {X1,X2,X3}.

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

Accuracy of ISIS for different settings of correlation ρ and dimensionality p under a nonlinear relationship. For each model, 100 data sets of 50 observations each were simulated, and 20 variables were selected for computing the accuracy.

The Francis Crick Institute

AUC.

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

SVM classifier trained as a function of signature size, for mRMR, ensemble of lasso, and ensemble of elastic net, in a 10-fold cross-validation setting on the Chinese and European datasets, respectively.
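For readers unfamiliar with the metric, the AUC reported for a classifier can be computed directly from its scores on held-out samples: it is the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The sketch below is a minimal pure-Python version of that definition; the function name `auc` and the example labels and scores are illustrative assumptions and do not come from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    labels: sequence of 0/1 class labels; scores: classifier scores, higher means more positive.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0   # positive ranked above negative
            elif sp == sn:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four held-out samples.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In a k-fold setting, one would compute this on each held-out fold and average the values over folds.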

The Francis Crick Institute

AUC obtained by ISIS-SCAD (Chinese).

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

AUC for signature sizes in {10, 15, 18, 23, 26, 28, 34, 41, 43, 48, 50, 61, 63}, combined with four classification algorithms in 10-fold cross-validation. For each classification method, the best result is highlighted.

The Francis Crick Institute

AUC obtained from SVM classifier estimated on genes selected by ISIS-SCAD and ensemble feature selection.

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

Signature sizes in {10, 15, 18, 23, 26, 28, 34, 41, 43, 48, 50, 61, 63} on the Chinese dataset and in {4, 11, 15, 22, 24, 26, 27, 28, 29, 32, 34, 35, 36} on the European dataset, in a 10-fold cross-validation setting.

The Francis Crick Institute

AUC obtained by ISIS-SCAD (European).

Author: Dongfang Li (298810)
Fuhao Zou (815557)
Honglong Wu (237941)
Ke Zhou (131917)
Lihua Cai (815556)

AUC for signature sizes in {4, 11, 15, 22, 24, 26, 27, 28, 29, 32, 34, 35, 36}, combined with four classification algorithms in 10-fold cross-validation. For each classification method, the best result is highlighted.

The Francis Crick Institute
