A computational framework for complex disease stratification from multiple large-scale datasets.
BACKGROUND: Multilevel data integration is becoming a major area of research in systems biology. Within this area, multi-'omics datasets on complex diseases are becoming more readily available and there is a need to set standards and good practices for integrated analysis of biological, clinical and environmental data. We present a framework to plan and generate single and multi-'omics signatures of disease states. METHODS: The framework is divided into four major steps: dataset subsetting, feature filtering, 'omics-based clustering and biomarker identification. RESULTS: We illustrate the usefulness of this framework by identifying potential patient clusters based on integrated multi-'omics signatures in a publicly available ovarian cystadenocarcinoma dataset. The analysis generated a higher number of stable and clinically relevant clusters than previously reported, and enabled the generation of predictive models of patient outcomes. CONCLUSIONS: This framework will help health researchers plan and perform multi-'omics big data analyses to generate hypotheses and make sense of their rich, diverse and ever-growing datasets, to enable implementation of translational P4 medicine.
Enhanced class imbalance learning methods for support vector machines application to human miRNA gene classification
EThOS - Electronic Theses Online Service, GB, United Kingdom
Additional file 2: of A computational framework for complex disease stratification from multiple large-scale datasets
Complete results of the enrichment analysis between clusters. (XLSX 4293 kb)
Additional file 3: of A computational framework for complex disease stratification from multiple large-scale datasets
Table S7. Estimated accuracy and standard deviation of the RFE procedure. Table S8. Accuracy and Kappa values of the Random Forest models in the training set. Table S9. Performance values for the Random Forest model in the testing set. Figure S11. Relative importance of the top 20 predictors used to build the final Random Forest model. The importance axis is scaled, with the mRNA expression of CD3D scaled to 100% and the methylation state of POLA2 to 0% (not shown). (DOCX 18 kb)
A computational framework for complex disease stratification from multiple large-scale datasets
Background: Multilevel data integration is becoming a major area of research in systems biology. Within this area, multi-'omics datasets on complex diseases are becoming more readily available and there is a need to set standards and good practices for integrated analysis of biological, clinical and environmental data. We present a framework to plan and generate single and multi-'omics signatures of disease states. Methods: The framework is divided into four major steps: dataset subsetting, feature filtering, 'omics-based clustering and biomarker identification. Results: We illustrate the usefulness of this framework by identifying potential patient clusters based on integrated multi-'omics signatures in a publicly available ovarian cystadenocarcinoma dataset. The analysis generated a higher number of stable and clinically relevant clusters than previously reported, and enabled the generation of predictive models of patient outcomes. Conclusions: This framework will help health researchers plan and perform multi-'omics big data analyses to generate hypotheses and make sense of their rich, diverse and ever growing datasets, to enable implementation of translational P4 medicine.
Additional file 4: of A computational framework for complex disease stratification from multiple large-scale datasets
DIABLO sPLSDA model results. (DOCX 18966 kb)