
    Significant Gene Array Analysis and Cluster-Based Machine Learning for Disease Class Prediction

    Gene expression analysis has been of major interest to biostatisticians for decades. Such studies are necessary for understanding disease risk and improving prediction, so that medical professionals and scientists can design better treatment plans to lessen symptoms and perhaps even find cures. In this study, we investigate several gene expression analysis and machine learning techniques for disease class prediction, assess the predictive validity of the resulting models, and identify differentially expressed (DE) genes in the relevant pathology datasets. Multiple gene expression datasets generated on the Affymetrix U133A platform (GPL96) are used to test model accuracy. Significance Analysis of Microarrays (SAM) is used to identify potential disease biomarkers, followed by these predictive models: (a) random forest, (b) random forest with Gene eXpression Network Analysis (GXNA), (c) RF++, (d) LASSO, and (e) Bayesian neural networks. One goal of this study is to find clusters of co-expressed genes and to assess the effect of cluster-based classification that draws on knowledge in gene expression (microarray) data. The other goal is to determine the usefulness of Automatic Relevance Determination (ARD) in Bayesian neural networks.
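The biomarker screening step named in the abstract can be illustrated with a minimal sketch of the SAM relative-difference statistic, d = (mean_a - mean_b) / (s + s0), computed per gene. This is not the study's code: the function names, the toy expression values, and the fudge constant `s0 = 0.1` are all assumptions made for illustration, and the permutation-based false discovery rate estimate that full SAM performs is omitted.

```python
import math
from statistics import mean


def sam_relative_difference(group_a, group_b, s0=0.1):
    """SAM-style relative difference for one gene.

    group_a, group_b: expression values for the two classes.
    s0: small positive constant that damps genes with tiny variance
        (assumed value; SAM normally chooses it from the data).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    # Pooled sum of squared deviations across both classes.
    ss = sum((x - mean_a) ** 2 for x in group_a)
    ss += sum((x - mean_b) ** 2 for x in group_b)
    # Gene-specific scatter of the mean difference.
    s = math.sqrt((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)


def rank_genes(expression, labels, s0=0.1):
    """Order gene names by decreasing |d|.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of 0/1 class labels, one per sample.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, y in zip(values, labels) if y == 1]
        b = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = sam_relative_difference(a, b, s0)
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


# Hypothetical toy data: three genes, three samples per class.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_up": [8.1, 7.9, 8.3, 5.0, 5.2, 4.9],
    "gene_flat": [6.0, 6.1, 5.9, 6.0, 6.2, 5.8],
    "gene_down": [3.0, 3.2, 2.9, 4.1, 4.0, 4.3],
}
ranking = rank_genes(expression, labels)
print(ranking)
```

In a pipeline like the one described, the top-ranked genes would then be passed as features to a classifier such as a random forest or LASSO; that step is left out here to keep the sketch dependency-free.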