3 research outputs found

    Predicting Diabetes Onset: an Ensemble Supervised Learning Approach

    An exploratory study is presented to gauge the impact of feature selection on heterogeneous ensembles. The task is to predict diabetes onset using healthcare data obtained from the UC Irvine (UCI) database. Evidence suggests that accuracy and diversity are the two vital requirements for good ensembles. The research presented in this paper therefore exploits the diversity of heterogeneous base classifiers and the optimising effect of feature subset selection to improve accuracy. Five widely used classifiers are employed in the ensembles, and a meta-classifier is used to aggregate their outputs. The results are presented and compared with those of similar studies in the literature that used the same dataset. It is shown that the proposed method predicts diabetes onset with higher accuracy.

    Ensemble-based Supervised Learning for Predicting Diabetes Onset

    The research presented in this thesis aims to address the issue of undiagnosed diabetes cases. The current state of knowledge is that one in seventy people in the United Kingdom is living with undiagnosed diabetes, and only one in a hundred people could identify the main signs of diabetes. Some of the tools available for predicting diabetes are too simplistic, rely on superficial data for inference, or both. On the positive side, the National Health Service (NHS) is improving data recording in this domain by offering health checks to adults aged 40 to 70. Data from such a programme could be utilised to mitigate the issue of superficial data, and also to help develop a predictive tool that facilitates a change from the current reactive care to proactive care. This thesis presents a tool based on a machine learning ensemble for predicting diabetes onset. Ensembles often perform better than a single classifier, and accuracy and diversity have been highlighted as the two vital requirements for constructing good ensemble classifiers. Experiments in this thesis explore the relationship between diversity from heterogeneous ensemble classifiers and the accuracy of predictions through feature subset selection in order to predict diabetes onset. Data from a national health check programme (similar to the NHS health check) was used. The aim is to predict diabetes onset better than other similar studies within the literature. For the experiments, predictions from five base classifiers (Sequential Minimal Optimisation (SMO), Radial Basis Function (RBF), Naïve Bayes (NB), Repeated Incremental Pruning to Produce Error Reduction (RIPPER) and C4.5 decision tree), all performing the same task, are exploited in all possible combinations to construct 26 ensemble models. The training data feature space was searched to select the best feature subset for each classifier. The selected subsets are used to train the classifiers, and their predictions are combined using the k-Nearest Neighbours algorithm as a meta-classifier. Results are analysed using four performance metrics (accuracy, sensitivity, specificity and AUC) to determine (i) whether ensembles always perform better than a single classifier; and (ii) the impact of diversity (from heterogeneous classifiers) and accuracy (through feature subset selection) on ensemble performance. At the base classification level, RBF produced better results than the other four classifiers, with 78% accuracy, 82% sensitivity, 73% specificity and 85% AUC. A comparative study shows that the RBF model is more accurate than 9 ensembles, more sensitive than 13 ensembles, more specific than 9 ensembles, and produced a better AUC than 25 ensembles. This means that ensembles do not always perform better than their constituent classifiers. Of the ensembles that performed better than RBF, the combination of C4.5, RIPPER and NB produced the highest results, with 83% accuracy, 87% sensitivity, 79% specificity and 86% AUC. Compared with the RBF model, this is a 5.37% accuracy improvement, which is significant (p = 0.0332). The experiments show how data from medical health examinations can be utilised to address the issue of undiagnosed cases of diabetes. Models constructed with such data would facilitate the much desired shift from reactive to proactive care for individuals at high risk of diabetes. From the machine learning viewpoint, it was established that ensembles constructed from diverse and accurate base learners have the potential to produce a significant improvement in accuracy compared to their individual constituent classifiers. In addition, the ensemble presented in this thesis is at least 1% and at most 23% more accurate than similar research studies found within the literature. This validates the superiority of the method implemented.
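
    The count of 26 ensemble models quoted in the abstract above is consistent with taking every combination of two or more of the five base classifiers (10 + 10 + 5 + 1 = 26). The following Python sketch illustrates that enumeration only. The minimum ensemble size of two and the names `BASE_CLASSIFIERS` and `enumerate_ensembles` are assumptions made for illustration, not details taken from the thesis.

```python
from itertools import combinations

# Base classifiers named in the abstract (labels only, no models are trained here).
BASE_CLASSIFIERS = ["SMO", "RBF", "NB", "RIPPER", "C4.5"]


def enumerate_ensembles(classifiers, min_size=2):
    """Return every combination of at least `min_size` classifiers.

    Assumption: an ensemble needs at least two members, which is the
    reading that reproduces the 26 models mentioned in the abstract.
    """
    ensembles = []
    for size in range(min_size, len(classifiers) + 1):
        ensembles.extend(combinations(classifiers, size))
    return ensembles


ensembles = enumerate_ensembles(BASE_CLASSIFIERS)
print(len(ensembles))  # number of candidate ensembles
```

    Under this reading, the best-performing ensemble reported in the abstract (C4.5, RIPPER and NB) is one of the ten three-member combinations.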

    Improving healthcare system usability without real users: A semi-parallel design approach

    Copyright © 2015 IGI Global. This paper describes an early-stage usability study conducted on a prototype system designed to capture and analyse Patient Reported Outcome Measures (PROMs) activities. The system, PROMS 2.0, was developed by Bluespier for the trauma and orthopaedic department in Trafford Hospital, Manchester, United Kingdom (UK). The Centre for Health and Social Care Informatics (CHaSCI), Liverpool John Moores University (LJMU), examined the system without real users, identified potential usability issues and suggested possible improvements before final release by Bluespier. Three different approaches were adopted for evaluating user interface (UI) design without users. The first approach is the Cognitive Walkthrough (CW), a task-oriented technique capable of identifying issues through the action sequence required to perform a task. The second approach is action analysis, which predicts the time a skilled user would need to perform a task. The third approach is heuristic evaluation, which identifies problems based on recognised standards. Results support the argument from relevant cognitive psychology theories and user-centric design principles that UI evaluation without real users is a useful tool for yielding rapid output for subsequent enhancement. It is concluded that the semi-parallel design concept could be the key to timely delivery of software design projects.
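
    The action analysis mentioned in the abstract above predicts task time for a skilled user. One common formal variant is the keystroke-level model, which sums a fixed time per elementary operator. The Python sketch below assumes that variant and the commonly cited approximate operator times. The abstract does not say which form of action analysis or which timings the study used, and the task sequence shown is hypothetical.

```python
# Commonly cited approximate keystroke-level model operator times, in seconds.
# Assumption: these are textbook values, not figures from the paper.
KLM_SECONDS = {
    "K": 0.28,  # keystroke or button press (average skilled typist)
    "P": 1.10,  # point at a target with a mouse
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def estimate_task_time(operators):
    """Sum the operator times for a sequence such as ["M", "P", "K"]."""
    return sum(KLM_SECONDS[op] for op in operators)


# Hypothetical task: think, reach for the mouse, point at a field, click,
# return to the keyboard, then type a three-character score.
hypothetical_task = ["M", "H", "P", "K", "H", "K", "K", "K"]
estimated_seconds = estimate_task_time(hypothetical_task)
print(f"{estimated_seconds:.2f} s")
```

    Estimates of this kind are mainly useful for comparing alternative designs of the same task rather than as absolute predictions.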