24 research outputs found

    The Use of Fuzzy ARTMAP and Modified Fuzzy ARTMAP to Identify Low Risk Coronary Care Patients

    Get PDF
    The performance of fuzzy ARTMAP and modified fuzzy Artmap is compared using real-world data from a medical domain, the task being to predict the death or survival of patients admitted to a coronary care ward. Modified fuzzy ARTMAP is shown to perform consistently more accurately than fuzzy ARTMAP and is also much less prone to variations in performance with different orderings of a training data. However, modified fuzzy ARTMAP does not show as large an improvement in performance as fuzzy ARTMAP when employed in the voting strategy. When unanimous voting decisions alone are considered, fuzzy ARTMAP is able to increase significantly accuracy in identifying survivors at the cost of decreased coverage of cases. This allows the identification of a subset of patients who have low-risk of death from their condition and are thus potentially suitable for early discharge from hospital

    Implementing decision tree-based algorithms in medical diagnostic decision support systems

    Get PDF
    As a branch of healthcare, medical diagnosis can be defined as finding the disease based on the signs and symptoms of the patient. To this end, the required information is gathered from different sources like physical examination, medical history and general information of the patient. Development of smart classification models for medical diagnosis is of great interest amongst the researchers. This is mainly owing to the fact that the machine learning and data mining algorithms are capable of detecting the hidden trends between features of a database. Hence, classifying the medical datasets using smart techniques paves the way to design more efficient medical diagnostic decision support systems. Several databases have been provided in the literature to investigate different aspects of diseases. As an alternative to the available diagnosis tools/methods, this research involves machine learning algorithms called Classification and Regression Tree (CART), Random Forest (RF) and Extremely Randomized Trees or Extra Trees (ET) for the development of classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to create, and it applies to both the quantitative and qualitative data. For classification problems, RF and ET employ a number of weak learners like CART to develop models for classification tasks. We employed Wisconsin Breast Cancer Database (WBCD), Z-Alizadeh Sani dataset for coronary artery disease (CAD) and the databanks gathered in Ghaem Hospital’s dermatology clinic for the response of patients having common and/or plantar warts to the cryotherapy and/or immunotherapy methods. To classify the breast cancer type based on the WBCD, the RF and ET methods were employed. It was found that the developed RF and ET models forecast the WBCD type with 100% accuracy in all cases. To choose the proper treatment approach for warts as well as the CAD diagnosis, the CART methodology was employed. The findings of the error analysis revealed that the proposed CART models for the applications of interest attain the highest precision and no literature model can rival it. The outcome of this study supports the idea that methods like CART, RF and ET not only improve the diagnosis precision, but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the introduced data, more extensive databases with a greater number of independent parameters might be required for further practical implications of the developed models

    Utilizing Temporal Information in The EHR for Developing a Novel Continuous Prediction Model

    Get PDF
    Type 2 diabetes mellitus (T2DM) is a nation-wide prevalent chronic condition, which includes direct and indirect healthcare costs. T2DM, however, is a preventable chronic condition based on previous clinical research. Many prediction models were based on the risk factors identified by clinical trials. One of the major tasks of the T2DM prediction models is to estimate the risks for further testing by HbA1c or fasting plasma glucose to determine whether the patient has or does not have T2DM because nation-wide screening is not cost-effective. Those models had substantial limitations on data quality, such as missing values. In this dissertation, I tested the conventional models which were based on the most widely used risk factors to predict the possibility of developing T2DM. The AUC was an average of 0.5, which implies the conventional model cannot be used to screen for T2DM risks. Based on this result, I further implemented three types of temporal representations, including non-temporal representation, interval-temporal representation, and continuous-temporal representation for building the T2DM prediction model. According to the results, continuous-temporal representation had the best performance. Continuous-temporal representation was based on deep learning methods. The result implied that the deep learning method could overcome the data quality issue and could achieve better performance. This dissertation also contributes to a continuous risk output model based on the seq2seq model. This model can generate a monotonic increasing function for a given patient to predict the future probability of developing T2DM. The model is workable but still has many limitations to overcome. Finally, this dissertation demonstrates some risks factors which are underestimated and are worthy for further research to revise the current T2DM screening guideline. The results were still preliminary. I need to collaborate with an epidemiologist and other fields to verify the findings. In the future, the methods for building a T2DM prediction model can also be used for other prediction models of chronic conditions

    HIV analysis using computational intelligence

    Get PDF
    In this study, a new method to analyze HIV using a combination of autoencoder networks and genetic algorithms is proposed. The proposed method is tested on a set of demographic properties of individuals obtained from the South African antenatal survey. The autoencoder model is then compared with a conventional feedforward neural network model and yields a classification accuracy of 92% compared to 84% obtained for the conventional feedforward model. The autoencoder model is then used to propose a new method of approximating missing entries in the HIV database using ant colony optimization. This method is able to estimate missing input to an accuracy of 80%. The estimated missing input values are then used to analyze HIV. The autoencoder network classifier model yields a classification accuracy of 81% in the presence of missing input values. The feedforward neural network classifier model yields a classification accuracy of 82% in the presence of missing input values. A control mechanism is proposed to assess the effect of demographic properties on the HIV status of individuals, based on inverse neural networks, and autoencoder networks-based-on-genetic algorithms. This control mechanism is aimed at understanding whether HIV susceptibility can be controlled by modifying some of the demographic properties. The inverse neural network control model has accuracies of 77% and 82%, meanwhile the genetic algorithm model has accuracies of 77% and 92%, for the prediction of educational level of individuals, and gravidity, respectively. HIV modelling using neuro-fuzzy models is then investigated, and rules are extracted, which provide more valuable insight. The classification accuracy obtained by the neuro-fuzzy model is 86%. A rough set approximation is then investigated for rule extraction, and it is found that the rules present simplistic and understandable relationships on how the demographic properties affect HIV risk. The study concludes by investigating a model for automatic relevance determination, to determine which of the demographic properties is important for HIV modelling. A comparison is done between using the full input data set and the data set using the input parameters selected by the technique for the HIV classification. Age of the individual, gravidity, province, region, reported pregnancy and educational level were amongst the input parameters selected as relevant for classification of an individual’s HIV risk. This study thus proposes models, which can be used to understand HIV dynamics, and can be used by policy-makers to more effectively understand the demographic influences driving HIV infection

    Machine learning based data pre-processing for the purpose of medical data mining and decision support

    Get PDF
    Building an accurate and reliable model for prediction for different application domains, is one of the most significant challenges in knowledge discovery and data mining. Sometimes, improved data quality is itself the goal of the analysis, usually to improve processes in a production database and the designing of decision support. As medicine moves forward there is a need for sophisticated decision support systems that make use of data mining to support more orthodox knowledge engineering and Health Informatics practice. However, the real-life medical data rarely complies with the requirements of various data mining tools. It is often inconsistent, noisy, containing redundant attributes, in an unsuitable format, containing missing values and imbalanced with regards to the outcome class label.Many real-life data sets are incomplete, with missing values. In medical data mining the problem with missing values has become a challenging issue. In many clinical trials, the medical report pro-forma allow some attributes to be left blank, because they are inappropriate for some class of illness or the person providing the information feels that it is not appropriate to record the values for some attributes. The research reported in this thesis has explored the use of machine learning techniques as missing value imputation methods. The thesis also proposed a new way of imputing missing value by supervised learning. A classifier was used to learn the data patterns from a complete data sub-set and the model was later used to predict the missing values for the full dataset. The proposed machine learning based missing value imputation was applied on the thesis data and the results are compared with traditional Mean/Mode imputation. Experimental results show that all the machine learning methods which we explored outperformed the statistical method (Mean/Mode).The class imbalance problem has been found to hinder the performance of learning systems. In fact, most of the medical datasets are found to be highly imbalance in their class label. The solution to this problem is to reduce the gap between the minority class samples and the majority class samples. Over-sampling can be applied to increase the number of minority class sample to balance the data. The alternative to over-sampling is under-sampling where the size of majority class sample is reduced. The thesis proposed one cluster based under-sampling technique to reduce the gap between the majority and minority samples. Different under-sampling and over-sampling techniques were explored as ways to balance the data. The experimental results show that for the thesis data the new proposed modified cluster based under-sampling technique performed better than other class balancing techniques.In further research it is found that the class imbalance problem not only affects the classification performance but also has an adverse effect on feature selection. The thesis proposed a new framework for feature selection for class imbalanced datasets. The research found that, using the proposed framework the classifier needs less attributes to show high accuracy, and more attributes are needed if the data is highly imbalanced.The research described in the thesis contains the flowing four novel main contributions.a) Improved data mining methodology for mining medical datab) Machine learning based missing value imputation methodc) Cluster Based semi-supervised class balancing methodd) Feature selection framework for class imbalance datasetsThe performance analysis and comparative study show that the use of proposed method of missing value imputation, class balancing and feature selection framework can provide an effective approach to data preparation for building medical decision support

    A survey of the application of soft computing to investment and financial trading

    Get PDF

    Fuzzy Logic

    Get PDF
    The capability of Fuzzy Logic in the development of emerging technologies is introduced in this book. The book consists of sixteen chapters showing various applications in the field of Bioinformatics, Health, Security, Communications, Transportations, Financial Management, Energy and Environment Systems. This book is a major reference source for all those concerned with applied intelligent systems. The intended readers are researchers, engineers, medical practitioners, and graduate students interested in fuzzy logic systems
    corecore