987 research outputs found

    A survey on artificial intelligence based techniques for diagnosis of hepatitis variants

    Hepatitis is a dreaded disease that has taken many lives in recent years. The survey shows that viral hepatitis has five major variants, referred to as Hepatitis A, B, C, D, and E. Over the years, scholars have sought alternative means of diagnosing hepatitis using artificial intelligence (AI) techniques in order to save lives. This study extensively reviewed 37 papers on AI-based techniques for diagnosing viral hepatitis. Results showed that, of the five major types, Hepatitis B (30%) and C (3%) were the only ones that the AI-based techniques were used to diagnose and properly classify, while 67% of the papers reviewed diagnosed hepatitis using various AI-based approaches without classifying it into any of the five major types. The study also revealed that 18 of the 37 papers reviewed used a hybrid approach, while the remaining 19 used a single AI-based approach. This shows no significant difference in technique usage for modeling intelligence into applications. The study further reveals a serious gap in knowledge regarding single hepatitis type prediction or diagnosis across all the papers considered, and recommends that the future road map integrate the major hepatitis variants into a single predictive model using effective machine learning techniques, in order to reduce the cost of diagnosis and speed up treatment of patients.

    A predictive method for hepatitis disease diagnosis using ensembles of neuro-fuzzy technique

    Background: Hepatitis is an inflammation of the liver, most commonly caused by a viral infection. Supervised data mining techniques have been successful in hepatitis disease diagnosis across a range of datasets. Many methods have been developed with the aid of data mining techniques for hepatitis disease diagnosis. The majority of these methods rely on a single learning technique and do not support ensemble learning. Combining the outputs of several predictors can result in improved accuracy in classification problems. This study aims to propose an accurate method for hepatitis disease diagnosis by taking advantage of ensemble learning. Methods: We use Non-linear Iterative Partial Least Squares to perform data dimensionality reduction, the Self-Organizing Map technique for the clustering task, and ensembles of Neuro-Fuzzy Inference Systems for predicting hepatitis disease. We also use decision trees to select the most important features in the experimental dataset. We test our method on a real-world dataset and present our results in comparison with the latest results of previous studies. Results: The results of our analyses on the dataset demonstrate that our method outperforms Neural Network, ANFIS, K-Nearest Neighbors and Support Vector Machine. Conclusions: The method has potential to be used as an intelligent learning system for hepatitis disease diagnosis in healthcare. © 2018 The Author
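
    The claim that combining several predictors can improve accuracy can be illustrated with a minimal sketch. It assumes plain majority voting over three hypothetical classifiers with made-up predictions; the abstract does not say how the neuro-fuzzy ensemble combines its members, so this is not the authors' combination rule.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    for sample_preds in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label
        combined.append(Counter(sample_preds).most_common(1)[0][0])
    return combined

def accuracy(predicted, truth):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Hypothetical predictions from three weak models (illustrative data only)
truth   = [1, 0, 1, 1]
model_a = [1, 0, 1, 0]
model_b = [1, 1, 1, 1]
model_c = [0, 0, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print(accuracy(model_a, truth), accuracy(ensemble, truth))
```

    In this toy case each model makes one mistake on a different sample, so the vote corrects all three errors. That only happens when the members' errors are not strongly correlated.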

    Knowledge Mining from Clinical Datasets Using Rough Sets and Backpropagation Neural Network

    The availability of clinical datasets and knowledge mining methodologies encourages the researchers to pursue research in extracting knowledge from clinical datasets. Different data mining techniques have been used for mining rules, and mathematical models have been developed to assist the clinician in decision making. The objective of this research is to build a classifier that will predict the presence or absence of a disease by learning from the minimal set of attributes that has been extracted from the clinical dataset. In this work rough set indiscernibility relation method with backpropagation neural network (RS-BPNN) is used. This work has two stages. The first stage is handling of missing values to obtain a smooth data set and selection of appropriate attributes from the clinical dataset by indiscernibility relation method. The second stage is classification using backpropagation neural network on the selected reducts of the dataset. The classifier has been tested with hepatitis, Wisconsin breast cancer, and Statlog heart disease datasets obtained from the University of California at Irvine (UCI) machine learning repository. The accuracy obtained from the proposed method is 97.3%, 98.6%, and 90.4% for hepatitis, breast cancer, and heart disease, respectively. The proposed system provides an effective classification model for clinical datasets
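
    The attribute-selection stage can be sketched roughly as follows, assuming a small, fully consistent decision table with categorical values and a brute-force search. The function names and the toy table are hypothetical and are not taken from the RS-BPNN paper.

```python
from itertools import combinations

def indiscernibility_classes(rows, attrs):
    """Group row indices that share identical values on the given attributes."""
    classes = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(i)
    return sorted(classes.values())

def is_consistent(rows, labels, attrs):
    """True if every indiscernibility class carries a single decision label."""
    return all(len({labels[i] for i in cls}) == 1
               for cls in indiscernibility_classes(rows, attrs))

def minimal_reducts(rows, labels):
    """Brute-force the smallest attribute subsets that still separate the labels.

    Simplification: assumes the full table is consistent; returns [] otherwise.
    """
    n_attrs = len(rows[0])
    for size in range(1, n_attrs + 1):
        found = [c for c in combinations(range(n_attrs), size)
                 if is_consistent(rows, labels, c)]
        if found:
            return found
    return []

# Toy clinical table (illustrative): columns are marker level, fatigue, age group
rows = [
    ("high", "yes", "old"),
    ("high", "no", "old"),
    ("low", "yes", "young"),
    ("low", "no", "young"),
    ("high", "yes", "young"),
]
labels = ["sick", "sick", "healthy", "healthy", "sick"]

print(minimal_reducts(rows, labels))
```

    Here the first attribute alone separates the two decisions, so a classifier such as a backpropagation network would only need that column. The exhaustive search is exponential in the number of attributes and is meant for illustration only.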

    A Rough Set Approach to Dimensionality Reduction for Performance Enhancement in Machine Learning

    Machine learning uses complex mathematical algorithms to turn a data set into a model for a problem domain. Analysing high dimensional data in raw form usually causes computational overhead, because the larger the data, the longer it takes to process. Therefore, there is a need for a more robust dimensionality reduction approach, among other existing methods, for feature projection (extraction) and selection from a data set, whose output can be passed to a machine learning algorithm for optimal performance. This paper presents a generic mathematical approach for transforming data from a high dimensional space to a low dimensional space in such a manner that the intrinsic dimension of the original data is preserved, using the concepts of indiscernibility, reducts, and the core from rough set theory. The flu detection dataset available on the Kaggle website was used in this research for demonstration purposes. The original and reduced datasets were tested using a logistic regression machine learning algorithm, yielding the same accuracy of 97% with training times of 25 min and 11 min respectively.

    Identifying Effective Features and Classifiers for Short Term Rainfall Forecast Using Rough Sets Maximum Frequency Weighted Feature Reduction Technique

    Precise rainfall forecasting is a common challenge across the globe in meteorological predictions. As rainfall forecasting involves rather complex dynamic parameters, demand for novel approaches that improve forecasting accuracy has increased. Recently, Rough Set Theory (RST) has attracted a wide variety of scientific applications and is extensively adopted in decision support systems. Although there are several weather prediction techniques in the existing literature, identifying significant inputs for modelling effective rainfall prediction is not addressed by the present mechanisms. Therefore, this investigation examines the feasibility of using rough set based feature selection and data mining methods, namely Naïve Bayes (NB), Bayesian Logistic Regression (BLR), Multi-Layer Perceptron (MLP), J48, Classification and Regression Tree (CART), Random Forest (RF), and Support Vector Machine (SVM), to forecast rainfall. Feature selection or reduction is the process of identifying a significant feature subset, in which the generated subset must characterize the information system as well as the complete feature set does. This paper introduces a novel rough set based Maximum Frequency Weighted (MFW) feature reduction technique for finding an effective feature subset for modelling an efficient rainfall forecast system. The experimental analysis and the results indicate substantial improvements in prediction models when trained using the selected feature subset. CART and J48 classifiers achieved improved accuracies of 83.42% and 89.72%, respectively. From the experimental study, relative humidity2 (a4) and solar radiation (a6) have been identified as the effective parameters for modelling rainfall prediction.

    An Augmented Artificial Intelligence Approach for Chronic Diseases Prediction

    Chronic diseases are increasing in prevalence and mortality worldwide. Early diagnosis has therefore become an important research area to enhance patient survival rates. Several research studies have reported classification approaches for specific disease prediction. In this paper, we propose a novel augmented artificial intelligence approach using an artificial neural network (ANN) with particle swarm optimization (PSO) to predict five prevalent chronic diseases including breast cancer, diabetes, heart attack, hepatitis, and kidney disease. Seven classification algorithms are compared to evaluate the proposed model's prediction performance. The ANN prediction model constructed with a PSO based feature extraction approach outperforms other state-of-the-art classification approaches when evaluated with accuracy. Our proposed approach gave the highest accuracy of 99.67%, with the PSO. However, the classification model's performance is found to depend on the attributes of data used for classification. Our results are compared with various chronic disease datasets and shown to outperform other benchmark approaches. In addition, our optimized ANN processing is shown to require less time compared to random forest (RF), deep learning and support vector machine (SVM) based methods. Our study could play a role for early diagnosis of chronic diseases in hospitals, including through development of online diagnosis systems

    Design of a Web-Based Decision Support System for Disease Diagnosis (Rancang Bangun Sistem Pendukung Keputusan Berbasis Web untuk Diagnosa Penyakit)

    In some cases, diagnosing a disease is not straightforward because some signs and symptoms may overlap with those of other diseases. A decision support system is one approach that can help physicians make diagnostic decisions. In this study, a decision support system was designed that can be used to diagnose a disease as needed. The system was designed to integrate with patient data so that it can be used practically in hospitals. It was implemented as a web-based system using the PHP programming language and the MySQL relational database management system. The decision support system designed in this study is a nonknowledge-based system, and the machine learning algorithm used is Naive Bayes. Based on performance testing, the system performed well under normal conditions and was not affected by the number of users accessing it. In addition, there was no significant difference in performance between normal conditions and traffic surges.
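
    The Naive Bayes step can be sketched as below. This is a minimal categorical Naive Bayes with Laplace smoothing, written in Python rather than the PHP the system uses; the symptom features, labels, and function names are hypothetical and not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    feature_values = defaultdict(set)    # feature index -> distinct values seen
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            value_counts[(label, i)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values

def predict(model, features):
    """Return the class with the highest log-posterior under Laplace smoothing."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(features):
            numerator = value_counts[(label, i)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)  # smoothed log likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Illustrative records: (fever, cough) -> diagnosis
samples = [("yes", "yes"), ("yes", "no"), ("yes", "yes"),
           ("no", "yes"), ("no", "no"), ("no", "yes")]
labels = ["flu", "flu", "flu", "cold", "cold", "cold"]

model = train_naive_bayes(samples, labels)
print(predict(model, ("yes", "yes")), predict(model, ("no", "no")))
```

    Because training only accumulates counts, such a model can be refreshed from patient records with simple aggregate queries, which fits a nonknowledge-based system backed by a relational database.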

    Data Mining with Supervised Instance Selection Improves Artificial Neural Network Classification Accuracy

    Intrusion detection systems (IDSs) may monitor intrusion logs, traffic control packets, and attacks. Networks generate large amounts of data. IDS log characteristics are used to detect whether a record or connection was an attack or regular network activity. Reducing the feature set size aids machine learning classification. This paper describes a standardised and systematic intrusion detection classification approach. Using dataset signatures, the Naive Bayes, Random Tree, and Neural Network classifiers are assessed. We examine the feature reduction efficacy of PCA and the Fisher score in this study. The first round of testing uses a reduced dataset without decreasing the components set, and the second uses principal components analysis. PCA boosts classification accuracy by 1.66 percent. Artificial immune systems, inspired by the human immune system, use learning, long-term memory, and association to recognise and classify. The work introduces the Artificial Neural Network (ANN) classifier model and its development issues. Iris and Wine data from the UCI learning repository show that the ANN approach works. The role of dimension reduction in ANN-based classifiers is determined. Detailed mutual information-based feature selection methods are provided. Simulations on KDD Cup'99 demonstrate the method's efficacy. Classifying big data is important for tackling most engineering, health, science, and business challenges. Labelled data samples train a classifier model, which classifies unlabelled data samples into numerous categories. Fuzzy logic and artificial neural networks (ANNs) are used to classify data in this dissertation.


    Diabetic diagnosis is an important research topic in the health care domain to analyze relevant microorganisms at an earlier stage. Due to the large growth in the world's population, feature subset selection receives a great deal of attention in every domain of research and is also a reliable tool for diabetic diagnosis. Several data mining techniques have been developed to evaluate the significant causes of diabetes with a minimal set of risk factors. This minimal set is selected without considering the potential significance of the risk factors or optimal feature subset selection, and hence fails to diagnose the pattern of diabetes accurately. In order to improve feature subset selection, an Integration of Fuzzy Rough Set Theory and Optimized Genetic Algorithm (IFRST-OGA) is introduced. The main objective of IFRST-OGA is to find optimal risk factors for efficient pattern recognition on diabetes healthcare data. Initially, feature selection is performed using Fuzzy Rough Set Theory (FRST) for diagnosing diabetes. After that, the Optimized Genetic Algorithm (OGA) is applied, which searches for an optimal feature subset through selection, crossover, and mutation operations to diagnose the disease at an earlier stage. This helps to identify the risk factors and diagnose diabetes efficiently. Experimental results show that the proposed IFRST-OGA improves performance in terms of true positive rate, computation time and diabetes diagnosis accuracy.
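
    The search through selection, crossover, and mutation can be sketched with a plain genetic algorithm over binary feature masks. This is a generic GA with truncation selection and elitism, not the paper's optimized variant, and it omits the fuzzy rough set stage. The fitness function and the set of "informative" feature indices are hypothetical stand-ins for a real classifier-based score.

```python
import random

def genetic_feature_selection(fitness, n_features, pop_size=20, generations=30,
                              mutation_rate=0.1, seed=0):
    """Search binary feature masks; returns (best mask, best fitness per generation)."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]    # selection: keep the fitter half
        children = [population[0]]               # elitism: best mask survives unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # mutation: flip each bit with probability mutation_rate
            child = tuple((bit ^ 1) if rng.random() < mutation_rate else bit
                          for bit in child)
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

# Hypothetical fitness: reward informative features, penalise every selected feature
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    return 2 * sum(mask[i] for i in INFORMATIVE) - sum(mask)

best, history = genetic_feature_selection(toy_fitness, n_features=8)
print(best, history[-1])
```

    Because the best mask is carried over unchanged each generation, the recorded best fitness never decreases. In practice the fitness would be something like cross-validated accuracy on the selected risk factors.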

    Big data analytics for preventive medicine

    © 2019, Springer-Verlag London Ltd., part of Springer Nature. Medical data is one of the most rewarding and yet most complicated kinds of data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard and incomplete healthcare data. It not only forecasts but also helps in decision making, and is increasingly seen as a breakthrough in ongoing advancement, with the goal of improving the quality of patient care and reducing healthcare cost. The aim of this study is to provide a comprehensive and structured overview of extensive research on the advancement of data analytics methods for disease prevention. This review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We summarize state-of-the-art data analytics algorithms used for classification of disease, clustering (unusually high incidence of a particular disease), anomaly detection (detection of disease) and association, as well as their respective advantages, drawbacks and guidelines for selecting a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations.