6 research outputs found

    A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

    Get PDF
    Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data

    Effectiveness of Support Vector Machines in Medical Data mining

    Get PDF
    The idea of medical data mining is to extract hidden knowledge in medical field using data mining techniques. One of the positive aspects is to discover the important patterns. It is possible to identify patterns even if we do not have fully understood the casual mechanisms behind those patterns. In this case, data mining prepares the ability of research and discovery that may not have been evident. This paper analyzes the effectiveness of SVM, the most popular classification techniques in classifying medical datasets. This paper analyses the performance of the NaĂŻve Bayes classifier, RBF network and SVM Classifier. The performance of predictive model is analysed with different medical datasets in predicting diseases is recorded and compared. The datasets were of binary class and each dataset had different number of attributes. The datasets include heart datasets, cancer and diabetes datasets. It is observed that SVM classifier produces better percentage of accuracy in classification. The work has been implemented in WEKA environment and obtained results show that SVM is the most robust and effective classifier for medical data sets

    Evidence Fusion using D-S Theory: utilizing a progressively evolving reliability factor in wireless networks

    Get PDF
    The Dempster-Shafer (D-S) theory provides a method to combine evidence from multiple nodes to estimate the likelihood of an intrusion. The theory\u27s rule of combination gives a numerical method to fuse multiple pieces of information to derive a conclusion. But, D-S theory has its shortcomings when used in situations where evidence has significant conflict. Though the observers may have different values of uncertainty in the observed data, D-S theory considers the observers to be equally trustworthy. This thesis introduces a new method of combination based on D-S theory and Consensus method, that takes into consideration the reliability of evidence used in data fusion. The new method\u27s results have been compared against three other methods of evidence fusion to objectively analyze how they perform under Denial of Service attacks and Xmas tree scan attacks

    The Unbalanced Classification Problem: Detecting Breaches in Security

    Get PDF
    This research proposes several methods designed to improve solutions for security classification problems. The security classification problem involves unbalanced, high-dimensional, binary classification problems that are prevalent today. The imbalance within this data involves a significant majority of the negative class and a minority positive class. Any system that needs protection from malicious activity, intruders, theft, or other types of breaches in security must address this problem. These breaches in security are considered instances of the positive class. Given numerical data that represent observations or instances which require classification, state of the art machine learning algorithms can be applied. However, the unbalanced and high-dimensional structure of the data must be considered prior to applying these learning methods. High-dimensional data poses a “curse of dimensionality” which can be overcome through the analysis of subspaces. Exploration of intelligent subspace modeling and the fusion of subspace models is proposed. Detailed analysis of the one-class support vector machine, as well as its weaknesses and proposals to overcome these shortcomings are included. A fundamental method for evaluation of the binary classification model is the receiver operating characteristic (ROC) curve and the area under the curve (AUC). This work details the underlying statistics involved with ROC curves, contributing a comprehensive review of ROC curve construction and analysis techniques to include a novel graphic for illustrating the connection between ROC curves and classifier decision values. The major innovations of this work include synergistic classifier fusion through the analysis of ROC curves and rankings, insight into the statistical behavior of the Gaussian kernel, and novel methods for applying machine learning techniques to defend against computer intrusion detection. The primary empirical vehicle for this research is computer intrusion detection data, and both host-based intrusion detection systems (HIDS) and network-based intrusion detection systems (NIDS) are addressed. Empirical studies also include military tactical scenarios

    Machine Learning Approaches and Web-Based System to the Application of Disease Modifying Therapy for Sickle Cell

    Get PDF
    Sickle cell disease (SCD) is a common serious genetic disease, which has a severe impact due to red blood cell (RBCs) abnormality. According to the World Health Organisation, 7 million newborn babies each year suffer either from the congenital anomaly or from an inherited disease, primarily from thalassemia and sickle cell disease. In the case of SCD, recent research has shown the beneficial effects of a drug called hydroxyurea/hydroxycarbamide in modifying the disease phenotype. The clinical management of this disease-modifying therapy is difficult and time consuming for clinical staff. This includes finding an optimal classifier that can help to solve the issues with missing values, multi-class datasets, and features selection. For the classification and discriminant analysis of SCD datasets, 7 classifiers based on machine learning models are selected representing linear and non-linear methods. After running these classifiers with a single model, the results revealed that a single classifier has provided us with effective outcomes in terms of the classification performance evaluation metric. In order to produce such an optimal outcome, this research proposed and designed combined classifiers (ensemble classifiers) among the neural network’s models, the random forest classifier, and the K-nearest neighbour classifier. In this aspect, combining the levenberg-marquardt algorithm, the voted perceptron classifier, the radial basis neural classifier, and random forest classifier obtain the highest rate of performance and accuracy. This ensemble classifier receives better results during the training set and testing set process. Recent technology advances based on smart devices have improved the medical facilities and become increasingly popular in association with real-time health monitoring and remote/personal health-care. The web-based system developed under the supervision of the haematology specialist at the Alder Hey Children’s Hospital in order to produce such an effective and useful system for both patients and clinicians. To sum up, the simulation experiment concludes that using machine learning and the web-based system platforms represents an alternative procedure that could assist healthcare professionals, particularly for the specialist nurse and junior doctor to improve the quality of care with sickle cell disorder

    Evidence combination in medical data mining

    No full text
    corecore