17 research outputs found
A survey on utilization of data mining approaches for dermatological (skin) diseases prediction
Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data
Differential diagnosis of Erythmato-Squamous Diseases using classification and regression tree
Introduction: Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process for detection of hidden patterns. In the case of ESD, data mining help us to predict the diseases. Different algorithms were developed for this purpose. Objective: we aimed to use the Classification and Regression Tree (CART) to predict differential diagnosis of ESD. Methods: we used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set from machine learning repository, UCI was obtained. The Clementine 12.0 software from IBM Company was used for modelling. In order to evaluation of the model we calculate the accuracy, sensitivity and specificity of the model. Results: The proposed model had an accuracy of 94.84 (Standard Deviation: 24.42) in order to correct prediction of the ESD disease. Conclusions: Results indicated that using of this classifier could be useful. But, it would be strongly recommended that the combination of machine learning methods could be more useful in terms of prediction of ESD. © 2016 Keivan Maghooli, Mostafa Langarizadeh, Leila Shahmoradi, Mahdi Habibi-koolaee, Mohamad Jebraeily, and Hamid Bouraghi
A Hybrid Deep Learning Approach for Diagnosis of the Erythemato-Squamous Disease
The diagnosis of the Erythemato-squamous disease (ESD) is accepted as a
difficult problem in dermatology. ESD is a form of skin disease. It generally
causes redness of the skin and also may cause loss of skin. They are generally
due to genetic or environmental factors. ESD comprises six classes of skin
conditions namely, pityriasis rubra pilaris, lichen planus, chronic dermatitis,
psoriasis, seboreic dermatitis and pityriasis rosea. The automated diagnosis of
ESD can help doctors and dermatologists in reducing the efforts from their end
and in taking faster decisions for treatment. The literature is replete with
works that used conventional machine learning methods for the diagnosis of ESD.
However, there isn't much instances of application of Deep learning for the
diagnosis of ESD. In this paper, we propose a novel hybrid deep learning
approach i.e. Derm2Vec for the diagnosis of the ESD. Derm2Vec is a hybrid deep
learning model that consists of both Autoencoders and Deep Neural Networks. We
also apply a conventional Deep Neural Network (DNN) for the classification of
ESD. We apply both Derm2Vec and DNN along with other traditional machine
learning methods on a real world dermatology dataset. The Derm2Vec method is
found to be the best performer (when taking the prediction accuracy into
account) followed by DNN and Extreme Gradient Boosting.The mean CV score of
Derm2Vec, DNN and Extreme Gradient Boosting are 96.92 percent, 96.65 percent
and 95.80 percent respectively.Comment: Pre-review version of the paper accepted at the 2020 IEEE
International Conference on Electronics, Computing and Communication
Technologies (CONECCT
Dermatology disease classification via novel evolutionary artificial neural network
Neuro-genetic systems are biologically inspired computational models that use evolutionary algorithms (EAs) in conjunction with neural networks (NNs) to solve problems. They are especially useful in classification problems in which classifier systems are not able to provide easy answers. In this paper a novel neuro-genetic approach is used in order to predict a known classification problem, related to dermatology diseases
Implementing decision tree-based algorithms in medical diagnostic decision support systems
As a branch of healthcare, medical diagnosis can be defined as finding the disease based on the signs and symptoms of the patient. To this end, the required information is gathered from different sources like physical examination, medical history and general information of the patient. Development of smart classification models for medical diagnosis is of great interest amongst the researchers. This is mainly owing to the fact that the machine learning and data mining algorithms are capable of detecting the hidden trends between features of a database. Hence, classifying the medical datasets using smart techniques paves the way to design more efficient medical diagnostic decision support systems.
Several databases have been provided in the literature to investigate different aspects of diseases. As an alternative to the available diagnosis tools/methods, this research involves machine learning algorithms called Classification and Regression Tree (CART), Random Forest (RF) and Extremely Randomized Trees or Extra Trees (ET) for the development of classification models that can be implemented in computer-aided diagnosis systems. As a decision tree (DT), CART is fast to create, and it applies to both the quantitative and qualitative data. For classification problems, RF and ET employ a number of weak learners like CART to develop models for classification tasks.
We employed Wisconsin Breast Cancer Database (WBCD), Z-Alizadeh Sani dataset for coronary artery disease (CAD) and the databanks gathered in Ghaem Hospital’s dermatology clinic for the response of patients having common and/or plantar warts to the cryotherapy and/or immunotherapy methods. To classify the breast cancer type based on the WBCD, the RF and ET methods were employed. It was found that the developed RF and ET models forecast the WBCD type with 100% accuracy in all cases. To choose the proper treatment approach for warts as well as the CAD diagnosis, the CART methodology was employed. The findings of the error analysis revealed that the proposed CART models for the applications of interest attain the highest precision and no literature model can rival it. The outcome of this study supports the idea that methods like CART, RF and ET not only improve the diagnosis precision, but also reduce the time and expense needed to reach a diagnosis. However, since these strategies are highly sensitive to the quality and quantity of the introduced data, more extensive databases with a greater number of independent parameters might be required for further practical implications of the developed models
Recommended from our members
The role of classifiers in feature selection: Number vs nature
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Wrapper feature selection approaches are widely used to select a small subset of relevant features from a dataset. However, Wrappers suffer from the fact that they only use a single classifier when selecting the features. The problem of using a single classifier is that each classifier is of a different nature and will have its own biases. This means that each classifier will select different feature subsets. To address this problem, this thesis aims to investigate the effects of using different classifiers for Wrapper feature selection. More specifically, it aims to investigate the effects of using different number of classifiers and classifiers of different nature.
This aim is achieved by proposing a new data mining method called Wrapper-based Decision Trees (WDT). The WDT method has the ability to combine multiple classifiers from four different families, including Bayesian Network, Decision Tree, Nearest Neighbour and Support Vector Machine, to select relevant features and visualise the relationships among the selected features using decision trees. Specifically, the WDT method is applied to investigate three research questions of this thesis: (1) the effects of number of classifiers on feature selection results; (2) the effects of nature of classifiers on feature selection results; and (3) which of the two (i.e., number or nature of classifiers) has more of an effect on feature selection results. Two types of user preference datasets derived from Human-Computer Interaction (HCI) are used with WDT to assist in answering these three research questions.
The results from the investigation revealed that the number of classifiers and nature of classifiers greatly affect feature selection results. In terms of number of classifiers, the results showed that few classifiers selected many relevant features whereas many classifiers selected few relevant features. In addition, it was found that using three classifiers resulted in highly accurate feature subsets. In terms of nature of classifiers, it was showed that Decision Tree, Bayesian Network and Nearest Neighbour classifiers caused signficant differences in both the number of features selected and the accuracy levels of the features. A comparison of results regarding number of classifiers and nature of classifiers revealed that the former has more of an effect on feature selection than the latter.
The thesis makes contributions to three communities: data mining, feature selection, and HCI. For the data mining community, this thesis proposes a new method called WDT which integrates the use of multiple classifiers for feature selection and decision trees to effectively select and visualise the most relevant features within a dataset. For the feature selection community, the results of this thesis have showed that the number of classifiers and nature of classifiers can truly affect the feature selection process. The results and suggestions based on the results can provide useful insight about classifiers when performing feature selection. For the HCI community, this thesis has showed the usefulness of feature selection for identifying a small number of highly relevant features for determining the preferences of different users