1,363 research outputs found

    Hybrid Method HVS-MRMR for Variable Selection in Multilayer Artificial Neural Network Classifier

    Variable selection is an important technique for reducing the dimensionality of data and is frequently used in data preprocessing for data mining. This paper presents a new variable selection algorithm that uses heuristic variable selection (HVS) and Minimum Redundancy Maximum Relevance (MRMR). We enhance the HVS method for variable selection by incorporating the MRMR filter. Our algorithm is based on a wrapper approach using a multi-layer perceptron. We call this algorithm the HVS-MRMR wrapper for variable selection. The relevance of a set of variables is measured by a convex combination of the relevance given by the HVS criterion and the MRMR criterion. This approach selects new relevant variables; we evaluate the performance of HVS-MRMR on eight benchmark classification problems. The experimental results show that, on most datasets, HVS-MRMR selects fewer variables while maintaining high classification accuracy compared to MRMR alone, HVS alone, and no variable selection. HVS-MRMR can be applied to various classification problems that require high classification accuracy.
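As a rough illustration of the convex combination mentioned in this abstract, the sketch below mixes two per-variable relevance scores with a single weight. The abstract does not give the formula, the weight, or any scaling step, so the function name `combined_relevance`, the parameter `alpha`, and the assumption that both scores are already on comparable scales are hypothetical choices made for this example, not the authors' method.

```python
def combined_relevance(hvs_scores, mrmr_scores, alpha=0.5):
    """Convex combination of two relevance scores per variable.

    hvs_scores, mrmr_scores: dicts mapping variable name -> score.
    alpha: weight in [0, 1] given to the HVS score (hypothetical parameter).
    Assumes both score sets are already on comparable scales.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    if hvs_scores.keys() != mrmr_scores.keys():
        raise ValueError("both score sets must cover the same variables")
    return {
        name: alpha * hvs_scores[name] + (1.0 - alpha) * mrmr_scores[name]
        for name in hvs_scores
    }


def rank_variables(scores):
    """Return variable names sorted from most to least relevant."""
    return sorted(scores, key=scores.get, reverse=True)


# Made-up scores for two illustrative variables.
example = combined_relevance({"a": 1.0, "b": 0.0}, {"a": 0.0, "b": 1.0}, alpha=0.25)
ranking = rank_variables(example)
```

With `alpha=1.0` the ranking reduces to the first criterion alone, and with `alpha=0.0` to the second, which is the property that makes the mix a convex combination.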

    A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

    Due to recent advances in technology, large volumes of medical data are now being collected. These data contain valuable information, so data mining techniques can be used to extract useful patterns. This paper introduces data mining and its various techniques and surveys the available literature on medical data mining. We focus mainly on the application of data mining to skin diseases. A categorization is provided based on the different data mining techniques, and the utility of the various data mining methodologies is highlighted. Generally, association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we summarize the different uses of classification in dermatology. It is one of the most important methods for the diagnosis of erythemato-squamous diseases. Different methods, such as Neural Networks, Genetic Algorithms and fuzzy classification, are used in this area. Clustering is a useful method in medical image mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we investigate some challenges that exist in mining skin data.

    Pharmacogenomics of drug efficacy in the interferon treatment of chronic hepatitis C using classification algorithms

    Chronic hepatitis C (CHC) patients often stop pursuing interferon-alfa and ribavirin (IFN-alfa/RBV) treatment because of the high cost and associated adverse effects. It is highly desirable, both clinically and economically, to establish tools to distinguish responders from nonresponders and to predict possible outcomes of the IFN-alfa/RBV treatments. Single nucleotide polymorphisms (SNPs) can be used to understand the relationship between genetic inheritance and IFN-alfa/RBV therapeutic response. The aim of this study was to establish a predictive model based on a pharmacogenomic approach. Our study population comprised Taiwanese patients with CHC who were recruited from multiple sites in Taiwan. The genotyping data were generated in the high-throughput genomics lab of Vita Genomics, Inc. Using a wrapper-based feature selection approach, we employed a multilayer feedforward neural network (MFNN) and logistic regression as a basis for comparison. Our data revealed that the MFNN models were superior to the logistic regression model. The MFNN approach provides an efficient way to develop a tool for distinguishing responders from nonresponders prior to treatment. Our preliminary results demonstrated that the MFNN algorithm is effective for deriving models for pharmacogenomics studies and for providing the link from clinical factors such as SNPs to the responsiveness of IFN-alfa/RBV in clinical association studies in pharmacogenomics.

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

    On the role of pre and post-processing in environmental data mining

    The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data tend to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of getting low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved. The role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems is discussed.