
    HANDLING MISSING ATTRIBUTE VALUES IN DECISION TABLES USING VALUED TOLERANCE APPROACH

    Rule induction is one of the key areas of data mining because it is applied to a large number of real-life data sets. In such data, however, the information is incompletely specified most of the time, and inducing rules from incomplete data requires more powerful algorithms. This research focuses on a probabilistic approach based on the valued tolerance relation. The thesis is divided into two parts. The first part describes the implementation of the valued tolerance relation; the induced rules are then evaluated by the error rate due to incorrectly classified and unclassified examples. The second part compares the rules induced by a previously implemented MLEM2 algorithm with the rules induced by the valued tolerance based approach implemented as part of this research. The error rates of the MLEM2 algorithm and the valued tolerance based approach are compared and the results are documented.
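    The abstract above does not spell out the valued tolerance relation, so the following is only a minimal sketch of one common probabilistic reading, not the thesis's implementation. It assumes that a missing value is uniformly distributed over its attribute's domain and that attributes are independent; the `MISSING` marker, the function names, and the sample cases are all hypothetical.

```python
from math import prod

MISSING = None  # hypothetical marker for a missing attribute value


def attribute_tolerance(u, v, domain_size):
    """Probability that two attribute values coincide.

    Assumption: a missing value is uniformly distributed over a domain of
    `domain_size` values, so any comparison involving a missing value
    succeeds with probability 1 / domain_size.
    """
    if u is MISSING or v is MISSING:
        return 1.0 / domain_size
    return 1.0 if u == v else 0.0


def valued_tolerance(x, y, domain_sizes):
    """Degree to which cases x and y are tolerant of each other.

    Assumption: attributes are independent, so the per-attribute
    probabilities are multiplied.
    """
    return prod(
        attribute_tolerance(u, v, n) for u, v, n in zip(x, y, domain_sizes)
    )


# Hypothetical cases over three attributes with domain sizes 3, 2 and 2.
case_a = ("high", MISSING, "yes")
case_b = ("high", "low", "yes")
case_c = ("low", "low", "yes")
sizes = (3, 2, 2)

print(valued_tolerance(case_a, case_b, sizes))  # one missing value on a 2-value attribute
print(valued_tolerance(case_a, case_c, sizes))  # known values disagree on the first attribute
```

    Under these assumptions, two cases that disagree on any known value get degree 0, and each missing value lowers the degree by a factor of the attribute's domain size.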

    Determinants of Long-term Economic Development: An Empirical Cross-country Study Involving Rough Sets Theory and Rule Induction

    Empirical findings on the determinants of long-term economic growth are numerous, sometimes inconsistent, highly exciting, and still incomplete. The empirical analysis has been carried out almost exclusively with standard econometrics. This study compares results from cross-country regressions reported in the literature with those obtained by rough sets theory and rule induction. The main advantages of using rough sets are being able to classify classes and to discretize. Thus, we do not have to deal with distributional, independence, (log-)linearity, and many other assumptions, but can keep the data as they are. The main difference between the regression results and the rough sets results is that, under rough sets, most education and human capital indicators can be labeled as robust attributes. In addition, we find that political indicators enter in a non-linear fashion with respect to growth.
    Keywords: Economic growth, Rough sets, Rule induction

    Rough-set-based ADR signaling from spontaneous reporting data with missing values

    Spontaneous reporting systems for adverse drug events have been widely established in many countries to collect as many adverse drug events as possible and so facilitate the detection of suspected ADR signals via statistical or data mining methods. Unfortunately, due to privacy concerns or other reasons, reporters sometimes deliberately omit some attributes, leaving many missing values in the reporting database. Most research on ADR detection, and most methods applied in practice, simply adopt listwise deletion to eliminate all data with missing values. Very little work has noticed the possibility, or examined the effect, of including the missing data in the process of ADR detection. This paper represents our endeavor to explore this question. We aim to inspect the feasibility of applying rough set theory to the ADR detection problem. Based on the concept of using characteristic set based approximation to measure the strength of ADR signals, we propose twelve different rough set based measuring methods and show that only six of them are feasible for the purpose. Experimental results on the FARES database show that our rough-set-based approach exhibits similar capability in timely warning of suspicious ADR signals as the traditional method that deletes records with missing values, and sometimes yields noteworthy measures earlier than the traditional method.

    A semantical and computational approach to covering-based rough sets

    Rough set methodology in meta-analysis - a comparative and exploratory analysis

    We study the applicability of the pattern recognition methodology "rough set data analysis" (RSDA) in the field of meta-analysis. We give a summary of the mathematical and statistical background and then apply the theory to a meta-analysis of empirical studies dealing with the deterrent effect introduced by Becker and Ehrlich. Results are compared with a previously devised meta-regression analysis. We find that RSDA can be used to discover information overlooked by other methods, to preprocess the data for further study, and to strengthen results previously found by other methods.
    Keywords: Rough Data Set, RSDA, Meta Analysis, Data Mining, Pattern Recognition, Deterrence, Criminometrics
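    The mathematical background mentioned above is not reproduced in the abstract. As a hedged illustration of the standard rough set construction for complete data (not the authors' RSDA code), the sketch below groups cases into indiscernibility classes and derives the lower and upper approximations of a target set. The table, attribute indices, and function name are hypothetical.

```python
from collections import defaultdict


def approximations(rows, condition_idx, target):
    """Return (lower, upper) approximations of `target`.

    rows: dict mapping case id -> tuple of attribute values (no missing values)
    condition_idx: indices of the attributes used to compare cases
    target: set of case ids belonging to the concept of interest
    """
    # Cases with identical values on the chosen attributes are indiscernible.
    classes = defaultdict(set)
    for case_id, values in rows.items():
        key = tuple(values[i] for i in condition_idx)
        classes[key].add(case_id)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # block lies entirely inside the concept
            lower |= block
        if block & target:    # block overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical table: cases 1 and 2 cannot be told apart.
table = {
    1: ("a", "x"),
    2: ("a", "x"),
    3: ("b", "y"),
    4: ("b", "z"),
}
lower, upper = approximations(table, condition_idx=(0, 1), target={1, 3})
print(sorted(lower), sorted(upper))
```

    Cases in the lower approximation certainly belong to the concept given the chosen attributes; cases in the upper approximation only possibly belong to it.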

    Data mining a prostate cancer dataset using rough sets

    Prostate cancer remains one of the leading causes of cancer death worldwide, with a reported incidence of 650,000 cases per annum. The causal factors of prostate cancer remain to be determined. In this paper, we investigate a medical dataset containing clinical information on 502 prostate cancer patients using the machine learning technique of rough sets. Our preliminary results yield a classification accuracy of 90%, with high sensitivity and specificity (both approximately 91%), a positive predictive value (PPV) of 81%, and a negative predictive value (NPV) of 95%. In addition to the high classification accuracy of our system, the rough set approach provides a rule-based inference mechanism for information extraction that is suitable for integration into a rule-based system. The generated rules relate directly to the attributes and their values and provide a direct mapping between them.
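    For readers unfamiliar with the measures quoted above, the sketch below shows how accuracy, sensitivity, specificity, and the two predictive values are conventionally derived from confusion-matrix counts. The counts are hypothetical and chosen only for easy arithmetic; they are not the paper's data and do not reproduce its figures.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, not taken from the prostate cancer dataset.
metrics = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```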