    HANDLING MISSING ATTRIBUTE VALUES IN DECISION TABLES USING VALUED TOLERANCE APPROACH

    Rule induction is one of the key areas in data mining, as it is applied to a large number of real-life data sets. However, in such real-life data, the information is incompletely specified most of the time. To induce rules from these incomplete data, more powerful algorithms are necessary. This research work mainly focuses on a probabilistic approach based on the valued tolerance relation. This thesis is divided into two parts. The first part describes the implementation of the valued tolerance relation. The induced rules are then evaluated based on the error rate due to incorrectly classified and unclassified examples. The second part of this research work compares the rules induced by the previously implemented MLEM2 algorithm with the rules induced by the valued tolerance based approach, which was implemented as part of this research. Hence, through this thesis, the error rates for the MLEM2 algorithm and the valued tolerance based approach are compared and the results are documented.

    Determinants of Long-term Economic Development: An Empirical Cross-country Study Involving Rough Sets Theory and Rule Induction

    Empirical findings on determinants of long-term economic growth are numerous, sometimes inconsistent, highly exciting and still incomplete. The empirical analysis was almost exclusively carried out by standard econometrics. This study compares results gained by cross-country regressions as reported in the literature with those gained by the rough sets theory and rule induction. The main advantages of using rough sets are being able to classify classes and to discretize. Thus, we do not have to deal with distributional, independence, (log-)linearity, and many other assumptions, but can keep the data as they are. The main difference between regression results and rough sets is that most education and human capital indicators can be labeled as robust attributes. In addition, we find that political indicators enter in a non-linear fashion with respect to growth.
    Keywords: Economic growth, Rough sets, Rule induction

    Rough-set-based ADR signaling from spontaneous reporting data with missing values

    Spontaneous reporting systems for adverse drug events have been widely established in many countries to collect as many adverse drug events as possible and so facilitate the detection of suspected ADR signals via statistical or data mining methods. Unfortunately, due to privacy concerns or other reasons, reporters sometimes deliberately omit some attributes, leaving many missing values in the reporting database. Most research on ADR detection, and most methods applied in practice, simply adopt listwise deletion to eliminate all data with missing values. Very little work has considered the possibility, or examined the effect, of including the missing data in the process of ADR detection. This paper represents our endeavor towards the exploration of this question. We aim at inspecting the feasibility of applying rough set theory to the ADR detection problem. Based on the concept of utilizing characteristic set based approximation to measure the strength of ADR signals, we propose twelve different rough set based measuring methods and show that only six of them are feasible for the purpose. Experimental results on the FARES database show that our rough-set-based approach exhibits similar capability in timeline warning of suspicious ADR signals as the traditional method with deletion of missing data, and sometimes can yield noteworthy measures earlier than the traditional method.

    A semantical and computational approach to covering-based rough sets

    Nucleation of Al3Zr and Al3Sc in aluminum alloys: from kinetic Monte Carlo simulations to classical theory

    Zr and Sc precipitate in aluminum alloys to form the compounds Al3Zr and Al3Sc which for low supersaturations of the solid solution have the L12 structure. The aim of the present study is to model at an atomic scale the kinetics of precipitation and to build a mesoscopic model based on classical nucleation theory so as to extend the field of supersaturations and annealing times that can be simulated. We use some ab-initio calculations and experimental data to fit an Ising model describing thermodynamics of the Al-Zr and Al-Sc systems. Kinetic behavior is described by means of an atom-vacancy exchange mechanism. This allows us to simulate with a kinetic Monte Carlo algorithm the kinetics of precipitation of Al3Zr and Al3Sc. These kinetics are then used to test the classical nucleation theory. For this purpose, we deduce from our atomic model an isotropic interface free energy, which is consistent with the one deduced from experimental kinetics, and a nucleation free energy. We test different mean-field approximations (Bragg-Williams approximation as well as Cluster Variation Method) for these parameters. The classical nucleation theory is coherent with the kinetic Monte Carlo simulations only when CVM is used: it manages to reproduce the cluster size distribution in the metastable solid solution and its evolution as well as the steady-state nucleation rate. We also find that the capillary approximation used in the classical nucleation theory works surprisingly well when compared to a direct calculation of the free energy of formation for small L12 clusters.
    Comment: submitted to Physical Review B (2004)

    Rough set methodology in meta-analysis - a comparative and exploratory analysis

    We study the applicability of the pattern recognition methodology "rough set data analysis" (RSDA) in the field of meta analysis. We give a summary of the mathematical and statistical background and then proceed to an application of the theory to a meta analysis of empirical studies dealing with the deterrent effect introduced by Becker and Ehrlich. Results are compared with a previously devised meta regression analysis. We find that the RSDA can be used to discover information overlooked by other methods, to preprocess the data for further study and to strengthen results previously found by other methods.
    Keywords: Rough Data Set, RSDA, Meta Analysis, Data Mining, Pattern Recognition, Deterrence, Criminometrics