5 research outputs found

    Rough matroids based on coverings

    The introduction of covering-based rough sets has made a substantial contribution to classical rough set theory. However, many vital problems in rough sets, including attribute reduction, are NP-hard, so the algorithms for solving them are usually greedy. Matroids, as a generalization of linear independence in vector spaces, have a variety of applications in fields such as algorithm design and combinatorial optimization. An excellent introduction to rough matroids is due to Zhu and Wang. Building on their work, this paper studies rough matroids based on coverings. First, we investigate some properties of the definable sets with respect to a covering. In particular, the set of all definable sets with respect to a covering, ordered by inclusion $\subseteq$, forms a lattice. Second, we propose rough matroids based on coverings, which generalize rough matroids based on relations. Finally, we explore some properties of rough matroids based on coverings and present an equivalent formulation of them. These results exhibit many potential connections between rough sets and matroids. Comment: 15 pages
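The claim about definable sets can be illustrated on a toy example. The sketch below is not taken from the paper: it assumes one common pair of covering approximation operators (lower = union of blocks contained in X, upper = union of blocks meeting X) and calls X definable when the two coincide. The paper may use a different definition, and all function names here are hypothetical.

```python
from itertools import combinations


def lower_approx(covering, x):
    """Union of the covering blocks that lie entirely inside x (assumed definition)."""
    return frozenset().union(*(k for k in covering if k <= x))


def upper_approx(covering, x):
    """Union of the covering blocks that intersect x (assumed definition)."""
    return frozenset().union(*(k for k in covering if k & x))


def definable_sets(universe, covering):
    """All subsets of the universe whose lower and upper approximations coincide."""
    elems = sorted(universe)
    found = set()
    for size in range(len(elems) + 1):
        for combo in combinations(elems, size):
            x = frozenset(combo)
            if lower_approx(covering, x) == upper_approx(covering, x):
                found.add(x)
    return found


def closed_under_union_and_intersection(family):
    """Check that the family contains a | b and a & b for every pair a, b."""
    return all((a | b) in family and (a & b) in family
               for a in family for b in family)


# Toy data: a covering of {1, 2, 3, 4} that is not a partition.
universe = frozenset({1, 2, 3, 4})
covering = [frozenset({1, 2}), frozenset({2, 3}), frozenset({4})]

family = definable_sets(universe, covering)
print(sorted(sorted(s) for s in family))
print(closed_under_union_and_intersection(family))
```

A family of sets that is closed under union and intersection is a lattice under inclusion, with union as join and intersection as meet, so the closure check is a quick sanity test of the lattice property for this particular choice of operators. The brute-force enumeration is exponential in the size of the universe and is meant only for small examples.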

    Rough Set Based Rule Evaluations and Their Applications

    Knowledge discovery is an important process in data analysis, data mining and machine learning. Typically, knowledge is presented in the form of rules. However, knowledge discovery systems often generate a huge number of rules. One of the challenges we face is how to automatically discover interesting and meaningful knowledge from such discovered rules, since it is infeasible for human beings to select important and interesting rules manually. Our focus is therefore on providing measures that evaluate the quality of rules in order to make data mining results easier to understand. In this thesis, we present a series of rule evaluation techniques for the purpose of facilitating the knowledge understanding process. These evaluation techniques help not only to reduce the number of rules, but also to extract higher quality rules. Empirical studies on both artificial and real-world data sets demonstrate how such techniques can contribute to practical systems such as those for medical diagnosis and web personalization. In the first part of this thesis, we discuss several rule evaluation techniques proposed for rule postprocessing. We show how properly defined rule templates can be used as a rule evaluation approach. We propose two rough set based measures, a Rule Importance Measure and a Rules-As-Attributes Measure, to rank the important and interesting rules. In the second part of this thesis, we show how data preprocessing can help with rule evaluation. Because well-preprocessed data is essential for generating important rules, we propose a new approach to processing missing attribute values that enhances the generated rules. In the third part of this thesis, a rough set based rule evaluation system is demonstrated to show the effectiveness of the proposed measures. Furthermore, a new user-centric web personalization system is used as a case study to demonstrate how the proposed evaluation measures can be used in an actual application.

    Methods of applying domain knowledge to improve the quality of classifiers (original title: Metody stosowania wiedzy dziedzinowej do poprawiania jakości klasyfikatorów)

    The dissertation deals with methods that use domain knowledge to improve the quality of classifiers, where the improvement concerns feature extraction methods, classifier construction methods, and methods for predicting decision values for new objects. In particular, the following methods are proposed: expert features (attributes) defined using domain knowledge expressed in a language based on temporal logic; a new method of measuring the quality of cuts during supervised discretization, using a matrix of distances between decision attribute values defined from domain knowledge; a new decision tree that uses redundant cuts to verify the partition of a tree node; a new method for determining similarities between objects (e.g. patients) using an ontology defined by an expert, with its application to the construction of a k-nearest neighbors classifier; and a new method for generating cross rules that describe the effect of a factor interfering with perception based on a classifier. All of these methods have been implemented in the CommoDM software library, which is one of the extensions of the RSES-lib library. The implemented methods have been tested on real data sets: benchmark data sets known from the literature as well as our own medical data sets collected during the preparation of the dissertation. The latter are associated with the medical aspect of the dissertation, which concerns supporting the treatment of patients with stable ischemic heart disease; the main medical problem considered in the thesis is predicting the presence of significant coronary artery stenosis based on non-invasive heart monitoring by the Holter method. The results of the experiments confirm the effectiveness of applying additional domain knowledge when creating and testing classifiers: after the new methods were applied, the quality of the classifiers increased considerably. At the same time, the clinical interpretation of the results is more consistent with medical knowledge. The research has been supported by the grant DEC-2013/09/B/ST6/01568 and the grant DEC-2013/09/B/NZ5/00758, both from the National Science Centre of the Republic of Poland. The results were published in 10 publications, including 3 publications in journals from the A list of the Polish Ministry of Science and Higher Education, 3 publications indexed in the Web of Science, one chapter in a monograph and 3 post-conference publications.