14,409 research outputs found

    Computing fuzzy rough approximations in large scale information systems

    Get PDF
    Rough set theory is a popular and powerful machine learning tool. It is especially suitable for dealing with information systems that exhibit inconsistencies, i.e. objects that have the same values for the conditional attributes but a different value for the decision attribute. In line with the emerging granular computing paradigm, rough set theory groups objects together based on the indiscernibility of their attribute values. Fuzzy rough set theory extends rough set theory to data with continuous attributes, and detects degrees of inconsistency in the data. Key to this is turning the indiscernibility relation into a gradual relation, acknowledging that objects can be similar to a certain extent. In very large datasets with millions of objects, computing the gradual indiscernibility relation (or in other words, the soft granules) is very demanding, both in terms of runtime and in terms of memory. It is however required for the computation of the lower and upper approximations of concepts in the fuzzy rough set analysis pipeline. Current non-distributed implementations in R are limited by memory capacity. For example, we found that a state of the art non-distributed implementation in R could not handle 30,000 rows and 10 attributes on a node with 62GB of memory. This is clearly insufficient to scale fuzzy rough set analysis to massive datasets. In this paper we present a parallel and distributed solution based on Message Passing Interface (MPI) to compute fuzzy rough approximations in very large information systems. Our results show that our parallel approach scales with problem size to information systems with millions of objects. To the best of our knowledge, no other parallel and distributed solutions have been proposed so far in the literature for this problem

    A comprehensive study of implicator-conjunctor based and noise-tolerant fuzzy rough sets: definitions, properties and robustness analysis

    Get PDF
    © 2014 Elsevier B.V. Both rough and fuzzy set theories offer interesting tools for dealing with imperfect data: while the former allows us to work with uncertain and incomplete information, the latter provides a formal setting for vague concepts. The two theories are highly compatible, and since the late 1980s many researchers have studied their hybridization. In this paper, we critically evaluate most relevant fuzzy rough set models proposed in the literature. To this end, we establish a formally correct and unified mathematical framework for them. Both implicator-conjunctor-based definitions and noise-tolerant models are studied. We evaluate these models on two different fronts: firstly, we discuss which properties of the original rough set model can be maintained and secondly, we examine how robust they are against both class and attribute noise. By highlighting the benefits and drawbacks of the different fuzzy rough set models, this study appears a necessary first step to propose and develop new models in future research.Lynn D’eer has been supported by the Ghent University Special Research Fund, Chris Cornelis was partially supported by the Spanish Ministry of Science and Technology under the project TIN2011-28488 and the Andalusian Research Plans P11-TIC-7765 and P10-TIC-6858, and by project PYR-2014-8 of the Genil Program of CEI BioTic GRANADA and Lluis Godo has been partially supported by the Spanish MINECO project EdeTRI TIN2012-39348-C02-01Peer Reviewe

    Autonomous clustering using rough set theory

    Get PDF
    This paper proposes a clustering technique that minimises the need for subjective human intervention and is based on elements of rough set theory. The proposed algorithm is unified in its approach to clustering and makes use of both local and global data properties to obtain clustering solutions. It handles single-type and mixed attribute data sets with ease and results from three data sets of single and mixed attribute types are used to illustrate the technique and establish its efficiency

    Rough set theory applied to pattern recognition of partial discharge in noise affected cable data

    Get PDF
    This paper presents an effective, Rough Set (RS) based, pattern recognition method for rejecting interference signals and recognising Partial Discharge (PD) signals from different sources. Firstly, RS theory is presented in terms of Information System, Lower and Upper Approximation, Signal Discretisation, Attribute Reduction and a flowchart of the RS based pattern recognition method. Secondly, PD testing of five types of artificial defect in ethylene-propylene rubber (EPR) cable is carried out and data pre-processing and feature extraction are employed to separate PD and interference signals. Thirdly, the RS based PD signal recognition method is applied to 4000 samples and is proven to have 99% accuracy. Fourthly, the RS based PD recognition method is applied to signals from five different sources and an accuracy of more than 93% is attained when a combination of signal discretisation and attribute reduction methods are applied. Finally, Back-propagation Neural Network (BPNN) and Support Vector Machine (SVM) methods are studied and compared with the developed method. The proposed RS method is proven to have higher accuracy than SVM and BPNN and can be applied for on-line PD monitoring of cable systems after training with valid sample data

    FEATURE SELECTION APPLIED TO THE TIME-FREQUENCY REPRESENTATION OF MUSCLE NEAR-INFRARED SPECTROSCOPY (NIRS) SIGNALS: CHARACTERIZATION OF DIABETIC OXYGENATION PATTERNS

    Get PDF
    Diabetic patients might present peripheral microcirculation impairment and might benefit from physical training. Thirty-nine diabetic patients underwent the monitoring of the tibialis anterior muscle oxygenation during a series of voluntary ankle flexo-extensions by near-infrared spectroscopy (NIRS). NIRS signals were acquired before and after training protocols. Sixteen control subjects were tested with the same protocol. Time-frequency distributions of the Cohen's class were used to process the NIRS signals relative to the concentration changes of oxygenated and reduced hemoglobin. A total of 24 variables were measured for each subject and the most discriminative were selected by using four feature selection algorithms: QuickReduct, Genetic Rough-Set Attribute Reduction, Ant Rough-Set Attribute Reduction, and traditional ANOVA. Artificial neural networks were used to validate the discriminative power of the selected features. Results showed that different algorithms extracted different sets of variables, but all the combinations were discriminative. The best classification accuracy was about 70%. The oxygenation variables were selected when comparing controls to diabetic patients or diabetic patients before and after training. This preliminary study showed the importance of feature selection techniques in NIRS assessment of diabetic peripheral vascular impairmen

    Determining rules for closing customer service centers: A public utility company's fuzzy decision

    Get PDF
    In the present work, we consider the general problem of knowledge acquisition under uncertainty. A commonly used method is to learn by examples. We observe how the expert solves specific cases and from this infer some rules by which the decision was made. Unique to this work is the fuzzy set representation of the conditions or attributes upon which the decision make may base his fuzzy set decision. From our examples, we infer certain and possible rules containing fuzzy terms. It should be stressed that the procedure determines how closely the expert follows the conditions under consideration in making his decision. We offer two examples pertaining to the possible decision to close a customer service center of a public utility company. In the first example, the decision maker does not follow too closely the conditions. In the second example, the conditions are much more relevant to the decision of the expert
    • …
    corecore