533 research outputs found

    Hierarchical fuzzy rule based classification systems with genetic rule selection for imbalanced data-sets

    Get PDF
    In many real application areas, the data used are highly skewed and the number of instances for some classes are much higher than that of the other classes. Solving a classification task using such an imbalanced data-set is difficult due to the bias of the training towards the majority classes. The aim of this paper is to improve the performance of fuzzy rule based classification systems on imbalanced domains, increasing the granularity of the fuzzy partitions on the boundary areas between the classes, in order to obtain a better separability. We propose the use of a hierarchical fuzzy rule based classification system, which is based on the refinement of a simple linguistic fuzzy model by means of the extension of the structure of the knowledge base in a hierarchical way and the use of a genetic rule selection process in order to get a compact and accurate model. The good performance of this approach is shown through an extensive experimental study carried out over a large collection of imbalanced data-sets.Spanish Ministry of Education and Science (MEC) under Projects TIN-2005-08386-C05-01 and TIN-2005-08386- C05-0

    Enhancing multi-class classification in FARC-HD fuzzy classifier: on the synergy between n-dimensional overlap functions and decomposition strategies

    Get PDF
    There are many real-world classification problems involving multiple classes, e.g., in bioinformatics, computer vision or medicine. These problems are generally more difficult than their binary counterparts. In this scenario, decomposition strategies usually improve the performance of classifiers. Hence, in this paper we aim to improve the behaviour of FARC-HD fuzzy classifier in multi-class classification problems using decomposition strategies, and more specifically One-vs-One (OVO) and One-vs-All (OVA) strategies. However, when these strategies are applied on FARC-HD a problem emerges due to the low confidence values provided by the fuzzy reasoning method. This undesirable condition comes from the application of the product t-norm when computing the matching and association degrees, obtaining low values, which are also dependent on the number of antecedents of the fuzzy rules. As a result, robust aggregation strategies in OVO such as the weighted voting obtain poor results with this fuzzy classifier. In order to solve these problems, we propose to adapt the inference system of FARC-HD replacing the product t-norm with overlap functions. To do so, we define n-dimensional overlap functions. The usage of these new functions allows one to obtain more adequate outputs from the base classifiers for the subsequent aggregation in OVO and OVA schemes. Furthermore, we propose a new aggregation strategy for OVO to deal with the problem of the weighted voting derived from the inappropriate confidences provided by FARC-HD for this aggregation method. The quality of our new approach is analyzed using twenty datasets and the conclusions are supported by a proper statistical analysis. In order to check the usefulness of our proposal, we carry out a comparison against some of the state-of-the-art fuzzy classifiers. Experimental results show the competitiveness of our method.This work was supported in part by the Spanish Ministry of Science and Technology under projects TIN2011-28488, TIN-2012-33856 and TIN-2013- 40765-P and the Andalusian Research Plan P10-TIC-6858 and P11-TIC-7765

    Literature Review of the Recent Trends and Applications in various Fuzzy Rule based systems

    Full text link
    Fuzzy rule based systems (FRBSs) is a rule-based system which uses linguistic fuzzy variables as antecedents and consequent to represent human understandable knowledge. They have been applied to various applications and areas throughout the soft computing literature. However, FRBSs suffers from many drawbacks such as uncertainty representation, high number of rules, interpretability loss, high computational time for learning etc. To overcome these issues with FRBSs, there exists many extensions of FRBSs. This paper presents an overview and literature review of recent trends on various types and prominent areas of fuzzy systems (FRBSs) namely genetic fuzzy system (GFS), hierarchical fuzzy system (HFS), neuro fuzzy system (NFS), evolving fuzzy system (eFS), FRBSs for big data, FRBSs for imbalanced data, interpretability in FRBSs and FRBSs which use cluster centroids as fuzzy rules. The review is for years 2010-2021. This paper also highlights important contributions, publication statistics and current trends in the field. The paper also addresses several open research areas which need further attention from the FRBSs research community.Comment: 49 pages, Accepted for publication in ijf

    An insight into imbalanced Big Data classification: outcomes and challenges

    Get PDF
    Big Data applications are emerging during the last years, and researchers from many disciplines are aware of the high advantages related to the knowledge extraction from this type of problem. However, traditional learning approaches cannot be directly applied due to scalability issues. To overcome this issue, the MapReduce framework has arisen as a “de facto” solution. Basically, it carries out a “divide-and-conquer” distributed procedure in a fault-tolerant way to adapt for commodity hardware. Being still a recent discipline, few research has been conducted on imbalanced classification for Big Data. The reasons behind this are mainly the difficulties in adapting standard techniques to the MapReduce programming style. Additionally, inner problems of imbalanced data, namely lack of data and small disjuncts, are accentuated during the data partitioning to fit the MapReduce programming style. This paper is designed under three main pillars. First, to present the first outcomes for imbalanced classification in Big Data problems, introducing the current research state of this area. Second, to analyze the behavior of standard pre-processing techniques in this particular framework. Finally, taking into account the experimental results obtained throughout this work, we will carry out a discussion on the challenges and future directions for the topic.This work has been partially supported by the Spanish Ministry of Science and Technology under Projects TIN2014-57251-P and TIN2015-68454-R, the Andalusian Research Plan P11-TIC-7765, the Foundation BBVA Project 75/2016 BigDaPTOOLS, and the National Science Foundation (NSF) Grant IIS-1447795
    corecore