775 research outputs found

    A Genetic Tuning to Improve the Performance of Fuzzy Rule-Based Classification Systems with Interval-Valued Fuzzy Sets: Degree of Ignorance and Lateral Position

    Get PDF
    Fuzzy Rule-Based Systems are appropriate tools to deal with classification problems due to their good properties. However, they can suffer a lack of system accuracy as a result of the uncertainty inherent in the definition of the membership functions and the limitation of the homogeneous distribution of the linguistic labels. The aim of the paper is to improve the performance of Fuzzy Rule-Based Classification Systems by means of the Theory of Interval-Valued Fuzzy Sets and a post-processing genetic tuning step. In order to build the Interval-Valued Fuzzy Sets we define a new function called weak ignorance for modeling the uncertainty associated with the definition of the membership functions. Next, we adapt the fuzzy partitions to the problem in an optimal way through a cooperative evolutionary tuning in which we handle both the degree of ignorance and the lateral position (based on the 2-tuples fuzzy linguistic representation) of the linguistic labels. The experimental study is carried out over a large collection of data-sets and it is supported by a statistical analysis. Our results show empirically that the use of our methodology outperforms the initial Fuzzy Rule-Based Classification System. The application of our cooperative tuning enhances the results provided by the use of the isolated tuning approaches and also improves the behavior of the genetic tuning based on the 3-tuples fuzzy linguistic representation.Spanish Government TIN2008-06681-C06-01 TIN2010-1505

    An under-Sampled Approach for Handling Skewed Data Distribution using Cluster Disjuncts

    Get PDF
    In Data mining and Knowledge Discovery hidden and valuable knowledge from the data sources is discovered. The traditional algorithms used for knowledge discovery are bottle necked due to wide range of data sources availability. Class imbalance is a one of the problem arises due to data source which provide unequal class i.e. examples of one class in a training data set vastly outnumber examples of the other class(es). Researchers have rigorously studied several techniques to alleviate the problem of class imbalance, including resampling algorithms, and feature selection approaches to this problem. In this paper, we present a new hybrid frame work dubbed as Majority Under-sampling based on Cluster Disjunct (MAJOR_CD) for learning from skewed training data. This algorithm provides a simpler and faster alternative by using cluster disjunct concept. We conduct experiments using twelve UCI data sets from various application domains using five algorithms for comparison on six evaluation metrics. The empirical study suggests that MAJOR_CD have been believed to be effective in addressing the class imbalance problem

    Automatic synthesis of fuzzy systems: An evolutionary overview with a genetic programming perspective

    Get PDF
    Studies in Evolutionary Fuzzy Systems (EFSs) began in the 90s and have experienced a fast development since then, with applications to areas such as pattern recognition, curve‐fitting and regression, forecasting and control. An EFS results from the combination of a Fuzzy Inference System (FIS) with an Evolutionary Algorithm (EA). This relationship can be established for multiple purposes: fine‐tuning of FIS's parameters, selection of fuzzy rules, learning a rule base or membership functions from scratch, and so forth. Each facet of this relationship creates a strand in the literature, as membership function fine‐tuning, fuzzy rule‐based learning, and so forth and the purpose here is to outline some of what has been done in each aspect. Special focus is given to Genetic Programming‐based EFSs by providing a taxonomy of the main architectures available, as well as by pointing out the gaps that still prevail in the literature. The concluding remarks address some further topics of current research and trends, such as interpretability analysis, multiobjective optimization, and synthesis of a FIS through Evolving methods

    A single-objective and a multi-objective genetic algorithm to generate accurate and interpretable fuzzy rule based classifiers for the analysis of complex financial data

    Get PDF
    Nowadays, organizations deal with rapidly increasing amount of data that is stored in their databases. It has therefore become of crucial importance for them to identify the necessary patterns in these large databases to turn row data into valuable and actionable information. By exploring these important datasets, the organizations gain competitive advantage against other competitors, based on the assumption that the added value of Knowledge Management Systems strength is first and foremost to facilitate the decision making process. Especially if we consider the importance of knowledge in the 21st century, data mining can be seen as a very effective tool to explore the essential data that foster competitive gain in a changing environment. The overall aim of this study is to design the rule base component of a fuzzy rule-based system (FRBS) through the use of genetic algorithms. The main objective is to generate accurate and interpretable models of the data trying to overcome the existing tradeoff between accuracy and interpretability. We propose two different approaches: an accuracy-driven single-objective genetic algorithm, and a three-objective genetic algorithm that produce a Pareto front approximation, composed of classifiers with different tradeoffs between accuracy and complexity. The proposed approaches have been compared with two other systems, namely a rule selection single-objective algorithm, and a three-objective algorithm. The latter has been developed by the University of Pisa and is able to generate the rule base, while simultaneously learning the definition points of the membership functions, by taking into account both the accuracy and the interpretability of the final model

    Improving the performance of fuzzy rule-based classification systems with interval-valued fuzzy sets and genetic amplitude tuning

    Get PDF
    Among the computational intelligence techniques employed to solve classification problems, Fuzzy Rule-Based Classification Systems (FRBCSs) are a popular tool because of their interpretable models based on linguistic variables, which are easier to understand for the experts or end-users. The aim of this paper is to enhance the performance of FRBCSs by extending the Knowledge Base with the application of the concept of Interval-Valued Fuzzy Sets (IVFSs). We consider a post-processing genetic tuning step that adjusts the amplitude of the upper bound of the IVFS to contextualize the fuzzy partitions and to obtain a most accurate solution to the problem. We analyze the goodness of this approach using two basic and well-known fuzzy rule learning algorithms, the Chi et al.’s method and the fuzzy hybrid genetics-based machine learning algorithm. We show the improvement achieved by this model through an extensive empirical study with a large collection of data-sets.This work has been supported by the Spanish Ministry of Science and Technology under projects TIN2008-06681-C06-01 and TIN2007-65981

    IIVFDT: Ignorance Functions based Interval-Valued Fuzzy Decision Tree with Genetic Tuning

    Get PDF
    The choice of membership functions plays an essential role in the success of fuzzy systems. This is a complex problem due to the possible lack of knowledge when assigning punctual values as membership degrees. To face this handicap, we propose a methodology called Ignorance functions based Interval-Valued Fuzzy Decision Tree with genetic tuning, IIVFDT for short, which allows to improve the performance of fuzzy decision trees by taking into account the ignorance degree. This ignorance degree is the result of a weak ignorance function applied to the punctual value set as membership degree. Our IIVFDT proposal is composed of four steps: (1) the base fuzzy decision tree is generated using the fuzzy ID3 algorithm; (2) the linguistic labels are modeled with Interval-Valued Fuzzy Sets. To do so, a new parametrized construction method of Interval-Valued Fuzzy Sets is defined, whose length represents such ignorance degree; (3) the fuzzy reasoning method is extended to work with this representation of the linguistic terms; (4) an evolutionary tuning step is applied for computing the optimal ignorance degree for each Interval-Valued Fuzzy Set. The experimental study shows that the IIVFDT method allows the results provided by the initial fuzzy ID3 with and without Interval-Valued Fuzzy Sets to be outperformed. The suitability of the proposed methodology is shown with respect to both several state-of-the-art fuzzy decision trees and C4.5. Furthermore, we analyze the quality of our approach versus two methods that learn the fuzzy decision tree using genetic algorithms. Finally, we show that a superior performance can be achieved by means of the positive synergy obtained when applying the well known genetic tuning of the lateral position after the application of the IIVFDT method.Spanish Government TIN2011-28488 TIN2010-1505

    A Compact Evolutionary Interval-Valued Fuzzy Rule-Based Classification System for the Modeling and Prediction of Real-World Financial Applications With Imbalanced Data

    Get PDF
    The current financial crisis has stressed the need to obtain more accurate prediction models in order to decrease risk when investing money on economic opportunities. In addition, the transparency of the process followed to make the decisions in financial applications is becoming an important issue. Furthermore, there is a need to handle real-world imbalanced financial datasets without using sampling techniques that might introduce noise in the used data. In this paper, we present a compact evolutionary interval-valued fuzzy rule-based classification system, which is based on interval-valued fuzzy rule-based classification system with tuning and rule selection (IVTURS FA RC-HD ) for the modeling and prediction of real-world financial applications. This proposed system allows obtaining good prediction accuracies using a small set of short fuzzy rules implying a high degree of interpretability of the generated linguistic model. Furthermore, the proposed system deals with the financial imbalanced datasets with no need for any preprocessing or sampling method and, thus, avoiding the accidental introduction of noise in the data used in the learning process. The system is also provided with a mechanism to handle examples that are not covered by any fuzzy rule in the generated rule base. To test the quality of our proposal, we will present an experimental study including 11 real-world financial datasets. We will show that the proposed system outperforms the original C4.5 decision tree, type-1, and interval-valued fuzzy counterparts that use the synthetic minority oversampling technique (SMOTE) to preprocess data and the original FURIA, which is a fuzzy approximative classifier. Furthermore, the proposed method enhances the results achieved by the cost-sensitive C4.5, and it gives competitive results when compared with FURIA using SMOTE, while our proposal avoids preprocessing techniques, and it provides interpretable models that allow obtaining more accurate results

    A Multi-Tiered Genetic Algorithm for Data Mining and Hypothesis Refinement

    Get PDF
    While there are many approaches to data mining, it seems that there is a hole in the ability to make use of the advantages of multiple techniques. There are many methods that use rigid heuristics and guidelines in constructing rules for data, and are thus limited in their ability to describe patterns. Genetic algorithms provide a more flexible approach, and yet the genetic algorithms that have been employed don't capitalize on the fact that data models have two levels: individual rules and the overall data model. This dissertation introduces a multi-tiered genetic algorithm capable of evolving individual rules and the data model at the same time. The multi-tiered genetic algorithm also provides a means for taking advantage of the strengths of the more rigid methods by using their output as input to the genetic algorithm. Most genetic algorithms use a single "roulette wheel" approach. As such, they are only able to select either good data models or good rules, but are incapable of selecting for both simultaneously. With the additional roulette wheel of the multi-tiered genetic algorithm, the fitness of both rules and data models can be evaluated, enabling the algorithm to select good rules from good data models. This also more closely emulates how genes are passed from parents to children in actual biology. Consequently, this technique strengthens the "genetics" of genetic algorithms. For ease of discussion, the multi-tiered genetic algorithm has been named "Arcanum." This technique was tested on thirteen data sets obtained from The University of California Irvine Knowledge Discovery in Databases Archive. Results for these same data sets were gathered for GAssist, another genetic algorithm designed for data mining, and J4.8, the WEKA implementation of C4.5. While both of the other techniques outperformed Arcanum overall, it was able to provide comparable or better results for 5 of the 13 data sets, indicating that the algorithm can be used for data mining, although it needs improvement. The second stage of testing was on the ability to take results from a previous algorithm and perform refinement on the data model. Initially, Arcanum was used to refine its own data models. Of the six data models used for hypothesis refinement, Arcanum was able to improve upon 3 of them. Next, results from the LEM2 algorithm were used as input to Arcanum. Of the three data models used from LEM2, Arcanum was able to improve upon all three data models by sacrificing accuracy in order to improve coverage, resulting in a better data model overall. The last phase of hypothesis refinement was performed upon C4.5. It required several attempts, each using different parameters, but Arcanum was finally able to make a slight improvement to the C4.5 data model. From the experimental results, Arcanum was shown to yield results comparable to GAssist and C4.5 on some of the data sets. It was also able to take data models from three different techniques and improve upon them. While there is certainly room for improvement of the multi-tiered genetic algorithm described in this dissertation, the experimental evidence supports the claims that it can perform both data mining and hypothesis refinement of data models from other data mining techniques

    An experimental study of hyper-heuristic selection and acceptance mechanism for combinatorial t-way test suite generation

    Get PDF
    Recently, many meta-heuristic algorithms have been proposed to serve as the basis of a t -way test generation strategy (where t indicates the interaction strength) including Genetic Algorithms (GA), Ant Colony Optimization (ACO), Simulated Annealing (SA), Cuckoo Search (CS), Particle Swarm Optimization (PSO), and Harmony Search (HS). Although useful, metaheuristic algorithms that make up these strategies often require specific domain knowledge in order to allow effective tuning before good quality solutions can be obtained. Hyperheuristics provide an alternative methodology to meta-heuristics which permit adaptive selection and/or generation of meta-heuristics automatically during the search process. This paper describes our experience with four hyper-heuristic selection and acceptance mechanisms namely Exponential Monte Carlo with counter (EMCQ), Choice Function (CF), Improvement Selection Rules (ISR), and newly developed Fuzzy Inference Selection (FIS),using the t -way test generation problem as a case study. Based on the experimental results, we offer insights on why each strategy differs in terms of its performance
    corecore