1,544 research outputs found

    A Compact Evolutionary Interval-Valued Fuzzy Rule-Based Classification System for the Modeling and Prediction of Real-World Financial Applications With Imbalanced Data

    The current financial crisis has stressed the need to obtain more accurate prediction models in order to decrease risk when investing money in economic opportunities. In addition, the transparency of the process followed to make decisions in financial applications is becoming an important issue. Furthermore, there is a need to handle real-world imbalanced financial datasets without using sampling techniques that might introduce noise into the data. In this paper, we present a compact evolutionary interval-valued fuzzy rule-based classification system, which is based on the interval-valued fuzzy rule-based classification system with tuning and rule selection (IVTURS-FARC-HD), for the modeling and prediction of real-world financial applications. The proposed system obtains good prediction accuracies using a small set of short fuzzy rules, implying a high degree of interpretability of the generated linguistic model. Furthermore, the proposed system deals with imbalanced financial datasets with no need for any preprocessing or sampling method, thus avoiding the accidental introduction of noise into the data used in the learning process. The system is also provided with a mechanism to handle examples that are not covered by any fuzzy rule in the generated rule base. To test the quality of our proposal, we present an experimental study including 11 real-world financial datasets. We show that the proposed system outperforms the original C4.5 decision tree, type-1, and interval-valued fuzzy counterparts that use the synthetic minority oversampling technique (SMOTE) to preprocess data, as well as the original FURIA, which is a fuzzy approximative classifier. Furthermore, the proposed method enhances the results achieved by the cost-sensitive C4.5, and it gives competitive results when compared with FURIA using SMOTE, while avoiding preprocessing techniques and providing interpretable models that allow obtaining more accurate results.

    A Multi-Agent Architecture for the Design of Hierarchical Interval Type-2 Beta Fuzzy System

    This paper presents a new methodology for building and evolving hierarchical fuzzy systems. For the system design, a tree-based encoding method is adopted to hierarchically link low-dimensional fuzzy systems. Such a tree-structured representation is flexible by nature, offering more adjustable and modifiable structures. The proposed hierarchical structure employs a type-2 beta fuzzy system to cope with uncertainty, and the resulting system is called the Hierarchical Interval Type-2 Beta Fuzzy System (HT2BFS). For the system optimization, two main tasks are applied: structure learning and parameter tuning. The structure learning phase aims to evolve and learn the structures of a population of HT2BFSs in a multi-objective context, taking into account the optimization of both accuracy and interpretability metrics. The parameter tuning phase is applied to refine and adjust the parameters of the system. To accomplish these two tasks efficiently, we further employ a multi-agent architecture to provide both distributed and cooperative management of the optimization tasks. Agents are divided into two types based on their functions: a structure agent and parameter agents. The main function of the structure agent is to perform a multi-objective evolutionary structure learning step by means of the Multi-Objective Immune Programming algorithm (MOIP). The parameter agents manage different hierarchical structures simultaneously and refine their parameters by means of the Hybrid Harmony Search algorithm (HHS). In this architecture, agents use cooperation and communication to create high-performance HT2BFSs. The performance of the proposed system is evaluated through several comparisons with various state-of-the-art approaches on noise-free and noisy time series prediction data sets and regression problems. The results demonstrate an improvement in accuracy, convergence speed, and number of rules used compared with other existing approaches.

    A Compact Evolutionary Interval-Valued Fuzzy Rule-Based Classification System for the Modeling and Prediction of Real-World Financial Applications with Imbalanced Data

    The current financial crisis has stressed the need to obtain more accurate prediction models in order to decrease the risk when investing money in economic opportunities. In addition, the transparency of the process followed to make decisions in financial applications is becoming an important issue. Furthermore, there is a need to handle real-world imbalanced financial data sets without using sampling techniques, which might introduce noise into the data. In this paper, we present a compact evolutionary interval-valued fuzzy rule-based classification system, which is based on IVTURS-FARC-HD (Interval-Valued fuzzy rule-based classification system with TUning and Rule Selection) [22], for the modeling and prediction of real-world financial applications. The proposed system obtains good prediction accuracies using a small set of short fuzzy rules, implying a high degree of interpretability of the generated linguistic model. Furthermore, the proposed system deals with imbalanced financial datasets with no need for any preprocessing or sampling method, thus avoiding the accidental introduction of noise into the data used in the learning process. The system is also provided with a mechanism to handle examples that are not covered by any fuzzy rule in the generated rule base. To test the quality of our proposal, we present an experimental study including eleven real-world financial datasets. We show that the proposed system outperforms the original C4.5 decision tree, type-1 and interval-valued fuzzy counterparts which use the SMOTE sampling technique to preprocess data, as well as the original FURIA, which is a fuzzy approximative classifier. Furthermore, the proposed method enhances the results achieved by the cost-sensitive C4.5, and it gives competitive results when compared with FURIA using SMOTE, while avoiding preprocessing techniques and providing interpretable models that allow obtaining more accurate results. Spanish Government TIN2011-28488 TIN2013-40765-

    Automatic synthesis of fuzzy systems: An evolutionary overview with a genetic programming perspective

    Studies in Evolutionary Fuzzy Systems (EFSs) began in the 1990s and have developed rapidly since then, with applications in areas such as pattern recognition, curve-fitting and regression, forecasting, and control. An EFS results from the combination of a Fuzzy Inference System (FIS) with an Evolutionary Algorithm (EA). This relationship can be established for multiple purposes: fine-tuning of a FIS's parameters, selection of fuzzy rules, learning a rule base or membership functions from scratch, and so forth. Each facet of this relationship creates a strand in the literature, such as membership function fine-tuning or fuzzy rule-based learning, and the purpose here is to outline some of what has been done in each. Special focus is given to Genetic Programming-based EFSs by providing a taxonomy of the main architectures available, as well as by pointing out the gaps that remain in the literature. The concluding remarks address some further topics of current research and trends, such as interpretability analysis, multiobjective optimization, and synthesis of a FIS through Evolving methods.

    Temporal Information in Data Science: An Integrated Framework and its Applications

    Data science is a well-known buzzword that is in fact composed of two distinct keywords, i.e., data and science. Data itself is of great importance: each analysis task begins from a set of examples. Based on this consideration, the present work starts with the analysis of a real case scenario, by considering the development of a data warehouse-based decision support system for an Italian contact center company. Then, relying on the information collected in the developed system, a set of machine learning-based analysis tasks has been developed to answer specific business questions, such as employee work anomaly detection and automatic call classification. Although these initial applications rely on already available algorithms, as we shall see, some clever analysis workflows also had to be developed. Afterwards, continuously driven by real data and real-world applications, we turned to the question of how to handle temporal information within classical decision tree models. Our research led to the development of J48SS, a decision tree induction algorithm based on Quinlan's C4.5 learner, which is capable of dealing with temporal (e.g., sequential and time series) as well as atemporal (such as numerical and categorical) data during the same execution cycle. The decision tree has been applied to some real-world analysis tasks, proving its worth. A key characteristic of J48SS is its interpretability, an aspect that we specifically addressed through the study of an evolutionary-based decision tree pruning technique. Next, since a lot of work concerning the management of temporal information has already been done in the automated reasoning and formal verification fields, a natural direction in which to proceed was that of investigating how such solutions may be combined with machine learning, following two main tracks. 
First, we show, through the development of an enriched decision tree capable of encoding temporal information by means of interval temporal logic formulas, how a machine learning algorithm can successfully exploit temporal logic to perform data analysis. Then, we focus on the opposite direction, i.e., that of employing machine learning techniques to generate temporal logic formulas, considering a natural language processing scenario. Finally, as a conclusive development, the architecture of a system is proposed, in which formal methods and machine learning techniques are seamlessly combined to perform anomaly detection and predictive maintenance tasks. Such an integration represents an original, thrilling research direction that may open up new ways of dealing with complex, real-world problems.

    Advances in Evolutionary Algorithms

    With the recent trends towards massive data sets and significant computational power, combined with advances in evolutionary algorithms, evolutionary computation is becoming much more relevant to practice. The aim of the book is to present recent improvements, innovative ideas, and concepts in a part of the vast EA field.

    Intelligent network intrusion detection using an evolutionary computation approach

    With the enormous growth of users' reliance on the Internet, the need for secure and reliable computer networks also increases. The availability of effective automatic tools for carrying out different types of network attacks raises the need for effective intrusion detection systems. Generally, a comprehensive defence mechanism consists of three phases, namely, preparation, detection and reaction. In the preparation phase, network administrators aim to find and fix security vulnerabilities (e.g., insecure protocols and vulnerable computer systems or firewalls) that can be exploited to launch attacks. Although the preparation phase increases the level of security in a network, it will never completely remove the threat of network attacks. A good security mechanism requires an Intrusion Detection System (IDS) in order to monitor security breaches when the prevention schemes in the preparation phase are bypassed. To be able to react to network attacks as fast as possible, an automatic detection system is of paramount importance. The later an attack is detected, the less time network administrators have to update their signatures and reconfigure their detection and remediation systems. An IDS is a tool for monitoring the system with the aim of detecting intrusive activities in networks and raising alerts about them. These tools are classified into two major categories: signature-based and anomaly-based. A signature-based IDS stores the signatures of known attacks in a database and discovers occurrences of attacks by monitoring and comparing each communication in the network against the database of signatures. On the other hand, mechanisms that deploy anomaly detection have a model of the normal behaviour of the system, and any significant deviation from this model is reported as an anomaly. This thesis aims at addressing the major issues in the process of developing signature-based IDSs. 
These are: i) their dependency on experts to create signatures, ii) the complexity of their models, iii) the inflexibility of their models, and iv) their inability to adapt to changes in the real environment and detect new attacks. To meet the requirements of a good IDS, computational intelligence methods have attracted considerable interest from the research community. This thesis explores a solution to automatically generate compact rulesets for network intrusion detection utilising evolutionary computation techniques. The proposed framework is called ESR-NID (Evolving Statistical Rulesets for Network Intrusion Detection). Using an interval-based structure, this method can be deployed for any continuous-valued input data. Therefore, by choosing appropriate statistical measures (i.e. continuous-valued features) of network traffic as the input to ESR-NID, it can effectively detect varied types of attacks since it is not dependent on the signatures of network packets. In ESR-NID, several innovations in the genetic algorithm were developed to keep the ruleset small. A two-stage evaluation component in the evolutionary process takes the cooperation of rules into consideration and results in very compact, easily understood rulesets. The effectiveness of this approach is evaluated against several sources of data for detection of both normal and abnormal behaviour. The results are found to be comparable to those achieved using other machine learning methods, both GA-based and non-GA-based. One of the significant advantages of ESR-NID is that it can be tailored to specific problem domains and the characteristics of the dataset by the use of different fitness and performance functions. This makes the system a more flexible model compared to other learning techniques. Additionally, an IDS must adapt itself to the changing environment with the least amount of configuration. ESR-NID uses an incremental learning approach as new traffic flows become available. The incremental learning approach requires less storage because it only keeps the generated rules in its database. This is in contrast to the ever-growing repository of raw training data required for traditional learning.
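
To make the idea of an interval-based ruleset over continuous-valued features more concrete, here is a minimal Python sketch. It is not the ESR-NID implementation: the class `IntervalRule`, the function `classify`, the feature names, and the thresholds are all hypothetical, and the evolutionary search that would produce such rules is not shown. It only illustrates how a rule made of per-feature intervals could be matched against statistical measures of traffic.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IntervalRule:
    """Hypothetical rule: every listed feature must lie inside [low, high]."""
    intervals: Dict[str, Tuple[float, float]]
    label: str

    def matches(self, sample: Dict[str, float]) -> bool:
        # A sample matches only if it has each feature and the value is in range.
        return all(
            name in sample and low <= sample[name] <= high
            for name, (low, high) in self.intervals.items()
        )


def classify(rules: List[IntervalRule], sample: Dict[str, float],
             default: str = "normal") -> str:
    """Return the label of the first matching rule, else the default label."""
    for rule in rules:
        if rule.matches(sample):
            return rule.label
    return default


# Illustrative ruleset; feature names and bounds are assumptions, not thesis data.
RULES = [
    IntervalRule(
        intervals={
            "packets_per_second": (1000.0, float("inf")),
            "mean_packet_size": (0.0, 100.0),
        },
        label="attack",
    ),
]

if __name__ == "__main__":
    flow = {"packets_per_second": 5000.0, "mean_packet_size": 60.0}
    print(classify(RULES, flow))
```

Because only the rules themselves are stored, a ruleset like this stays small regardless of how much raw traffic was used to derive it, which is the storage property the abstract attributes to incremental learning.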