694 research outputs found

    A single-objective and a multi-objective genetic algorithm to generate accurate and interpretable fuzzy rule based classifiers for the analysis of complex financial data

    Get PDF
    Nowadays, organizations deal with rapidly increasing amounts of data stored in their databases. It has therefore become crucially important for them to identify patterns in these large databases and turn raw data into valuable, actionable information. By exploring these datasets, organizations gain a competitive advantage, based on the assumption that the main added value of Knowledge Management Systems is to facilitate the decision-making process. Especially given the importance of knowledge in the 21st century, data mining can be seen as a very effective tool for exploring the data that fosters competitive gain in a changing environment. The overall aim of this study is to design the rule base component of a fuzzy rule-based system (FRBS) through the use of genetic algorithms. The main objective is to generate accurate and interpretable models of the data, trying to overcome the existing tradeoff between accuracy and interpretability. We propose two different approaches: an accuracy-driven single-objective genetic algorithm, and a three-objective genetic algorithm that produces a Pareto front approximation composed of classifiers with different tradeoffs between accuracy and complexity. The proposed approaches have been compared with two other systems, namely a rule selection single-objective algorithm and a three-objective algorithm. The latter, developed by the University of Pisa, is able to generate the rule base while simultaneously learning the definition points of the membership functions, taking into account both the accuracy and the interpretability of the final model.
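The accuracy–complexity tradeoff described above can be made concrete with a non-dominated filter. The sketch below (plain Python, with made-up numbers, not results from the thesis) extracts the Pareto front from a population of classifiers, each summarized as an (error rate, rule count) pair:

```python
def pareto_front(candidates):
    """Return the non-dominated (error, n_rules) pairs, minimizing both.

    Each candidate summarizes a fuzzy rule-based classifier by its
    misclassification rate and its rule-base size (a simple complexity
    measure). Values here are hypothetical.
    """
    front = []
    for c in candidates:
        dominated = any(
            o != c and o[0] <= c[0] and o[1] <= c[1]
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return sorted(set(front))

# A toy population: accurate-but-large classifiers down to small-but-crude ones.
population = [(0.10, 30), (0.12, 12), (0.12, 18), (0.25, 5), (0.30, 4), (0.09, 45)]
print(pareto_front(population))  # (0.12, 18) is dominated by (0.12, 12)
```

A multi-objective GA maintains such a front across generations, letting the user pick the preferred accuracy/complexity compromise after the run.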

    Development of a Self-Learning Approach Applied to Pattern Recognition and Fuzzy Control

    Get PDF
    Systems based on fuzzy rules are widely used in the development of pattern recognition and control systems. Most current methods for designing fuzzy rule-based systems suffer from the following problems. 1. The fuzzification procedure takes into account neither the statistical properties nor the real distribution of the data/signals under consideration; the generated fuzzy membership functions are therefore not truly able to express these data. Moreover, the fuzzification process is defined manually. 2. The initial size of the rule base is fixed wholesale. This means the procedure may produce redundancy in the rules used, and this redundancy in turn leads to problems of complexity and dimensionality. Avoiding these problems by selecting only the relevant rules can itself incur a high computational cost. 3. The form of the fuzzy rule suffers from a loss of information, which in turn can assign the variables under consideration to other, unrealistic ranges. 4. Furthermore, the adaptation of the fuzzy membership functions is associated with problems of complexity and computational cost, owing to the iteration and the many parameters involved. This adaptation is also carried out only within the scope of each individual rule; that is, no adaptation is performed at the level of the whole fuzzy rule base.

    Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications

    Get PDF
    Due to the complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are highly needed to help humans understand the inherent mechanisms of diseases. For biomedical classification problems, it is typically impossible to build a perfect classifier with 100% prediction accuracy. Hence, a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive with state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support for disease diagnosis due to their easy interpretability. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression datasets. With fuzzy granulation, information loss in the process of gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modelled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification; hence, we expect the selected genes to be more helpful for further biological studies.
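The fuzzy-granular idea can be illustrated with a crude separability score: one triangular granule per class, with genes ranked by how little the class granules overlap. This is an illustrative sketch, not FARM-DS or the dissertation's exact gene-selection method; all names and values below are made up.

```python
def tri(x, a, b, c):
    """Triangular membership function with support (a, c) and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def gene_score(values, labels):
    """Score one gene's class separability under a crude fuzzy granulation.

    One triangular granule per class is built from that class's
    (min, mean, max) expression values; the score is 1 minus the mean
    inter-granule overlap over all samples (higher = more discriminative).
    Assumes each class's values are not all identical.
    """
    granules = {}
    for cls in set(labels):
        vs = [v for v, l in zip(values, labels) if l == cls]
        granules[cls] = (min(vs), sum(vs) / len(vs), max(vs))
    overlap = [min(tri(v, *g) for g in granules.values()) for v in values]
    return 1.0 - sum(overlap) / len(overlap)
```

Ranking all genes by this score and keeping the top k would give a simple filter-style selector; a gene whose two class granules barely intersect scores near 1.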

    Hybridizing and applying computational intelligence techniques

    Get PDF
    As computers are increasingly relied upon to perform tasks of growing complexity affecting many aspects of society, it is imperative that the underlying computational methods performing those tasks be effective and scalable. A common way to perform such complex tasks is to use computational intelligence (CI) techniques. CI techniques use approaches influenced by nature to solve problems for which traditional modeling approaches fail due to impracticality, intractability, or mathematical ill-posedness. While CI techniques can perform considerably better than traditional modeling approaches when solving complex problems, the scalability of a given CI technique alone is not always optimal. Hybridization is a popular process by which a better-performing CI technique is created from the combination of multiple existing techniques in a logical manner. In the first paper in this thesis, a novel hybridization of two CI techniques, accuracy-based learning classifier systems (XCS) and cluster analysis, is presented that improves upon the efficiency and, in some cases, the effectiveness of XCS. A number of tasks in software engineering are performed manually, such as defining the expected output in model transformation testing. Especially since the number and size of projects that rely on such manual tasks continue to grow, it is critical that automated approaches be employed to reduce or eliminate manual effort so these tasks scale efficiently. The second paper in this thesis details a novel application of a CI technique, multi-objective simulated annealing, to the task of test case model generation, reducing the effort required to manually update expected transformation output --Abstract, page iv
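Multi-objective simulated annealing, mentioned above, can be sketched minimally as follows. This is an illustrative toy, not the paper's exact algorithm: acceptance uses a randomly weighted sum of the two objectives, and an archive keeps the non-dominated solutions seen so far.

```python
import math
import random

def mosa(init, neighbor, objectives, steps=2000, t0=1.0, cooling=0.995):
    """Minimal multi-objective simulated annealing sketch (two objectives,
    both minimized). Returns an archive of (objective-pair, solution)."""
    current = init
    archive = [(objectives(init), init)]
    t = t0
    for _ in range(steps):
        cand = neighbor(current)
        w = random.random()  # fresh scalarization weight each step
        def energy(x):
            f0, f1 = objectives(x)
            return w * f0 + (1 - w) * f1
        delta = energy(cand) - energy(current)
        if delta < 0 or random.random() < math.exp(-delta / max(t, 1e-12)):
            current = cand
            fc = objectives(cand)
            # Insert into the archive only if no archived point dominates it,
            # then evict any archived points it dominates.
            if not any(f[0] <= fc[0] and f[1] <= fc[1] for f, _ in archive):
                archive = [(f, x) for f, x in archive
                           if not (fc[0] <= f[0] and fc[1] <= f[1])]
                archive.append((fc, cand))
        t *= cooling
    return archive

# Toy bi-objective problem: minimize (x^2, (x-2)^2); the Pareto set is [0, 2].
random.seed(0)
front = mosa(5.0, lambda x: x + random.gauss(0, 0.5),
             lambda x: (x * x, (x - 2) ** 2))
```

By construction the returned archive is mutually non-dominated, approximating the Pareto front of the two objectives.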

    Computational Intelligence for classification and forecasting of solar photovoltaic energy production and energy consumption in buildings

    Get PDF
    This thesis presents several novel applications of Computational Intelligence techniques to energy-related problems. In more detail, we refer to the assessment of the energy produced by a solar photovoltaic installation and to the evaluation of buildings' energy consumption. Recently, thanks also to the steady evolution of technology, the energy sector has drawn the attention of the research community towards proposing useful tools to deal with issues of energy efficiency in buildings and with solar energy production management. We therefore address two kinds of problem. The first concerns the efficient management of solar photovoltaic energy installations, e.g., for efficiently monitoring performance, finding faults, or planning the energy distribution in the electrical grid. This problem was faced with two different approaches: a forecasting approach and a fuzzy classification approach for energy production estimation, starting from some knowledge about environmental variables. The forecasting system developed is able to reproduce the instantaneous curve of daily energy produced by the solar panels of the installation, with a forecasting horizon of one day; it combines neural networks and time series analysis models. The fuzzy classification system, instead, extracts linguistic knowledge about the amount of energy produced by the installation, exploiting an optimal fuzzy rule base and genetic algorithms. The developed model is the result of a novel hierarchical methodology for building fuzzy systems, which may be applied in several areas. The second problem concerns energy efficiency in buildings, for cost reduction and load scheduling purposes, and was tackled by proposing a forecasting system for energy consumption in office buildings. 
The proposed system exploits a neural network to estimate the energy consumption due to lighting over a time interval of a few hours, starting from considerations of available natural daylight.
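Day-ahead forecasters like the hybrids described above are customarily judged against a naive seasonal-persistence baseline. The sketch below is such a baseline, not a system from the thesis:

```python
def seasonal_persistence(history, period=24, horizon=24):
    """Day-ahead naive baseline: forecast each future hour with the value
    observed at the same hour one period (default: 24 hours) earlier.

    `history` is a list of hourly readings (e.g., kWh); a real forecaster
    should beat this baseline to justify its complexity.
    """
    if len(history) < period:
        raise ValueError("need at least one full period of history")
    return [history[len(history) - period + h % period] for h in range(horizon)]
```

For instance, with two days of hourly readings, the baseline simply replays the most recent day as tomorrow's forecast.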

    Low-level interpretability and high-level interpretability: a unified view of data-driven interpretable fuzzy system modelling

    Get PDF
    This paper aims to provide an in-depth overview of designing interpretable fuzzy inference models from data within a unified framework. The objective of complex system modelling is to develop reliable and understandable models that allow human beings to gain insight into complex real-world systems whose first-principle models are unknown. Because system behaviour can be described naturally as a series of linguistic rules, data-driven fuzzy modelling has become an attractive and widely used paradigm for this purpose. However, fuzzy models constructed from data by adaptive learning algorithms usually suffer from a loss of model interpretability. Model accuracy and interpretability are two conflicting objectives, so interpretation preservation during adaptation in data-driven fuzzy system modelling is a challenging task, which has received much attention in the fuzzy system modelling community. In order to clearly discriminate the different roles of fuzzy sets, input variables, and other components in achieving an interpretable fuzzy model, this paper first proposes a taxonomy of fuzzy model interpretability in terms of low-level interpretability and high-level interpretability. Low-level interpretability is achieved by optimizing the membership functions with respect to semantic criteria at the fuzzy set level, while high-level interpretability is obtained by addressing the coverage, completeness, and consistency of the rules with respect to criteria at the fuzzy rule level. Criteria for low-level and high-level interpretability are identified, respectively. Different data-driven fuzzy modelling techniques in the literature that focus on interpretability issues are reviewed and discussed from the perspective of low-level and high-level interpretability. 
Furthermore, some open problems about interpretable fuzzy models are identified and some potential new research directions on fuzzy model interpretability are suggested. Crown Copyright © 2008
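One of the semantic criteria on the fuzzy-set level can be checked mechanically. The sketch below tests coverage/completeness — every point of the input domain should belong to at least one fuzzy set with membership at least some threshold. It is an illustrative sketch of a single criterion; the paper discusses several others (distinguishability, normality, and so on).

```python
def tri(x, a, b, c):
    """Triangular membership function with support (a, c) and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def covered(partition, lo, hi, eps=0.1, n=200):
    """Coverage check for a fuzzy partition of [lo, hi]: sample the domain
    and verify every point has membership >= eps in at least one set.

    `partition` is a list of (a, b, c) triangular-set parameters.
    """
    for i in range(n + 1):
        x = lo + (hi - lo) * i / n
        if max(tri(x, *p) for p in partition) < eps:
            return False
    return True

# A uniform three-set partition of [0, 10] covers the domain; a partition
# with a gap around the middle does not.
good = [(-5, 0, 5), (0, 5, 10), (5, 10, 15)]
gappy = [(-5, 0, 2), (8, 10, 15)]
```

Such a check would be run after learning, flagging partitions whose adapted membership functions have drifted away from full domain coverage.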

    A Survey of Neural Trees

    Full text link
    Neural networks (NNs) and decision trees (DTs) are both popular machine learning models, yet they come with mutually exclusive advantages and limitations. To bring the best of the two worlds together, a variety of approaches have been proposed to integrate NNs and DTs, explicitly or implicitly. In this survey, these approaches are organized into a school of methods that we term neural trees (NTs). This survey aims to present a comprehensive review of NTs and attempts to identify how they enhance model interpretability. We first propose a thorough taxonomy of NTs that expresses the gradual integration and co-evolution of NNs and DTs. Afterward, we analyze NTs in terms of their interpretability and performance, and suggest possible solutions to the remaining challenges. Finally, this survey concludes with a discussion of other considerations, such as conditional computation, and of promising directions for the field. A list of papers reviewed in this survey, along with their corresponding code, is available at: https://github.com/zju-vipa/awesome-neural-trees (Comment: 35 pages, 7 figures and 1 table)
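A common NN/DT integration pattern the survey covers is the soft decision tree, where each inner node holds a sigmoid gate that routes an input to both children with complementary weights, making the whole tree differentiable. The minimal sketch below (single scalar feature, hand-picked weights) is illustrative only:

```python
import math

def soft_tree(x, node):
    """Predict a class distribution with a one-feature soft decision tree.

    Inner nodes carry a sigmoid gate (w, b); leaves carry a fixed
    distribution. Every input reaches every leaf with some weight, so the
    model is differentiable end to end.
    """
    if "leaf" in node:
        return node["leaf"]
    gate = 1.0 / (1.0 + math.exp(-(node["w"] * x + node["b"])))
    left = soft_tree(x, node["left"])
    right = soft_tree(x, node["right"])
    return [(1 - gate) * l + gate * r for l, r in zip(left, right)]

# Made-up weights: the gate routes large x to the right leaf, small x left.
tree = {"w": 10.0, "b": 0.0,
        "left": {"leaf": [1.0, 0.0]},
        "right": {"leaf": [0.0, 1.0]}}
```

The interpretability appeal is that each root-to-leaf path still reads as a (soft) sequence of threshold tests, unlike an opaque fully connected network.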

    Temporal Information in Data Science: An Integrated Framework and its Applications

    Get PDF
    Data science is a well-known buzzword that is in fact composed of two distinct keywords, i.e., data and science. Data itself is of great importance: every analysis task begins from a set of examples. Based on this consideration, the present work starts with the analysis of a real-world scenario, namely the development of a data-warehouse-based decision support system for an Italian contact center company. Then, relying on the information collected in the developed system, a set of machine-learning-based analysis tasks were developed to answer specific business questions, such as employee work anomaly detection and automatic call classification. Although these initial applications rely on already available algorithms, as we shall see, some clever analysis workflows also had to be developed. Afterwards, continuously driven by real data and real-world applications, we turned to the question of how to handle temporal information within classical decision tree models. Our research led to the development of J48SS, a decision tree induction algorithm based on Quinlan's C4.5 learner, which is capable of dealing with temporal (e.g., sequential and time series) as well as atemporal (e.g., numerical and categorical) data during the same execution cycle. The decision tree has been applied to several real-world analysis tasks, proving its worth. A key characteristic of J48SS is its interpretability, an aspect that we specifically addressed through the study of an evolutionary decision-tree pruning technique. Next, since much work concerning the management of temporal information has already been done in the automated reasoning and formal verification fields, a natural direction in which to proceed was to investigate how such solutions may be combined with machine learning, following two main tracks. 
First, we show, through the development of an enriched decision tree capable of encoding temporal information by means of interval temporal logic formulas, how a machine learning algorithm can successfully exploit temporal logic to perform data analysis. Then, we focus on the opposite direction, i.e., that of employing machine learning techniques to generate temporal logic formulas, considering a natural language processing scenario. Finally, as a conclusive development, we propose the architecture of a system in which formal methods and machine learning techniques are seamlessly combined to perform anomaly detection and predictive maintenance tasks. Such an integration represents an original, exciting research direction that may open up new ways of dealing with complex, real-world problems.
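Interval temporal logics are built on Allen's thirteen relations between time intervals; a decision-node test in a temporal-logic-enriched tree can then ask questions like "does some event interval overlap the current window?". The classifier below is a generic sketch of the relations themselves, not code from the thesis:

```python
def allen(i, j):
    """Return the Allen relation between proper intervals i=(a,b), j=(c,d),
    assuming a < b and c < d. Exactly one of the 13 basic relations holds;
    inverse relations are reported from i's point of view."""
    (a, b), (c, d) = i, j
    if b < c: return "before"
    if d < a: return "after"
    if b == c: return "meets"
    if d == a: return "met-by"
    if a == c and b == d: return "equals"
    if a == c: return "starts" if b < d else "started-by"
    if b == d: return "finishes" if a > c else "finished-by"
    if c < a and b < d: return "during"
    if a < c and d < b: return "contains"
    return "overlaps" if a < c else "overlapped-by"
```

Because the cases are checked from the most restrictive equalities down to the open-overlap fallback, every pair of proper intervals lands in exactly one branch.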

    Meta Heuristics based Machine Learning and Neural Mass Modelling Allied to Brain Machine Interface

    Get PDF
    New understanding of brain function and the increasing availability of low-cost, non-invasive electroencephalogram (EEG) recording devices have made the brain-computer interface (BCI) an alternative means of augmenting human capabilities: it provides a new non-muscular channel for sending commands, which can be used to activate electronic or mechanical devices by modulation of thought. In this project, our emphasis is on how to develop such a BCI using fuzzy rule-based systems (FRBSs), metaheuristics, and neural mass models (NMMs). In particular, we treat the BCI system as an integrated problem consisting of mathematical modelling, machine learning, and classification. Four main steps are involved in designing a BCI system: 1) data acquisition, 2) feature extraction, 3) classification, and 4) transferring the classification outcome into control commands for extended peripheral capability. Our focus is placed on the first three steps. This research project aims to investigate and develop a novel BCI framework encompassing classification based on machine learning, optimisation, and neural mass modelling. The primary aim is to bridge the gap between these three different areas in a bid to design a more reliable and accurate communication path between the brain and the external world. To achieve this goal, the following objectives have been pursued: 1) Steady-State Visual Evoked Potential (SSVEP) EEG data are collected from human subjects and pre-processed; 2) a feature extraction procedure is implemented to detect and quantify the characteristics of brain activity that indicate the intention of the subject; 3) a classification mechanism called the Immune Inspired Multi-Objective Fuzzy Modelling Classification algorithm (IMOFM-C) is adapted as a binary classification approach for EEG data. 
    Then, the DDAG-Distance aggregation approach is proposed to aggregate the outcomes of IMOFM-C-based binary classifiers for multi-class classification; 4) building on IMOFM-C, a preference-based ensemble classification framework known as IMOFM-CP is proposed to enhance the convergence performance and diversity of each individual component classifier, leading to improved overall classification accuracy on multi-class EEG data; and 5) finally, a robust parameterisation approach, which combines a single-objective GA and a clustering algorithm with a set of newly devised objective and penalty functions, is proposed to obtain robust sets of synaptic connectivity parameters for a thalamic neural mass model. The parameterisation approach aims to cope with the nonlinearity normally involved in describing the multifarious features of brain signals.
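The SSVEP feature-extraction step (step 2 above) rests on a simple principle: the stimulation frequency the subject attends to dominates the EEG spectrum. The sketch below picks the candidate frequency with the largest DFT power; it is a minimal stand-in, not the project's pipeline, which would use proper PSD estimation or CCA over filtered epochs before handing features to IMOFM-C.

```python
import math

def ssvep_pick(signal, fs, candidates):
    """Return the candidate stimulation frequency (Hz) whose single-bin
    DFT power over the given signal (sampled at fs Hz) is largest."""
    def power(f):
        re = sum(s * math.cos(2 * math.pi * f * t / fs)
                 for t, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * f * t / fs)
                 for t, s in enumerate(signal))
        return re * re + im * im
    return max(candidates, key=power)

# One second of a synthetic 12 Hz "SSVEP" response at fs = 250 Hz.
fs = 250
sig = [math.sin(2 * math.pi * 12 * t / fs) for t in range(fs)]
```

With a one-second window and integer candidate frequencies, the sinusoidal probes are mutually orthogonal, so the attended frequency stands out cleanly even in this naive form.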