606 research outputs found

    Automatic synthesis of fuzzy systems: An evolutionary overview with a genetic programming perspective

    Studies in Evolutionary Fuzzy Systems (EFSs) began in the 1990s and have developed rapidly since then, with applications to areas such as pattern recognition, curve‐fitting and regression, forecasting, and control. An EFS results from the combination of a Fuzzy Inference System (FIS) with an Evolutionary Algorithm (EA). This relationship can be established for multiple purposes: fine‐tuning of a FIS's parameters, selection of fuzzy rules, learning a rule base or membership functions from scratch, and so forth. Each facet of this relationship creates a strand in the literature, such as membership function fine‐tuning or fuzzy rule‐based learning, and the purpose here is to outline some of what has been done in each aspect. Special focus is given to Genetic Programming‐based EFSs by providing a taxonomy of the main architectures available, as well as by pointing out the gaps that still prevail in the literature. The concluding remarks address some further topics of current research and trends, such as interpretability analysis, multiobjective optimization, and synthesis of a FIS through Evolving methods.

    Machine learning for network based intrusion detection: an investigation into discrepancies in findings with the KDD cup '99 data set and multi-objective evolution of neural network classifier ensembles from imbalanced data.

    For the last decade it has become commonplace to evaluate machine learning techniques for network based intrusion detection on the KDD Cup '99 data set. This data set has served well to demonstrate that machine learning can be useful in intrusion detection. However, it has undergone some criticism in the literature, and it is out of date. Therefore, some researchers question the validity of the findings reported based on this data set. Furthermore, as identified in this thesis, there are also discrepancies in the findings reported in the literature. In some cases the results are contradictory. Consequently, it is difficult to analyse the current body of research to determine the value in the findings. This thesis reports on an empirical investigation to determine the underlying causes of the discrepancies. Several methodological factors, such as choice of data subset, validation method and data preprocessing, are identified and are found to affect the results significantly. These findings have also enabled a better interpretation of the current body of research. Furthermore, the criticisms in the literature are addressed and future use of the data set is discussed, which is important since researchers continue to use it due to a lack of better publicly available alternatives. Due to the nature of the intrusion detection domain, there is an extreme imbalance among the classes in the KDD Cup '99 data set, which poses a significant challenge to machine learning. In other domains, researchers have demonstrated that well known techniques such as Artificial Neural Networks (ANNs) and Decision Trees (DTs) often fail to learn the minor class(es) due to class imbalance. However, this has not been recognized as an issue in intrusion detection previously. This thesis reports on an empirical investigation that demonstrates that it is the class imbalance that causes the poor detection of some classes of intrusion reported in the literature. 
An alternative approach to training ANNs is proposed in this thesis, using Genetic Algorithms (GAs) to evolve the weights of the ANNs, referred to as an Evolutionary Neural Network (ENN). When employing evaluation functions that calculate the fitness proportionally to the instances of each class, thereby avoiding a bias towards the major class(es) in the data set, significantly improved true positive rates are obtained whilst maintaining a low false positive rate. These findings demonstrate that the issues of learning from imbalanced data are not due to limitations of the ANNs, but rather to the training algorithm. Moreover, the ENN is capable of detecting a class of intrusion that has been reported in the literature to be undetectable by ANNs. One limitation of the ENN is a lack of control over the classification trade-off the ANNs obtain. This is identified as a general issue with current approaches to creating classifiers. Striving to create a single best classifier that obtains the highest accuracy may give an unfruitful classification trade-off, which is demonstrated clearly in this thesis. Therefore, an extension of the ENN is proposed, using a Multi-Objective GA (MOGA), which treats the classification rate on each class as a separate objective. This approach produces a Pareto front of non-dominated solutions that exhibit different classification trade-offs, from which the user can select one with the desired properties. The multi-objective approach is also utilised to evolve classifier ensembles, which yields an improved Pareto front of solutions. Furthermore, the selection of classifier members for the ensembles is investigated, demonstrating how this affects the performance of the resultant ensembles. This is a key to explaining why some classifier combinations fail to give fruitful solutions.
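Two ideas in this abstract can be illustrated with a small sketch: a fitness that weights each class equally regardless of its size, and the selection of non-dominated solutions when each class's classification rate is a separate objective. The code below is not taken from the thesis. The function names (`per_class_rates`, `balanced_fitness`, `pareto_front`) and the toy data are assumptions made for illustration, and the mean of per-class rates is only one plausible reading of "fitness proportional to the instances of each class".

```python
from typing import Dict, List, Sequence, Tuple


def per_class_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Fraction of the instances of each class that were classified correctly."""
    totals: Dict[int, int] = {}
    hits: Dict[int, int] = {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (1 if t == p else 0)
    return {c: hits[c] / totals[c] for c in totals}


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Plain accuracy: dominated by the majority class on imbalanced data."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_fitness(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed fitness: unweighted mean of the per-class rates."""
    rates = per_class_rates(y_true, y_pred)
    return sum(rates.values()) / len(rates)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if it is no worse on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (all objectives maximised)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy data (hypothetical): 8 instances of class 0 and 2 of class 1.
y_true = [0] * 8 + [1] * 2
always_major = [0] * 10  # a classifier that ignores the minority class

print(accuracy(y_true, always_major))          # 0.8
print(balanced_fitness(y_true, always_major))  # 0.5

# Each tuple is (rate on class 0, rate on class 1) for one candidate classifier.
candidates = [(0.9, 0.1), (0.7, 0.6), (0.5, 0.5), (0.2, 0.95)]
print(pareto_front(candidates))  # (0.5, 0.5) is dominated by (0.7, 0.6)
```

A classifier that always predicts the majority class scores well on accuracy but poorly on the balanced fitness, which is the bias the abstract describes. The Pareto filter leaves several candidates with different trade-offs rather than a single "best" one.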

    Evolving game theory based decision making systems for NETA power market modelling, analysis and trading strategy development

    In this thesis, current work on analyzing strategic behaviours in electricity trading is first reviewed. An intelligent decision-making and support technique, game theory, is often used in market practice. Game theory is a discipline concerned with how individuals make decisions when they are partly aware of how their actions might affect each other and when each individual might take this into account. Deficiencies and limitations of traditional game theory based methods developed for decision-making in electricity trading are also investigated. This research then explores the impact of intelligent systems based trading strategies in the UK power markets. To model these behaviours and the New Electricity Trading Arrangements (NETA) system of the UK, traditional competitive and cooperative game theory strategies are taken into account in the work reported in this thesis. An improved methodology, “trigger price strategy”, is introduced to simulate power generation companies’ enhanced gaming strategies. Such a modelling problem is, however, intractable, and hence an extra-numerical search technique, Evolutionary Computation, is employed to solve the game theory based system modelling problem. An encoded Genetic Algorithm based technique is developed to search for an effective model for the complex decision-making process and to help decision-makers evaluate their strategies and bidding parameters. A novel and effective electricity trading simulation model is thus developed, whose design features and scale are as close as possible to NETA. A complex and more realistic two-sided transaction mechanism, with demand fully incorporated, is included in this model. These are a world first in this research area.

    30th Anniversary of Applied Intelligence: A combination of bibliometrics and thematic analysis using SciMAT

    Applied Intelligence is one of the most important international scientific journals in the field of artificial intelligence. Since 1991, Applied Intelligence has been oriented to support research advances in new and innovative intelligent systems, methodologies, and their applications in solving real-life complex problems. Applied Intelligence hosts more than 2,400 publications, which have received around 31,800 citations. Moreover, Applied Intelligence is recognized by the industrial, academic, and scientific communities as a source of the latest innovative and advanced solutions in intelligent manufacturing, privacy-preserving systems, risk analysis, knowledge-based management, modern techniques to improve healthcare systems, methods to assist government, and solving industrial problems that are too complex to be solved through conventional approaches. Bearing in mind that Applied Intelligence celebrates its 30th anniversary in 2021, it is appropriate to analyze its bibliometric performance, conceptual structure, and thematic evolution. To do that, this paper conducts a bibliometric performance and conceptual structure analysis of Applied Intelligence from 1991 to 2020 using SciMAT. First, the performance of the journal is analyzed according to the data retrieved from Scopus, focusing on the productivity of the authors, citations, countries, organizations, funding agencies, and most relevant publications. Then, the conceptual structure of the journal is analyzed with the bibliometric software tool SciMAT, identifying the main thematic areas that have been the object of research and their composition, relationship, and evolution during the period analyzed.

    Survey on highly imbalanced multi-class data

    Machine learning technology has a massive impact on society because it offers solutions to many complicated problems such as classification, clustering analysis, and prediction, especially during the COVID-19 pandemic. Data distribution in machine learning has been an essential aspect in providing unbiased solutions. From the earliest literature published on highly imbalanced data until recently, machine learning research has focused mostly on binary classification problems. Research on highly imbalanced multi-class data remains largely unexplored, even though better analysis and prediction are needed when handling Big Data. This study reviews the models and techniques for handling highly imbalanced multi-class data, along with their strengths and weaknesses and related domains. Furthermore, the paper uses a statistical method to explore a case study with a severely imbalanced dataset. This article aims to (1) understand the trend of highly imbalanced multi-class data through analysis of the related literature; (2) analyze the previous and current methods of handling highly imbalanced multi-class data; (3) construct a framework of highly imbalanced multi-class data. An analysis of the chosen highly imbalanced multi-class dataset will also be performed and adapted to the current methods or techniques in machine learning, followed by discussions on open challenges and the future direction of highly imbalanced multi-class data. Finally, for highly imbalanced multi-class data, this paper presents a novel framework. We hope this research can provide insights on the potential development of better methods or techniques to handle and manipulate highly imbalanced multi-class data.

    Adaptive Heterogeneous Multi-Population Cultural Algorithm

    Optimization problems are a class of problems where the goal is to make a system as effective as possible. The goal of this research area is to design an algorithm to solve optimization problems effectively and efficiently. Being effective means that the algorithm should be able to find the optimal solution (or near optimal solutions), while efficiency refers to the computational effort required by the algorithm to find an optimal solution. In other words, an optimization algorithm should be able to find the optimal solution in an acceptable time. Therefore, the aim of this dissertation is to come up with a new algorithm that is both effective and efficient. There are various kinds of algorithms proposed to deal with optimization problems. Evolutionary Algorithms (EAs) are a subset of population-based methods which are successfully applied to solve optimization problems. In this dissertation the area of evolutionary methods, and especially Cultural Algorithms (CAs), is investigated. The results of this investigation reveal that there is some room for improving the existing EAs. Consequently, a number of EAs are proposed to deal with different optimization problems. The proposed EAs offer better performance compared to the state-of-the-art methods. The main contribution of this dissertation is to introduce a new architecture for optimization algorithms which is called Heterogeneous Multi-Population Cultural Algorithm (HMP-CA). The new architecture first incorporates a decomposition technique to divide the given problem into a number of sub-problems, and then it assigns the sub-problems to different local CAs to be optimized separately in parallel. In order to evaluate the proposed architecture, it is applied to numerical optimization problems. The evaluation results reveal that HMP-CA is fully effective, in that it finds the optimal solution in every single run.
Furthermore, HMP-CA outperforms the state-of-the-art methods by offering a more efficient performance. The proposed HMP-CA is further improved by incorporating an adaptive decomposition technique. The improved version, which is called Adaptive HMP-CA (A-HMP-CA), is evaluated over large scale global optimization problems. The results of this evaluation show that HMP-CA significantly outperforms the state-of-the-art methods in terms of both effectiveness and efficiency.
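The decomposition idea described above (split the problem's dimensions into sub-problems and optimize each separately) can be sketched as follows. This is not HMP-CA: the local optimizer here is a plain hill climber standing in for a local Cultural Algorithm, the groups are processed one after another rather than in parallel, and every name (`decompose`, `optimize_group`, `cooperative_search`) and parameter value is an assumption made for illustration.

```python
import random
from typing import Callable, List, Tuple


def sphere(x: List[float]) -> float:
    """Simple separable test objective; its minimum is 0 at the origin."""
    return sum(v * v for v in x)


def decompose(n_dims: int, n_groups: int) -> List[List[int]]:
    """Split the dimension indices into n_groups groups, round-robin."""
    return [list(range(g, n_dims, n_groups)) for g in range(n_groups)]


def optimize_group(objective: Callable[[List[float]], float],
                   context: List[float],
                   dims: List[int],
                   rng: random.Random,
                   steps: int = 200,
                   sigma: float = 0.5) -> List[float]:
    """Hill-climb only the given dimensions; the rest stay at their context values.

    Stand-in for a local optimizer assigned to one sub-problem.
    """
    best = list(context)
    best_val = objective(best)
    for _ in range(steps):
        cand = list(best)
        for d in dims:
            cand[d] += rng.gauss(0.0, sigma)
        val = objective(cand)
        if val < best_val:  # accept improvements only
            best, best_val = cand, val
    return best


def cooperative_search(objective: Callable[[List[float]], float],
                       n_dims: int = 6,
                       n_groups: int = 3,
                       cycles: int = 5,
                       seed: int = 0) -> Tuple[float, float]:
    """Optimize each group of dimensions in turn against a shared solution vector."""
    rng = random.Random(seed)
    context = [rng.uniform(-5.0, 5.0) for _ in range(n_dims)]
    start_val = objective(context)
    for _ in range(cycles):
        for dims in decompose(n_dims, n_groups):
            context = optimize_group(objective, context, dims, rng)
    return start_val, objective(context)


start, end = cooperative_search(sphere)
print(f"objective before: {start:.4f}, after: {end:.4f}")
```

Because each sub-problem only perturbs its own dimensions and only improvements are accepted, the shared solution never gets worse from one cycle to the next. How the dimensions are grouped matters for non-separable objectives, which is the motivation the abstract gives for an adaptive decomposition.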

    Hybrid approaches to optimization and machine learning methods: a systematic literature review

    Real problems are increasingly complex and require sophisticated models and algorithms capable of quickly dealing with large data sets and finding optimal solutions. However, there is no perfect method or algorithm; all of them have some limitations that can be mitigated or eliminated by combining the skills of different methodologies. In this way, it is expected to develop hybrid algorithms that can take advantage of the potential and particularities of each method (optimization and machine learning) to integrate methodologies and make them more efficient. This paper presents an extensive systematic and bibliometric literature review on hybrid methods involving optimization and machine learning techniques for clustering and classification. It aims to identify the potential of methods and algorithms to overcome the difficulties of one or both methodologies when combined. After the description of optimization and machine learning methods, a numerical overview of the works published since 1970 is presented. Moreover, an in-depth state-of-the-art review of the last three years is presented. Furthermore, a SWOT analysis of the ten most cited algorithms in the collected database is performed, investigating the strengths and weaknesses of the pure algorithms and identifying the opportunities and threats that have been explored with hybrid methods. Thus, with this investigation, it was possible to highlight the most notable works and discoveries involving hybrid methods in terms of clustering and classification, and also to point out the difficulties of the pure methods and algorithms that can be addressed by drawing on other methodologies, that is, through hybrid methods.
    Open access funding provided by FCT|FCCN (b-on). This work has been supported by FCT (Fundação para a Ciência e Tecnologia) within the R&D Units Project Scope: UIDB/00319/2020. Beatriz Flamia Azevedo is supported by FCT Grant Reference SFRH/BD/07427/2021. The authors are grateful to the Foundation for Science and Technology (FCT, Portugal) for financial support through national funds FCT/MCTES (PIDDAC) to CeDRI (UIDB/05757/2020 and UIDP/05757/2020) and SusTEC (LA/P/0007/2021).