1,992 research outputs found

    Neuroevolutionary learning in nonstationary environments

    This work presents a new neuroevolutionary model, called NEVE (Neuroevolutionary Ensemble), based on an ensemble of Multi-Layer Perceptron (MLP) neural networks for learning in nonstationary environments. NEVE uses quantum-inspired evolutionary models to automatically configure the ensemble members and combine their outputs. The quantum-inspired evolutionary models identify the most appropriate topology for each MLP network, select the most relevant input variables, determine the neural network weights and calculate the voting weight of each ensemble member. Four variants of NEVE are developed, differing in the mechanism for detecting and treating concept drifts, including proactive drift detection approaches. The proposed models were evaluated on real and artificial datasets, and the results were compared with those of other established models in the literature. The results show that the accuracy of NEVE is higher in most cases and that the best configurations are obtained when some mechanism for drift detection is used. These results reinforce that the neuroevolutionary ensemble approach is a robust choice for situations in which the datasets are subject to sudden changes in behaviour.
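
    The abstract above mentions a voting weight for each ensemble member. As a minimal sketch of how weighted voting combines member outputs, assuming the weights are already given (in NEVE they are found by the quantum-inspired evolutionary model, which is not reproduced here), one might write the following. The function name `weighted_vote` and the example weights are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the class label with the largest total voting weight.

    predictions: one predicted label per ensemble member.
    weights: one non-negative voting weight per member (assumed given).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight  # accumulate support for each label
    return max(scores, key=scores.get)


# Three hypothetical members: one confident vote for "up", two weaker for "down".
decision = weighted_vote(["up", "down", "down"], [0.6, 0.2, 0.3])
```

    With these weights a single heavily weighted member can outvote two lightly weighted ones, which is the point of learning the weights rather than using a plain majority.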

    Evolving Ensemble Fuzzy Classifier

    Ensemble learning offers a promising avenue for learning from data streams in complex environments because it addresses the bias-variance dilemma better than its single-model counterpart and features a reconfigurable structure, which is well suited to this context. While various extensions of ensemble learning for mining non-stationary data streams can be found in the literature, most of them are built on a static base classifier and revisit preceding samples in a sliding window for a retraining step. This leads to prohibitive computational complexity and is not flexible enough to cope with rapidly changing environments. Their complexity is often demanding because they involve a large collection of offline classifiers, lack a mechanism for reducing structural complexity and lack an online feature selection mechanism. A novel evolving ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in this paper. pENsemble differs from existing architectures in that it is built upon an evolving classifier for data streams, termed Parsimonious Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism, which estimates a localized generalization error of a base classifier. A dynamic online feature selection scenario is integrated into pENsemble, allowing input features to be selected and deselected on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision and features a novel drift detection scenario to grow the ensemble structure. The efficacy of pENsemble has been demonstrated through rigorous numerical studies with dynamic and evolving data streams, where it delivers the most encouraging performance in attaining a tradeoff between accuracy and complexity.
    Comment: this paper has been published in IEEE Transactions on Fuzzy Systems.

    Machine Learning for Financial Prediction Under Regime Change Using Technical Analysis: A Systematic Review

    Recent crises, recessions and bubbles have stressed the non-stationary nature of the financial domain and the presence of drastic structural changes in it. The most recent literature suggests the use of conventional machine learning and statistical approaches in this context. Unfortunately, several of these techniques are unable or slow to adapt to changes in the price-generation process. This study surveys the relevant literature on machine learning for financial prediction under regime change, employing a systematic approach. It reviews key papers with a special emphasis on technical analysis. The study discusses the growing number of contributions that are bridging the gap between two separate communities, one focused on data stream learning and the other on economic research. However, it also makes apparent that we are still at an early stage. The range of machine learning algorithms that have been tested in this domain is very wide, but the results of the study do not suggest that any specific technique is currently clearly dominant.

    Data-driven Soft Sensors in the Process Industry

    In the last two decades Soft Sensors have established themselves as a valuable alternative to the traditional means for the acquisition of critical process variables, process monitoring and other tasks related to process control. This paper discusses characteristics of process industry data which are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, such as the chemical industry, bioprocess industry, steel industry, etc. The focus of this work is on data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and huge, though not yet fully realised, potential. The main contributions of this work are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of some open issues in Soft Sensor development and maintenance together with their possible solutions.

    Evolving fuzzy and neuro-fuzzy approaches in clustering, regression, identification, and classification: A Survey

    Major assumptions in computational intelligence and machine learning consist of the availability of a historical dataset for model development, and that the resulting model will, to some extent, handle similar instances during its online operation. However, in many real-world applications, these assumptions may not hold, as the amount of previously available data may be insufficient to represent the underlying system, and the environment and the system may change over time. As the amount of data increases, it is no longer feasible to process data efficiently using iterative algorithms, which typically require multiple passes over the same portions of data. Evolving modeling from data streams has emerged as a framework to address these issues through self-adaptation, single-pass learning steps, and evolution as well as contraction of model components on demand and on the fly. This survey focuses on evolving fuzzy rule-based models and neuro-fuzzy networks for clustering, classification, regression and system identification in online, real-time environments where learning and model development should be performed incrementally. (C) 2019 Published by Elsevier Inc.
    Igor Škrjanc, Jose Antonio Iglesias and Araceli Sanchis would like to thank the Chair of Excellence of Universidad Carlos III de Madrid and the Bank of Santander Program for their support. Igor Škrjanc is grateful to the Slovenian Research Agency for the research program P2-0219, Modeling, simulation and control. Daniel Leite acknowledges the Minas Gerais Foundation for Research and Development (FAPEMIG), process APQ-03384-18. Igor Škrjanc and Edwin Lughofer acknowledge the support of the "LCM - K2 Center for Symbiotic Mechatronics" within the framework of the Austrian COMET-K2 program. Fernando Gomide is grateful to the Brazilian National Council for Scientific and Technological Development (CNPq) for grant 305906/2014-3.

    Nature-Inspired Adaptive Architecture for Soft Sensor Modelling

    This paper gives a general overview of the challenges present in the research field of Soft Sensor building and proposes a novel architecture for building Soft Sensors that copes with the identified challenges. The architecture is inspired by, and makes use of, nature-related techniques for computational intelligence. Another aspect addressed by the proposed architecture is the set of identified characteristics of process industry data. Data recorded in the process industry usually contain a certain amount of missing values, or samples exceeding meaningful measurement values, called data outliers. Other properties of process industry data that cause problems for modelling are collinearity of the data, drifting data and the different sampling rates of the particular hardware sensors. These characteristics are the source of the need for adaptive behaviour of Soft Sensors. The architecture reflects this need and provides mechanisms for the adaptation and evolution of the Soft Sensor at different levels. The adaptation capabilities are provided by maintaining a variety of rather simple models. These particular models, called paths in terms of the architecture, can for example focus on different partitions of the input data space, or provide different adaptation speeds to changes in the data. The actual modelling techniques involved in the architecture are data-driven computational learning approaches such as artificial neural networks, principal component regression, etc.

    Scalable Teacher Forcing Network for Semi-Supervised Large Scale Data Streams

    The large-scale data stream problem refers to a high-speed information flow which cannot be processed in a scalable manner on a traditional computing platform. This problem also imposes expensive labelling costs, making the deployment of fully supervised algorithms infeasible. On the other hand, the problem of semi-supervised large-scale data streams is little explored in the literature because most works are designed for traditional single-node computing environments and are also fully supervised approaches. This paper offers the Weakly Supervised Scalable Teacher Forcing Network (WeScatterNet) to cope with the scarcity of labelled samples and large-scale data streams simultaneously. WeScatterNet is built on the Apache Spark distributed computing platform, with a data-free model fusion strategy for model compression after the parallel computing stage. It features an open network structure to address the global and local drift problems while integrating a data augmentation, annotation and auto-correction ($DA^3$) method for handling partially labelled data streams. The performance of WeScatterNet is numerically evaluated on six large-scale data stream problems with only 25% label proportions. It shows highly competitive performance even when compared with fully supervised learners with 100% label proportions.
    Comment: This paper has been accepted for publication in Information Sciences.

    DetectA: abrupt concept drift detection in non-stationary environments

    Almost all drift detection mechanisms designed for classification problems work reactively: after receiving the complete data set (input patterns and class labels), they apply a sequence of procedures to identify some change in the class-conditional distribution, that is, a concept drift. However, detecting changes after their occurrence can in some situations be harmful to the process under analysis. This paper proposes a proactive approach for abrupt drift detection, called DetectA (Detect Abrupt Drift). Briefly, this method is composed of three steps: (i) label the patterns from the test set (an unlabelled data block) using an unsupervised method; (ii) compute some statistics from the train and test sets, conditioned on the given class labels for the train set; and (iii) compare the training and testing statistics using a multivariate hypothesis test. Based on the results of the hypothesis tests, the method attempts to detect the drift on the test set before the real labels are obtained. A procedure for creating datasets with abrupt drift is proposed to perform a sensitivity analysis of the DetectA model. The result of the sensitivity analysis suggests that the detector is efficient and suitable for high-dimensional datasets, blocks with any proportion of drifts, and datasets with class imbalance. The performance of the DetectA method, with different configurations, was also evaluated on real and artificial datasets, using an MLP as a classifier. The best results were obtained using one of the detection methods, with the proactive approach a top contender for improving the accuracy of the underlying base classifier.
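
    The three steps listed in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the DetectA implementation: nearest-centroid assignment stands in for the unsupervised labelling step, a per-feature Welch t-statistic stands in for the multivariate hypothesis test, and the names `detect_drift`, `pseudo_label`, `welch_t` and the threshold of 4.0 are all assumptions made for this example.

```python
import math
from statistics import mean, variance


def group_by_class(X, y):
    """Group feature rows by their class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def pseudo_label(X_test, centroids):
    """Step (i): label each unlabelled row with its nearest training centroid."""
    return [min(centroids, key=lambda c: math.dist(row, centroids[c]))
            for row in X_test]


def welch_t(a, b):
    """Welch t-statistic for two samples of a single feature."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if mean(a) == mean(b) else math.inf
    return (mean(a) - mean(b)) / se


def detect_drift(X_train, y_train, X_test, threshold=4.0):
    """Flag drift if any class-conditional feature mean shifts strongly."""
    train = group_by_class(X_train, y_train)
    centroids = {c: [mean(col) for col in zip(*rows)]
                 for c, rows in train.items()}
    test = group_by_class(X_test, pseudo_label(X_test, centroids))
    for c, train_rows in train.items():
        test_rows = test.get(c, [])
        if len(train_rows) < 2 or len(test_rows) < 2:
            continue  # too few samples to estimate a variance
        # Steps (ii) and (iii): compare per-class statistics feature by feature.
        for tr_col, te_col in zip(zip(*train_rows), zip(*test_rows)):
            if abs(welch_t(tr_col, te_col)) > threshold:
                return True
    return False
```

    Because the test block is labelled without its true classes, a check of this kind can run before the real labels arrive, which is the proactive idea the abstract describes.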

    Discovering three-dimensional patterns in real-time from data streams: An online triclustering approach

    Triclustering algorithms group sets of coordinates of 3-dimensional datasets. In this paper, a new triclustering approach for data streams is introduced. It follows a streaming scheme of learning in two steps: an offline phase and an online phase. First, the offline phase provides a summary model with the components of the triclusters. Then, the online phase deals with the streaming data. This online phase uses the summary model obtained in the offline stage to update the triclusters as fast as possible with genetic operators. Results using three types of synthetic datasets and a real-world environmental sensor dataset are reported. The performance of the proposed triclustering streaming algorithm is compared to that of a batch triclustering algorithm, showing accurate performance both in terms of quality and running times.
    Ministerio de Ciencia, Innovación y Universidades TIN2017-88209-C