On the Creation of Diverse Ensembles for Nonstationary Environments using Bio-inspired Heuristics

Abstract

Recently, adaptive models for dynamic data environments have become a hot topic due to the vast number of scenarios that generate nonstationary data streams. When a change in the data distribution (concept drift) occurs, the ensembles of models trained over these data sources become obsolete and do not adapt suitably to the new distribution. Although most research in the field focuses on detecting this drift in order to re-train the ensemble, the importance of ensemble diversity shortly after the drift, as a means of reducing the initial drop in accuracy, is widely known. In a Big Data scenario in which the data (and also the number of past models) can be huge, achieving the most diverse ensemble implies evaluating all possible combinations of models, which is not an easy task to carry out quickly in the long term. This challenge can be formulated as an optimization problem, for which bio-inspired algorithms can play a key role. This is precisely the goal of this manuscript: to validate the relevance of diversity right after drifts, and to unveil how to achieve a highly diverse ensemble by using a self-learning optimization technique.
