GOOWE: Geometrically Optimum and Online-Weighted Ensemble Classifier for
  Evolving Data Streams

Bonab, Hamed R.; Can, Fazli

Authors: Hamed R. Bonab, Fazli Can
Publication date: 7 September 2017

Abstract

Designing adaptive classifiers for an evolving data stream is a challenging task due to the data size and its dynamically changing nature. Combining individual classifiers in an online setting, the ensemble approach, is a well-known solution. It is possible that a subset of classifiers in the ensemble outperforms others in a time-varying fashion. However, optimum weight assignment for component classifiers is a problem which is not yet fully addressed in online evolving environments. We propose a novel data stream ensemble classifier, called Geometrically Optimum and Online-Weighted Ensemble (GOOWE), which assigns optimum weights to the component classifiers using a sliding window containing the most recent data instances. We map vote scores of individual classifiers and true class labels into a spatial environment. Based on the Euclidean distance between vote scores and ideal-points, and using the linear least squares (LSQ) solution, we present a novel, dynamic, and online weighting approach. While LSQ is used for batch mode ensemble classifiers, it is the first time that we adapt and use it for online environments by providing a spatial modeling of online ensembles. In order to show the robustness of the proposed algorithm, we use real-world datasets and synthetic data generators using the MOA libraries. First, we analyze the impact of our weighting system on prediction accuracy through two scenarios. Second, we compare GOOWE with 8 state-of-the-art ensemble classifiers in a comprehensive experimental environment. Our experiments show that GOOWE provides improved reactions to different types of concept drift compared to our baselines. The statistical tests indicate a significant improvement in accuracy, with conservative time and memory requirements.

Comment: 33 Pages, Accepted for publication in The ACM Transactions on Knowledge Discovery from Data (TKDD) in August 201
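
The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only assumes what the abstract states: each component classifier emits a vote-score vector over the classes, the true label is mapped to an ideal point (taken here to be a one-hot vector, which is an assumption), and the weights are the linear least squares solution over a sliding window of recent instances. The names `WindowedLsqWeights`, `add`, `weights`, and `predict` are hypothetical.

```python
from collections import deque

import numpy as np


class WindowedLsqWeights:
    """Least-squares ensemble weights over a sliding window (illustrative only)."""

    def __init__(self, window_size):
        # Oldest instances are dropped automatically once the window is full.
        self.window = deque(maxlen=window_size)

    def add(self, scores, label):
        # scores: (m, p) array, one row of class scores per component classifier.
        # label: integer index of the true class.
        self.window.append((np.asarray(scores, dtype=float), int(label)))

    def weights(self):
        scores = np.stack([s for s, _ in self.window])   # shape (n, m, p)
        labels = np.array([y for _, y in self.window])   # shape (n,)
        n, m, p = scores.shape
        # Assumed ideal point: one-hot vector of the true class.
        ideal = np.eye(p)[labels]                        # shape (n, p)
        # One row per (instance, class) pair, one column per classifier.
        design = scores.transpose(0, 2, 1).reshape(n * p, m)
        target = ideal.reshape(n * p)
        # Minimise the squared Euclidean distance between the weighted
        # combination of score vectors and the ideal points.
        w, *_ = np.linalg.lstsq(design, target, rcond=None)
        return w


def predict(w, scores):
    # Weighted sum of the components' score vectors, then pick the top class.
    return int(np.argmax(w @ np.asarray(scores, dtype=float)))
```

In this sketch a classifier whose scores sit close to the ideal points inside the window receives a larger weight, and the weights shift as older instances leave the window. How often GOOWE recomputes its weights and how it trains or replaces components is not stated in the abstract and is not modelled here.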

Available Versions

Crossref

info:doi/10.1145%2F3139240

Last time updated on 01/05/2021

Bilkent University Institutional Repository

oai:repository.bilkent.edu.tr:...

Last time updated on 03/03/2021