Sliding Reservoir Approach for Delayed Labeling in Streaming Data Classification

Hu, Hanqing; Kantardzic, Mehmed

Authors: Hanqing Hu, Mehmed Kantardzic
Publication date: 4 January 2017
Publisher: AIS Electronic Library (AISeL)

Abstract

When concept drift occurs in streaming data, a streaming data classification framework must update its learning model to maintain performance. In real-world applications, the labeled samples required to train a new model are often not immediately available. This labeling delay can degrade the performance of traditional streaming data classification frameworks. To address this problem, we propose the Sliding Reservoir Approach for Delayed Labeling (SRADL). By combining chunk-based semi-supervised learning with a novel approach to managing labeled data, SRADL does not need to wait for the labeling process to finish before updating the learning model. Experiments with two delayed-label scenarios show that SRADL improves prediction performance over the naïve approach by as much as 7.5% in certain cases. The largest gain occurs in the real-world data experiments with an 18-chunk labeling delay under the continuous label delivery scenario.

Available Versions

AIS Electronic Library (AISeL)

oai:aisel.aisnet.org:hicss-50-...

Last time updated on 17/04/2020

ScholarSpace at University of Hawai'i at Manoa

oai:scholarspace.manoa.hawaii....

Last time updated on 19/02/2017