204 research outputs found
A Comparative Study of Simple Online Learning Strategies for Streaming Data
For several years, the analysis of data streams has attracted considerable attention in various
research fields, such as database systems and data mining. The continuous growth in data volume and the high
speed at which data arrive challenge computing systems to store, process, and transmit them. It has also
driven the development of new online learning strategies capable of predicting the behavior of streaming
data. This paper compares three very simple learning methods applied to static data streams using the
1-Nearest Neighbor classifier, a linear discriminant, a quadratic classifier, a decision tree, and the Naïve Bayes
classifier. The three strategies are taken from the literature. One of them includes a time-weighted strategy
to remove obsolete objects from the reference set. The experiments were carried out on twelve real data sets. The
aim of this experimental study is to establish the most suitable online learning model according to the performance
of each classifier.
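The time-weighted strategy mentioned above can be sketched as follows. This is a minimal, illustrative implementation, not the exact method compared in the paper: it assumes a fixed age window, and the class name, window size, and API are all invented for the example.

```python
import math

class TimeWeighted1NN:
    """1-NN classifier over a stream whose reference set forgets old objects.

    Objects older than `max_age` time steps are pruned before each
    prediction (a simple time-weighted forgetting strategy; illustrative).
    """

    def __init__(self, max_age=100):
        self.max_age = max_age
        self.refs = []  # list of (timestamp, features, label)
        self.t = 0

    def _prune(self):
        # drop reference objects that fell out of the time window
        self.refs = [r for r in self.refs if self.t - r[0] <= self.max_age]

    def predict(self, x):
        self._prune()
        if not self.refs:
            return None
        # nearest neighbour by Euclidean distance over the live reference set
        _, _, label = min(self.refs, key=lambda r: math.dist(r[1], x))
        return label

    def partial_fit(self, x, y):
        self.t += 1
        self.refs.append((self.t, x, y))

model = TimeWeighted1NN(max_age=2)
model.partial_fit((0.0, 0.0), "a")
model.partial_fit((5.0, 5.0), "b")
print(model.predict((0.1, 0.1)))  # "a" while still inside the window
```

Once two further objects arrive, the "a" prototype ages out and the prediction flips, which is exactly the behavior a forgetting strategy is meant to produce on drifting streams.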
Online Linear Discriminant Analysis for Data Streams with Concept Drift
Various methods based on classical classification techniques such as linear discriminant analysis (LDA) have been developed for data streams subject to concept drift. Nevertheless, the updated classifiers of such methods may yield a poor prediction error rate if the underlying distribution keeps changing incrementally. We therefore propose a rather general extension to such methods to improve the forecasting quality. Under some assumptions, we estimate a model for the time-dependent concept drift that is used to predict the forthcoming distributions of the features. These predicted distributions are then used in the LDA to build the classification rule and hence to predict new observations. In a simulation study we consider different kinds of concept drift and compare the new extended methods with the methods they are based on.
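The core idea of forecasting the forthcoming distribution can be illustrated with the simplest possible drift model. This sketch is not the paper's estimator: it assumes each class mean drifts linearly, extrapolates the means one step ahead from two consecutive batches, and classifies by the nearest predicted mean (i.e., LDA with an identity covariance, for brevity). All names are invented for the example.

```python
import numpy as np

def predicted_means(batch_t0, batch_t1):
    """Extrapolate per-class means one step: mu_hat = mu_t1 + (mu_t1 - mu_t0)."""
    out = {}
    for c in batch_t1:
        mu0 = np.mean(batch_t0[c], axis=0)
        mu1 = np.mean(batch_t1[c], axis=0)
        out[c] = mu1 + (mu1 - mu0)  # linear drift assumption
    return out

def classify(x, means):
    # nearest predicted class mean (LDA rule with identity covariance)
    return min(means, key=lambda c: np.linalg.norm(x - means[c]))

# Class "A" drifts by +1 per step along the first axis; "B" is stationary.
t0 = {"A": np.array([[0.0, 0.0], [0.2, 0.0]]), "B": np.array([[5.0, 5.0]])}
t1 = {"A": np.array([[1.0, 0.0], [1.2, 0.0]]), "B": np.array([[5.0, 5.0]])}
mu = predicted_means(t0, t1)
print(classify(np.array([2.1, 0.0]), mu))  # "A": the point sits at A's forecast mean
```

A classifier refit only on the latest batch would place A's mean one step behind; extrapolating the drift moves the decision rule to where the class will be, which is the forecasting improvement the abstract describes.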
Online Continual Learning in Keyword Spotting for Low-Resource Devices via Pooling High-Order Temporal Statistics
Keyword Spotting (KWS) models on embedded devices should adapt fast to new
user-defined words without forgetting previous ones. Embedded devices have
limited storage and computational resources, thus, they cannot save samples or
update large models. We consider the setup of embedded online continual
learning (EOCL), where KWS models with frozen backbone are trained to
incrementally recognize new words from a non-repeated stream of samples, seen
one at a time. To this end, we propose Temporal Aware Pooling (TAP) which
constructs an enriched feature space computing high-order moments of speech
features extracted by a pre-trained backbone. Our method, TAP-SLDA, updates a
Gaussian model for each class on the enriched feature space to effectively use
audio representations. In experimental analyses, TAP-SLDA outperforms
competitors on several setups, backbones, and baselines, bringing a relative
average gain of 11.3% on the GSC dataset.
Comment: INTERSPEECH 202
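The enriched feature space described above can be sketched as pooling high-order temporal statistics of the frame embeddings. This is an illustrative reading of the idea, not the paper's exact TAP formulation: it concatenates per-dimension mean, standard deviation, and standardised higher moments (skewness- and kurtosis-like terms) over the time axis.

```python
import numpy as np

def temporal_moment_pooling(frames, orders=(1, 2, 3, 4)):
    """Pool a (T, D) sequence of frame embeddings into one vector by
    concatenating per-dimension temporal moments (illustrative sketch)."""
    frames = np.asarray(frames, dtype=float)
    mu = frames.mean(axis=0)
    sigma = frames.std(axis=0) + 1e-8          # avoid division by zero
    z = (frames - mu) / sigma                  # standardised frames
    feats = []
    for k in orders:
        if k == 1:
            feats.append(mu)
        elif k == 2:
            feats.append(sigma)
        else:
            feats.append((z ** k).mean(axis=0))  # standardised k-th moment
    return np.concatenate(feats)

x = np.random.default_rng(0).normal(size=(50, 8))  # T=50 frames, D=8 dims
v = temporal_moment_pooling(x)
print(v.shape)  # (32,) = 4 moments x 8 dimensions
```

A per-class Gaussian model (as in SLDA) can then be updated on these pooled vectors one sample at a time, which fits the frozen-backbone, no-replay constraints of embedded online continual learning.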
cPNN: Continuous Progressive Neural Networks for Evolving Streaming Time Series
Dealing with an unbounded data stream involves overcoming the assumption that data are independent and identically distributed. A data stream can, in fact, exhibit temporal dependencies (i.e., be a time series), and data can change distribution over time (concept drift). Both problems have been studied extensively, but existing solutions address them separately: a joint solution is absent. In addition, learning multiple concepts implies remembering the past (a.k.a. avoiding catastrophic forgetting, in Neural Networks’ terminology). This work proposes Continuous Progressive Neural Networks (cPNN), a solution that tames concept drift, handles temporal dependencies, and bypasses catastrophic forgetting. cPNN is a continuous version of Progressive Neural Networks, a methodology for remembering old concepts and transferring past knowledge to fit new concepts quickly. We base our method on Recurrent Neural Networks and exploit Stochastic Gradient Descent applied to data streams with temporal dependencies. Results of an ablation study show quick adaptation of cPNN to new concepts and robustness to drift.
Two Procedures for Robust Monitoring of Probability Distributions of Economic Data Streams induced by Depth Functions
Data streams (streaming data) consist of transiently observed, multidimensional
data sequences that evolve over time and challenge our computational and/or
inferential capabilities. In this paper we propose user-friendly approaches for
robust monitoring of selected properties of the unconditional and conditional
distribution of the stream, based on depth functions. Our proposals are robust
to a small fraction of outliers and/or inliers, yet at the same time sensitive
to a regime change of the stream. Their implementations are available in our
free R package DepthProc.
Comment: Operations Research and Decisions, vol. 25, No. 1, 201
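The flavor of depth-based monitoring can be sketched with the simplest depth function. This is not one of the DepthProc procedures: it uses a Mahalanobis-type depth D(x) = 1 / (1 + d²(x)) with a coordinatewise median as the centre, so a few outliers in the reference window barely move it, and flags new points whose depth falls below a threshold.

```python
import numpy as np

def mahalanobis_depth(x, center, cov_inv):
    """Mahalanobis-type depth: 1 at the centre, decaying toward 0 outward."""
    d2 = (x - center) @ cov_inv @ (x - center)
    return 1.0 / (1.0 + d2)

def monitor(window, new_points, threshold=0.1):
    """Flag each new point whose depth w.r.t. the reference window is low."""
    window = np.asarray(window, dtype=float)
    center = np.median(window, axis=0)  # robust centre: outliers barely move it
    cov_inv = np.linalg.inv(np.cov(window, rowvar=False))
    return [mahalanobis_depth(p, center, cov_inv) < threshold
            for p in np.asarray(new_points, dtype=float)]

rng = np.random.default_rng(1)
window = rng.normal(size=(200, 2))               # in-control history
flags = monitor(window, [[0.1, 0.0], [8.0, 8.0]])
print(flags)  # central point passes, the distant one is flagged
```

Because the centre is a median rather than a mean, a small fraction of contaminated observations in the window leaves the monitoring rule nearly unchanged, while a genuine regime change drops the depth of incoming points and raises flags.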
- …