446 research outputs found

Modeling concept drift: A probabilistic graphical model based approach

Author: A Bifet
FV Jensen
G Widmer
GF Cooper
J Gama
JC Schlimmer
JM Winn
MI Jordan
MM Gaber
RO Duda
S Zhong
T Hastie
Publication date: 01/01/2015

An often used approach for detecting and adapting to concept drift when doing classification is to treat the data as i.i.d. and use changes in classification accuracy as an indication of concept drift. In this paper, we take a different perspective and propose a framework, based on probabilistic graphical models, that explicitly represents concept drift using latent variables. To ensure efficient inference and learning, we resort to a variational Bayes inference scheme. As a proof of concept, we demonstrate and analyze the proposed framework using synthetic data sets as well as a real financial data set from a Spanish bank.
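
The accuracy-based baseline that this abstract contrasts itself with can be illustrated by a minimal sketch. It is not the paper's latent-variable framework; the class name `AccuracyDriftMonitor`, the window size, and the drop threshold are hypothetical choices made only for illustration.

```python
from collections import deque


class AccuracyDriftMonitor:
    """Flags possible concept drift when recent accuracy falls below a reference level.

    Hypothetical helper: it implements only the simple "watch the accuracy"
    heuristic, not the probabilistic graphical model proposed in the paper.
    """

    def __init__(self, window=50, drop=0.2):
        self.window = window    # number of recent predictions tracked (assumed value)
        self.drop = drop        # accuracy drop treated as a drift signal (assumed value)
        self.recent = deque(maxlen=window)
        self.reference = None   # accuracy measured on the first full window

    def update(self, correct):
        """Record one prediction outcome; return True if drift is suspected."""
        self.recent.append(1 if correct else 0)
        if len(self.recent) < self.window:
            return False        # not enough evidence yet
        accuracy = sum(self.recent) / self.window
        if self.reference is None:
            self.reference = accuracy   # first full window becomes the baseline
            return False
        return self.reference - accuracy >= self.drop
```

A caller would pass `update` the outcome of each prediction as labels arrive and retrain or reset the classifier when it returns `True`.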

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

VBN

NORA - Norwegian Open Research Archives

Repositorio Institucional de la Universidad de Almería (Spain)

A review of domain adaptation without target labels

Author: Kouw Wouter M.
Loog Marco
Publication date: 01/01/2019

Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting and representing features such that a source classifier performs well on the target domain. Inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research. Comment: 20 pages, 5 figures

arXiv.org e-Print Archive

Crossref

Online Deep Learning from Doubly-Streaming Data

Author: Atwood John S.
He Yi
Hou Bo-Jian
Lian Heng
Wu Jian
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022

This paper investigates a new online learning problem with doubly-streaming data, where the data streams are described by feature spaces that constantly evolve, with new features emerging and old features fading away. A plausible idea to deal with such data streams is to establish a relationship between the old and new feature spaces, so that an online learner can leverage the knowledge learned from the old features to better the learning performance on the new features. Unfortunately, this idea does not scale up to high-dimensional multimedia data with complex feature interplay, which suffers a tradeoff between onlineness, which biases shallow learners, and expressiveness, which requires deep models. Motivated by this, we propose a novel OLD3S paradigm, where a shared latent subspace is discovered to summarize information from the old and new feature spaces, building an intermediate feature mapping relationship. A key trait of OLD3S is to treat the model capacity as a learnable semantics, aiming to yield optimal model depth and parameters jointly in accordance with the complexity and non-linearity of the input data streams in an online fashion. Both theoretical analysis and empirical studies substantiate the viability and effectiveness of our proposed approach. The code is available online at https://github.com/X1aoLian/OLD3S

arXiv.org e-Print Archive

Old Dominion University

Ensemble and continual federated learning for classification tasks

Author: Barro Ameneiro Senén
Estévez Casado Fernando
Iglesias Rodríguez Roberto
Lema Pais Dylan
Vázquez Regueiro Carlos
Publication venue: Springer
Publication date: 01/01/2023

Federated learning is the state-of-the-art paradigm for training a learning model collaboratively across multiple distributed devices while ensuring data privacy. Under this framework, different algorithms have been developed in recent years and have been successfully applied to real use cases. The vast majority of work in federated learning assumes static datasets and relies on the use of deep neural networks. However, in real-world problems, it is common to have a continual data stream, which may be non-stationary, leading to phenomena such as concept drift. Besides, there are many multi-device applications where other, non-deep strategies are more suitable, due to their simplicity, explainability, or generalizability, among other reasons. In this paper we present Ensemble and Continual Federated Learning, a federated architecture based on ensemble techniques for solving continual classification tasks. We propose the global federated model to be an ensemble, consisting of several independent learners, which are locally trained. Thus, we enable a flexible aggregation of heterogeneous client models, which may differ in size, structure, or even algorithmic family. This ensemble-based approach, together with drift detection and adaptation mechanisms, also allows for continual adaptation in situations where data distribution changes over time. In order to test our proposal and illustrate how it works, we have evaluated it in different tasks related to human activity recognition using smartphones.

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This research has received financial support from AEI/FEDER (European Union) Grant Number PID2020-119367RB-I00, as well as the Consellería de Cultura, Educación e Universidade of Galicia (accreditation ED431G-2019/04, ED431G2019/01, and ED431C2018/29), and the European Regional Development Fund (ERDF). It has also been supported by the Ministerio de Universidades of Spain in the FPU 2017 program (FPU17/04154).

Repositorio da Universidade da Coruña

Repositorio Institucional da Universidade de Santiago de Compostela

Incremental learning algorithms and applications

Author: Gepperth Alexander
Hammer Barbara
Publication venue: HAL CCSD
Publication date: 01/01/2016

Incremental learning refers to learning from streaming data, which arrive over time, with limited memory resources and, ideally, without sacrificing model accuracy. This setting fits different application scenarios where lifelong learning is relevant, e.g. due to changing environments, and it offers an elegant scheme for big data processing by means of its sequential treatment. In this contribution, we formalise the concept of incremental learning, we discuss particular challenges which arise in this setting, and we give an overview of popular approaches, their theoretical foundations, and applications which have emerged in recent years.

INRIA a CCSD electronic archive server

Novel Methods for Approximate Bayesian Inference of Independent and Evolutionarily Dependent Data

Author: Fearn James A
Publication date: 02/12/2021

Explore Bristol Research