Learning with a Drifting Target Concept

Authors: D Haussler; DP Helmbold; M-F Balcan; PL Bartlett; PM Long; R El-Yaniv; RD Barve; S Dasgupta; S Hanneke; V Vapnik
Publication date: 19/05/2015

We study the problem of learning in the presence of a drifting target concept. Specifically, we provide bounds on the error rate at a given time, given a learner with access to a history of independent samples labeled according to a target concept that can change on each round. One of our main contributions is a refinement of the best previous results for polynomial-time algorithms for the space of linear separators under a uniform distribution. We also provide general results for an algorithm capable of adapting to a variable rate of drift of the target concept. Some of the results also describe an active learning variant of this setting, and provide bounds on the number of queries for the labels of points in the sequence sufficient to obtain the stated bounds on the error rates.

arXiv.org e-Print Archive

Crossref

An Adaptive Algorithm for Learning with Unknown Distribution Drift

Authors: Mazzetto Alessio; Upfal Eli
Publication date: 08/06/2023

We develop and analyze a general technique for learning with an unknown distribution drift. Given a sequence of independent observations from the last T steps of a drifting distribution, our algorithm agnostically learns a family of functions with respect to the current distribution at time T. Unlike previous work, our technique does not require prior knowledge about the magnitude of the drift. Instead, the algorithm adapts to the sample data. Without explicitly estimating the drift, the algorithm learns a family of functions with almost the same error as a learning algorithm that knows the magnitude of the drift in advance. Furthermore, since our algorithm adapts to the data, it can guarantee a better learning error than an algorithm that relies on loose bounds on the drift.

arXiv.org e-Print Archive

Regularization and Optimal Multiclass Learning

Authors: Asilis Julian; Devic Siddartha; Dughmi Shaddin; Sharan Vatsal; Teng Shang-Hua
Publication date: 24/09/2023

The quintessential learning algorithm of empirical risk minimization (ERM) is known to fail in various settings for which uniform convergence does not characterize learning. It is therefore unsurprising that the practice of machine learning is rife with considerably richer algorithmic techniques for successfully controlling model capacity. Nevertheless, no such technique or principle has broken away from the pack to characterize optimal learning in these more general settings. The purpose of this work is to characterize the role of regularization in perhaps the simplest setting for which ERM fails: multiclass learning with arbitrary label sets. Using one-inclusion graphs (OIGs), we exhibit optimal learning algorithms that dovetail with tried-and-true algorithmic principles: Occam's Razor as embodied by structural risk minimization (SRM), the principle of maximum entropy, and Bayesian reasoning. Most notably, we introduce an optimal learner which relaxes structural risk minimization on two dimensions: it allows the regularization function to be "local" to datapoints, and uses an unsupervised learning stage to learn this regularizer at the outset. We justify these relaxations by showing that they are necessary: removing either dimension fails to yield a near-optimal learner. We also extract from OIGs a combinatorial sequence we term the Hall complexity, which is the first to characterize a problem's transductive error rate exactly. Lastly, we introduce a generalization of OIGs and the transductive learning setting to the agnostic case, where we show that optimal orientations of Hamming graphs -- judged using nodes' outdegrees minus a system of node-dependent credits -- characterize optimal learners exactly. We demonstrate that an agnostic version of the Hall complexity again characterizes error rates exactly, and exhibit an optimal learner using maximum entropy programs.

arXiv.org e-Print Archive

Concept drift learning and its application to adaptive information filtering

Author: Widyantoro Dwi Hendratmo
Publication venue: Texas A&M University
Publication date: 30/09/2004

Tracking the evolution of user interests is a problem instance of concept drift learning. Keeping track of multiple interest categories is a natural phenomenon as well as an interesting tracking problem because interests can emerge and diminish at different time frames. The first part of this dissertation presents a Multiple Three-Descriptor Representation (MTDR) algorithm, a novel algorithm for learning concept drift especially built for tracking the dynamics of multiple target concepts in the information filtering domain. The learning process of the algorithm combines the long-term and short-term interest (concept) models in an attempt to benefit from the strengths of both models. The MTDR algorithm improves over existing concept drift learning algorithms in the domain. Being able to track multiple target concepts with a few examples poses an even more important and challenging problem because casual users tend to be reluctant to provide the examples needed, and learning from only a few labeled examples is generally difficult. The second part presents a computational Framework for Extending Incomplete Labeled Data Stream (FEILDS). The system modularly extends the capability of an existing concept drift learner in dealing with an incompletely labeled data stream. It expands the learner's original input stream with relevant unlabeled data; the process generates a new stream with improved learnability. FEILDS employs a concept formation system for organizing its input stream into a concept (cluster) hierarchy. The system uses the concept and cluster hierarchy to identify the instance's concept and unlabeled data relevant to a concept. It also adopts the persistence assumption in temporal reasoning for inferring the relevance of concepts. Empirical evaluation indicates that FEILDS is able to improve the performance of existing learners, particularly when learning from a stream with few labeled examples.
Lastly, a new concept formation algorithm, one of the key components in the FEILDS architecture, is presented. The main idea is to discover intrinsic hierarchical structures regardless of the class distribution and the shape of the input stream. Experimental evaluation shows that the algorithm is relatively robust to input ordering, consistently producing a hierarchy structure of high quality.

Texas A&M Repository

Advances in Non-Stationary Sequential Decision-Making

Author: Suk Joseph
Publication date: 01/01/2024

We study the problem of sequential decision-making (e.g. multi-armed bandits, contextual bandits, reinforcement learning) under changing environments, or distribution shifts. Ideally, one aims to automatically adapt/self-tune to unknown changes in distribution, and restart exploration as needed. While recent theoretical breakthroughs show this is possible in a broad sense, such works contend that the learner should restart procedures upon experiencing any change leading to worst-case (regret) rates. This leaves open whether faster rates are possible, adaptively, if few changes in distribution are actually severe, e.g., involve no change in best action. This thesis initiates a broad research program giving positive answers to these open questions across several instances. In particular, we begin at non-stationary bandits and show a much weaker notion of change can be adapted to, which can yield significantly faster rates than previously known, whether as expressed in terms of number of best action switches--for which no adaptive procedure was known, or in terms of previously studied variation or smoothness measures. We then generalize these results to non-parametric contextual bandits and dueling bandits. As a result, we substantially improve the theoretical state-of-the-art performance guarantees for these problems and, in many cases, tightly characterize the statistical limits of sequential decision-making under changing environments.

Columbia University Academic Commons