Selective Sampling with Drift
Recently there has been much work on selective sampling, an online active
learning setting, in which algorithms work in rounds. On each round an
algorithm receives an input and makes a prediction. It can then decide whether
to query the label and, if so, update its model; otherwise the input is
discarded. Most of this work focuses on the stationary case, where it is
assumed that there is a fixed target model against which the algorithm's
performance is compared. However, in many real-world applications, such as
spam prediction, the best target function may drift gradually over time or
shift abruptly from time to time. We develop a novel selective sampling
algorithm for the drifting setting, analyze it under no assumptions on the
mechanism generating the sequence of instances, and derive new mistake bounds
that depend on the amount of drift in the problem. Simulations on synthetic and
real-world datasets demonstrate the advantage of our algorithm over existing
selective sampling methods in the drifting setting.
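The round structure described above can be sketched with a standard margin-based randomized query rule (query with probability b/(b+|margin|), in the style of classical selective sampling analyses). The function name and the perceptron-style update are illustrative choices, not the paper's drift-aware algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def selective_sampling(stream, d, b=1.0):
    """Run one pass of margin-based selective sampling.

    `stream` yields (x, get_label) pairs; get_label() is invoked only
    when the algorithm decides to query, so unqueried labels are never
    observed. Returns the predictions and the number of queried labels.
    """
    w = np.zeros(d)                      # current linear model
    preds, queries = [], 0
    for x, get_label in stream:
        margin = w @ x
        preds.append(1.0 if margin >= 0 else -1.0)
        # Randomized query rule: uncertain (small-margin) inputs are
        # queried with high probability, confident ones are discarded.
        if rng.random() < b / (b + abs(margin)):
            y = get_label()
            queries += 1
            if y * margin <= 0:          # update only on queried mistakes
                w = w + y * x            # perceptron-style step
    return np.array(preds), queries
```

In the drifting setting one would additionally discount or reset `w` so that old queried examples lose influence; the sketch above only illustrates the query-then-update round structure.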
Variance Estimation For Dynamic Regression via Spectrum Thresholding
We consider the dynamic linear regression problem, where the predictor vector
may vary with time. This problem can be modeled as a linear dynamical system,
where the parameters that need to be learned are the variance of both the
process noise and the observation noise. While variance estimation for dynamic
regression is a natural problem, with a variety of applications, existing
approaches to this problem either lack guarantees or only have asymptotic
guarantees without explicit rates. In addition, all existing approaches rely
strongly on Gaussianity of the noise. In this paper we study the global system
operator: the operator that maps the noise vectors to the output. In
particular, we obtain estimates on its spectrum, and as a result derive the
first known variance estimators with finite sample complexity guarantees.
Moreover, our results hold for arbitrary sub-Gaussian noise distributions. We
evaluate the approach on synthetic and real-world benchmarks.
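The dynamic regression model in question can be written as a linear dynamical system in which the regression vector follows a random walk; a minimal simulation sketch (the function name and zero initialization are illustrative assumptions), whose two unknown variances q2 and r2 are exactly the quantities the variance estimators above target:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_dynamic_regression(n, d, q2, r2):
    """Simulate dynamic linear regression
        y_t = x_t . w_t + e_t,    w_{t+1} = w_t + u_t,
    where u_t ~ N(0, q2 * I) is the process noise driving the drift of
    the regression vector and e_t ~ N(0, r2) is the observation noise.
    (Gaussian noise is used here only for concreteness; the guarantees
    described above cover arbitrary sub-Gaussian noise.)
    """
    w = np.zeros(d)
    X = rng.normal(size=(n, d))          # predictor vectors
    ys = np.empty(n)
    for t in range(n):
        ys[t] = X[t] @ w + rng.normal(scale=np.sqrt(r2))
        w = w + rng.normal(scale=np.sqrt(q2), size=d)  # random-walk drift
    return X, ys
```

With q2 = 0 the model collapses to ordinary static regression, which gives a quick sanity check: the outputs are then pure observation noise of variance r2.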
Continual Learning in Linear Classification on Separable Data
We analyze continual learning on a sequence of separable linear
classification tasks with binary labels. We show theoretically that learning
with weak regularization reduces to solving a sequential max-margin problem,
corresponding to a special case of the Projection Onto Convex Sets (POCS)
framework. We then develop upper bounds on the forgetting and other quantities
of interest under various settings with recurring tasks, including cyclic and
random orderings of tasks. We discuss several practical implications to popular
training practices like regularization scheduling and weighting. We point out
several theoretical differences between our continual classification setting
and a recently studied continual regression setting
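The reduction above, from weakly regularized continual learning to a sequential max-margin problem in the POCS framework, can be sketched by moving the current weights into each task's margin-feasible set in turn. The helper names are illustrative, and cyclic half-space projections are used as a simple stand-in for the exact projection:

```python
import numpy as np

def project_halfspace(w, x, y, margin=1.0):
    """Euclidean projection of w onto the half-space {v : y * (v . x) >= margin}."""
    slack = margin - y * (w @ x)
    if slack <= 0:
        return w                          # constraint already satisfied
    return w + (slack / (x @ x)) * y * x  # closed-form projection

def sequential_max_margin(tasks, d, sweeps=200):
    """Process tasks one after another, moving the current weights into
    each task's feasible set {v : y_i * (v . x_i) >= 1} by cycling
    half-space projections (plain POCS; Dykstra's algorithm would give
    the exact Euclidean projection onto the intersection)."""
    w = np.zeros(d)
    history = []                          # weights after each task
    for X, ys in tasks:                   # tasks arrive sequentially
        for _ in range(sweeps):
            for x, y in zip(X, ys):
                w = project_halfspace(w, x, y)
        history.append(w.copy())
    return w, history
```

Under recurring (e.g. cyclic) task orderings, the drift of successive entries of `history`, and the margin violations of earlier tasks at the final weights, are simple proxies for the forgetting quantities the bounds concern.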