Cooperative Online Learning: Keeping your Neighbors Updated
We study an asynchronous online learning setting with a network of agents. At
each time step, some of the agents are activated, requested to make a
prediction, and pay the corresponding loss. The loss function is then revealed
to these agents and also to their neighbors in the network. Our results
characterize how much knowing the network structure affects the regret as a
function of the model of agent activations. When activations are stochastic,
the optimal regret (up to constant factors) is shown to be of order
$\sqrt{\alpha T}$, where $T$ is the horizon and $\alpha$ is the independence
number of the network. We prove that the upper bound is achieved even when
agents have no information about the network structure. When activations are
adversarial the situation changes dramatically: if agents ignore the network
structure, an $\Omega(T)$ lower bound on the regret can be proven, showing that
learning is impossible. However, when agents can choose to ignore some of their
neighbors based on the knowledge of the network structure, we prove a
$O(\sqrt{\overline{\chi} T})$ sublinear regret bound, where $\overline{\chi}$ is
the clique-covering number of the network.
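The abstract above states its bounds in terms of two graph quantities, the independence number and the clique-covering number. As a minimal sketch (not taken from the paper), the brute-force Python below computes both for a small graph; the function names and the 5-cycle example are assumptions chosen for illustration, and the exhaustive search is only practical for a handful of nodes.

```python
from itertools import combinations, product


def independence_number(n, edges):
    """Size of the largest set of pairwise non-adjacent nodes (brute force)."""
    adj = {frozenset(e) for e in edges}
    for k in range(n, 0, -1):
        for subset in combinations(range(n), k):
            if all(frozenset(p) not in adj for p in combinations(subset, 2)):
                return k
    return 0


def clique_covering_number(n, edges):
    """Smallest number of cliques that partition the nodes (brute force)."""
    adj = {frozenset(e) for e in edges}
    for k in range(1, n + 1):
        for labels in product(range(k), repeat=n):
            # Nodes that share a label must be pairwise adjacent (a clique).
            if all(frozenset((u, v)) in adj
                   for u, v in combinations(range(n), 2)
                   if labels[u] == labels[v]):
                return k
    return 0  # only reached for the empty graph (n == 0)


# Hypothetical example network: a cycle on 5 agents.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(independence_number(5, cycle_edges), clique_covering_number(5, cycle_edges))
```

On the 5-cycle the two quantities differ, which illustrates that a bound stated in terms of the clique-covering number can be looser than one stated in terms of the independence number.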
Distributed Online Learning via Cooperative Contextual Bandits
In this paper we propose a novel framework for decentralized, online learning
by many learners. At each moment of time, an instance characterized by a
certain context may arrive at each learner; based on the context, the learner
can select one of its own actions (which gives a reward and provides
information) or request assistance from another learner. In the latter case,
the requester pays a cost and receives the reward but the provider learns the
information. In our framework, learners are modeled as cooperative contextual
bandits. Each learner seeks to maximize the expected reward from its arrivals,
which involves trading off the reward received from its own actions, the
information learned from its own actions, the reward received from the actions
requested of others, and the cost paid for these actions, while taking into account
what it has learned about the value of assistance from each other learner. We
develop distributed online learning algorithms and provide analytic bounds to
compare their efficiency with the complete-knowledge (oracle) benchmark (in
which the expected reward of every action in every context is known by every
learner). Our estimates show that the regret, i.e., the loss incurred by the
algorithm relative to this benchmark, is sublinear in time. Our theoretical
framework can
be used in many practical applications including Big Data mining, event
detection in surveillance sensor networks and distributed online recommendation
systems.
Random Feature-based Online Multi-kernel Learning in Environments with Unknown Dynamics
Kernel-based methods exhibit well-documented performance in various nonlinear
learning tasks. Most of them rely on a preselected kernel, whose prudent choice
presumes task-specific prior information. Especially when the latter is not
available, multi-kernel learning has gained popularity thanks to its
flexibility in choosing kernels from a prescribed kernel dictionary. Leveraging
the random feature approximation and its recent orthogonality-promoting
variant, the present contribution develops a scalable multi-kernel learning
scheme (termed Raker) to obtain the sought nonlinear learning function 'on the
fly', first for static environments. To further boost performance in dynamic
environments, an adaptive multi-kernel learning scheme (termed AdaRaker) is
developed. AdaRaker accounts not only for data-driven learning of kernel
combination, but also for the unknown dynamics. Performance is analyzed in
terms of both static and dynamic regrets. AdaRaker is uniquely capable of
tracking nonlinear learning functions in environments with unknown dynamics,
and with analytic performance guarantees. Tests with synthetic and real
datasets are carried out to showcase the effectiveness of the novel algorithms.
Comment: 36 pages
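The combination of random features and data-driven kernel weighting described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' Raker or AdaRaker algorithm: the names `make_rf_map` and `OnlineMultiKernel`, the Gaussian kernel dictionary, the squared loss, and the step sizes are all hypothetical choices. Each kernel gets a random Fourier feature map and its own linear predictor trained by online gradient descent, and the kernels are combined with multiplicative weights.

```python
import math
import random


def make_rf_map(dim, num_features, bandwidth, rng):
    """Random Fourier feature map approximating a Gaussian kernel."""
    # Frequencies are drawn from N(0, 1/bandwidth^2), the spectral density
    # of the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)).
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
             for _ in range(num_features)]
    scale = 1.0 / math.sqrt(num_features)

    def phi(x):
        feats = []
        for v in freqs:
            proj = sum(vi * xi for vi, xi in zip(v, x))
            feats.extend((scale * math.sin(proj), scale * math.cos(proj)))
        return feats

    return phi


class OnlineMultiKernel:
    """Hypothetical online learner combining several random-feature kernels."""

    def __init__(self, dim, bandwidths, num_features=50, step=0.25, eta=1.0, seed=0):
        rng = random.Random(seed)
        self.maps = [make_rf_map(dim, num_features, b, rng) for b in bandwidths]
        self.thetas = [[0.0] * (2 * num_features) for _ in bandwidths]
        self.weights = [1.0 / len(bandwidths)] * len(bandwidths)
        self.step, self.eta = step, eta

    def _per_kernel(self, x):
        feats = [phi(x) for phi in self.maps]
        preds = [sum(t * z for t, z in zip(theta, f))
                 for theta, f in zip(self.thetas, feats)]
        return feats, preds

    def predict(self, x):
        # Weighted combination of the per-kernel predictions.
        _, preds = self._per_kernel(x)
        return sum(w * p for w, p in zip(self.weights, preds))

    def update(self, x, y):
        feats, preds = self._per_kernel(x)
        losses = [(p - y) ** 2 for p in preds]
        best = min(losses)  # subtracted for numerical stability only
        for k, (f, p) in enumerate(zip(feats, preds)):
            # Multiplicative update: kernels with larger loss lose weight.
            self.weights[k] *= math.exp(-self.eta * (losses[k] - best))
            # Gradient step on the squared loss of this kernel's predictor.
            grad_scale = 2.0 * (p - y)
            self.thetas[k] = [t - self.step * grad_scale * z
                              for t, z in zip(self.thetas[k], f)]
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]


# Usage on a hypothetical 1-D regression stream.
learner = OnlineMultiKernel(dim=1, bandwidths=[0.1, 1.0, 10.0])
data_rng = random.Random(1)
for _ in range(300):
    x = [data_rng.uniform(-1.0, 1.0)]
    learner.update(x, math.sin(6.0 * x[0]))
print([round(w, 3) for w in learner.weights])
```

Because each feature vector has unit norm by construction, the step size of 0.25 halves a kernel's error on the current sample at every update; this choice is for stability of the sketch and does not track changing dynamics the way the adaptive scheme in the abstract is meant to.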
Active Learning with Expert Advice
Conventional methods for learning with expert advice assume that a learner
always receives the outcome (e.g., class labels) of every incoming training
instance at the end of each trial. In real applications, acquiring the outcome
from an oracle can be costly or time consuming. In this paper, we address a new
problem of active learning with expert advice, where the outcome of an instance
is disclosed only when it is requested by the online learner. Our goal is to
learn an accurate prediction model while asking the oracle as few questions as
possible. To address this challenge, we propose a framework of active
forecasters for online active learning with expert advice, which attempts to
extend two regular forecasters, i.e., Exponentially Weighted Average Forecaster
and Greedy Forecaster, to tackle the task of active learning with expert
advice. We prove that the proposed algorithms are Hannan consistent under
suitable assumptions, and validate the efficacy of our technique with an
extensive set of experiments.
Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty
in Artificial Intelligence (UAI2013)
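To make the idea of an active exponentially weighted average forecaster concrete, here is a minimal sketch. The query rule used here (ask the oracle only when the experts' advice differs by more than a threshold) is an assumption for illustration and may not match the rule in the paper; the class name, the squared loss, and the parameters `eta` and `threshold` are likewise hypothetical.

```python
import math


class ActiveEWAForecaster:
    """Exponentially weighted average forecaster with on-demand labels."""

    def __init__(self, num_experts, eta=2.0, threshold=0.2):
        self.log_weights = [0.0] * num_experts  # log-domain for stability
        self.eta = eta
        self.threshold = threshold  # assumed disagreement threshold
        self.queries = 0

    def predict(self, advice):
        # Weighted average of expert advice under the current weights.
        top = max(self.log_weights)
        w = [math.exp(lw - top) for lw in self.log_weights]
        return sum(wi * a for wi, a in zip(w, advice)) / sum(w)

    def should_query(self, advice):
        # Assumed rule: a label is only worth its cost if experts disagree.
        return max(advice) - min(advice) > self.threshold

    def step(self, advice, oracle):
        """Predict, then query the oracle and update weights if needed."""
        prediction = self.predict(advice)
        if self.should_query(advice):
            outcome = oracle()
            self.queries += 1
            for i, a in enumerate(advice):
                # Exponential-weights update with squared loss.
                self.log_weights[i] -= self.eta * (a - outcome) ** 2
        return prediction


# Usage: expert 0 is always right, expert 1 is always wrong.
forecaster = ActiveEWAForecaster(num_experts=2)
for _ in range(10):
    forecaster.step([1.0, 0.0], oracle=lambda: 1.0)
forecaster.step([0.5, 0.5], oracle=lambda: 1.0)  # experts agree: no query
print(forecaster.queries, forecaster.predict([1.0, 0.0]))
```

Passing the oracle as a callable keeps the label hidden unless the forecaster decides to pay for it, which mirrors the setting in which outcomes are disclosed only on request.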