Dynamic Metric Learning from Pairwise Comparisons
Recent work in distance metric learning has focused on learning
transformations of data that best align with specified pairwise similarity and
dissimilarity constraints, often supplied by a human observer. The learned
transformations lead to improved retrieval, classification, and clustering
algorithms due to the better adapted distance or similarity measures. Here, we
address the problem of learning these transformations when the underlying
constraint generation process is nonstationary. This nonstationarity can be due
to changes in either the ground-truth clustering used to generate constraints
or changes in the feature subspaces in which the class structure is apparent.
We propose Online Convex Ensemble StrongLy Adaptive Dynamic Learning (OCELAD),
a general adaptive, online approach for learning and tracking optimal metrics
as they change over time that is highly robust to a variety of nonstationary
behaviors in the changing metric. We apply the OCELAD framework to an ensemble
of online learners. Specifically, we create a retro-initialized composite
objective mirror descent (COMID) ensemble (RICE) consisting of a set of
parallel COMID learners with different learning rates. We demonstrate
RICE-OCELAD on both real and synthetic data sets and show significant
performance improvements relative to previously proposed batch and online
distance metric
learning algorithms.
Comment: to appear Allerton 2016. arXiv admin note: substantial text overlap
with arXiv:1603.0367
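The idea of running parallel learners with different learning rates and combining them can be illustrated with a much simpler problem than metric learning. The sketch below is an assumption-laden toy, not OCELAD, RICE, or COMID: it tracks a drifting scalar target under squared loss with plain gradient steps, and combines the learners with exponential weights. The names `run_ensemble`, `learning_rates`, and `meta_rate` are hypothetical.

```python
import math


def run_ensemble(targets, learning_rates, meta_rate=0.5):
    """Track a drifting scalar with parallel gradient learners.

    Toy illustration only: each learner runs online gradient descent on the
    squared loss with its own step size, and the ensemble prediction is an
    exponentially weighted average of the learners' current estimates.
    """
    estimates = [0.0] * len(learning_rates)
    log_weights = [0.0] * len(learning_rates)
    per_learner_loss = [0.0] * len(learning_rates)
    ensemble_loss = 0.0

    for y in targets:
        # Normalize the weights in a numerically stable way.
        top = max(log_weights)
        weights = [math.exp(lw - top) for lw in log_weights]
        total = sum(weights)
        prediction = sum(w * e for w, e in zip(weights, estimates)) / total
        ensemble_loss += (prediction - y) ** 2

        for i, eta in enumerate(learning_rates):
            loss = (estimates[i] - y) ** 2
            per_learner_loss[i] += loss
            # Learners with a larger loss lose weight in the ensemble.
            log_weights[i] -= meta_rate * loss
            # Gradient step on (estimate - y)^2 with this learner's rate.
            estimates[i] -= eta * 2.0 * (estimates[i] - y)

    return ensemble_loss, per_learner_loss


# The target jumps from 0 to 1 halfway through (a nonstationary signal).
targets = [0.0] * 50 + [1.0] * 50
ensemble_loss, per_learner_loss = run_ensemble(targets, [0.01, 0.1, 0.4])
```

In this toy setting no single learning rate has to be chosen in advance: the weighting shifts toward whichever learner has accumulated the smallest loss so far.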
Adaptive Regret Minimization in Bounded-Memory Games
Online learning algorithms that minimize regret provide strong guarantees in
situations that involve repeatedly making decisions in an uncertain
environment, e.g., a driver deciding what route to drive to work every day.
While regret minimization has been extensively studied in repeated games, we
study regret minimization for a richer class of games called bounded memory
games. In each round of a two-player bounded memory-m game, both players
simultaneously play an action, observe an outcome and receive a reward. The
reward may depend on the last m outcomes as well as the actions of the players
in the current round. The standard notion of regret for repeated games is no
longer suitable because actions and rewards can depend on the history of play.
To account for this generality, we introduce the notion of k-adaptive regret,
which compares the reward obtained by playing actions prescribed by the
algorithm against a hypothetical k-adaptive adversary with the reward obtained
by the best expert in hindsight against the same adversary. Roughly, a
hypothetical k-adaptive adversary adapts her strategy to the defender's actions
exactly as the real adversary would within each window of k rounds. Our
definition is parametrized by a set of experts, which can include both fixed
and adaptive defender strategies.
We investigate the inherent complexity of and design algorithms for adaptive
regret minimization in bounded memory games of perfect and imperfect
information. We prove a hardness result showing that, with imperfect
information, any k-adaptive regret minimizing algorithm (with fixed strategies
as experts) must be inefficient unless NP=RP even when playing against an
oblivious adversary. In contrast, for bounded memory games of perfect and
imperfect information we present approximate 0-adaptive regret minimization
algorithms against an oblivious adversary running in time n^{O(1)}.
Comment: Full Version. GameSec 2013 (Invited Paper)
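For orientation, the standard notion of regret for repeated games that the abstract above contrasts with can be computed directly when the adversary is oblivious, i.e., when the reward table is fixed in advance. The sketch below shows only that standard notion, with the best fixed action in hindsight as the comparator; it is not the k-adaptive regret defined in the paper. The function name `external_regret` and the reward-table layout are assumptions made for illustration.

```python
def external_regret(rewards, actions):
    """Regret against the best fixed action in hindsight.

    rewards[t][a] is the reward of action a in round t (fixed in advance,
    as with an oblivious adversary); actions[t] is the action actually played.
    """
    if len(rewards) != len(actions):
        raise ValueError("need exactly one action per round")
    if not rewards:
        return 0

    num_actions = len(rewards[0])
    obtained = sum(row[a] for row, a in zip(rewards, actions))
    best_fixed = max(
        sum(row[a] for row in rewards) for a in range(num_actions)
    )
    return best_fixed - obtained


# Two routes to work over three days; reward 1 means the route was fast.
route_rewards = [[1, 0], [0, 1], [1, 0]]
```

With this table, playing routes `[1, 0, 1]` earns nothing while always taking route 0 would have earned 2, so the regret is 2; playing `[1, 1, 0]` earns 2 and has regret 0.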
Distributed Computing with Adaptive Heuristics
We use ideas from distributed computing to study dynamic environments in
which computational nodes, or decision makers, follow adaptive heuristics (Hart
2005), i.e., simple and unsophisticated rules of behavior, e.g., repeatedly
"best replying" to others' actions, and minimizing "regret", that have been
extensively studied in game theory and economics. We explore when convergence
of such simple dynamics to an equilibrium is guaranteed in asynchronous
computational environments, where nodes can act at any time. Our research
agenda, distributed computing with adaptive heuristics, lies on the borderline
of computer science (including distributed computing and learning) and game
theory (including game dynamics and adaptive heuristics). We exhibit a general
non-termination result for a broad class of heuristics with bounded
recall---that is, simple rules of behavior that depend only on recent history
of interaction between nodes. We consider implications of our result across a
wide variety of interesting and timely applications: game theory, circuit
design, social networks, routing and congestion control. We also study the
computational and communication complexity of asynchronous dynamics and present
some basic observations regarding the effects of asynchrony on no-regret
dynamics. We believe that our work opens a new avenue for research in both
distributed computing and game theory.
Comment: 36 pages, four figures. Expands both technical results and discussion
of v1. Revised version will appear in the proceedings of Innovations in
Computer Science 201
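A small toy example can show why the timing of moves matters for best-reply dynamics. The sketch below is not the non-termination result of the abstract above; it assumes a two-player coordination game in which each player's best reply is to copy the other player's last action, and compares two hypothetical schedules. The names `simulate`, `simultaneous`, and `alternating` are illustrative.

```python
def best_reply(other_action):
    """Coordination game: matching the other player is the best reply."""
    return other_action


def simulate(start, schedule, rounds):
    """Run best-reply dynamics; schedule(t) gives the players moving at step t."""
    state = list(start)
    history = [tuple(state)]
    for t in range(rounds):
        snapshot = tuple(state)  # movers react to the previous state only
        for player in schedule(t):
            state[player] = best_reply(snapshot[1 - player])
        history.append(tuple(state))
    return history


def simultaneous(t):
    """Both players update at every step."""
    return (0, 1)


def alternating(t):
    """Players take turns: player 0 on even steps, player 1 on odd steps."""
    return (t % 2,)


oscillating = simulate((0, 1), simultaneous, 4)
settled = simulate((0, 1), alternating, 4)
```

Starting from the mismatched state `(0, 1)`, simultaneous updates swap the two actions forever, while alternating updates reach the stable state `(1, 1)` after one move.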