Adaptive Regret Minimization in Bounded-Memory Games
Online learning algorithms that minimize regret provide strong guarantees in
situations that involve repeatedly making decisions in an uncertain
environment, e.g. a driver deciding what route to drive to work every day.
While regret minimization has been extensively studied in repeated games, we
study regret minimization for a richer class of games called bounded memory
games. In each round of a two-player bounded memory-m game, both players
simultaneously play an action, observe an outcome and receive a reward. The
reward may depend on the last m outcomes as well as the actions of the players
in the current round. The standard notion of regret for repeated games is no
longer suitable because actions and rewards can depend on the history of play.
To account for this generality, we introduce the notion of k-adaptive regret,
which compares the reward obtained by playing actions prescribed by the
algorithm against a hypothetical k-adaptive adversary with the reward obtained
by the best expert in hindsight against the same adversary. Roughly, a
hypothetical k-adaptive adversary adapts her strategy to the defender's actions
exactly as the real adversary would within each window of k rounds. Our
definition is parametrized by a set of experts, which can include both fixed
and adaptive defender strategies.
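The baseline being generalized here — regret against the best fixed expert in hindsight — can be illustrated in the simpler repeated-game setting with a standard multiplicative-weights (Hedge) sketch. Everything below is illustrative (the function names, the learning rate, and the reward function are not from the paper):

```python
import math
import random

def hedge(reward_fn, n_experts, horizon, eta=0.1):
    """Multiplicative-weights (Hedge) in a repeated game.

    reward_fn(t, expert) -> reward in [0, 1] for playing `expert` at round t.
    Returns (algorithm reward, best expert's hindsight reward, regret).
    """
    weights = [1.0] * n_experts
    alg_reward = 0.0
    expert_reward = [0.0] * n_experts
    for t in range(horizon):
        total = sum(weights)
        probs = [w / total for w in weights]
        choice = random.choices(range(n_experts), weights=probs)[0]
        for e in range(n_experts):
            r = reward_fn(t, e)
            expert_reward[e] += r
            weights[e] *= math.exp(eta * r)  # full-information update
        alg_reward += reward_fn(t, choice)
    regret = max(expert_reward) - alg_reward
    return alg_reward, max(expert_reward), regret
```

In a bounded memory-m game, each expert's reward would additionally depend on the last m outcomes of the history it induced, so "the best expert in hindsight" is no longer well defined against a fixed reward sequence — which is the gap the k-adaptive regret definition above is meant to close.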
We investigate the inherent complexity of and design algorithms for adaptive
regret minimization in bounded memory games of perfect and imperfect
information. We prove a hardness result showing that, with imperfect
information, any k-adaptive regret minimizing algorithm (with fixed strategies
as experts) must be inefficient unless NP=RP even when playing against an
oblivious adversary. In contrast, for bounded memory games of perfect and
imperfect information we present approximate 0-adaptive regret minimization
algorithms against an oblivious adversary running in time n^{O(1)}.
Comment: Full Version. GameSec 2013 (Invited Paper).
Chasing Ghosts: Competing with Stateful Policies
We consider sequential decision making in a setting where regret is measured
with respect to a set of stateful reference policies, and feedback is limited
to observing the rewards of the actions performed (the so-called "bandit"
setting). If either the reference policies are stateless rather than stateful,
or the feedback includes the rewards of all actions (the so-called "expert"
setting), previous work shows that the optimal regret grows like √T
in terms of the number of decision rounds T.
The difficulty in our setting is that the decision maker unavoidably loses
track of the internal states of the reference policies, and thus cannot
reliably attribute rewards observed in a certain round to any of the reference
policies. In fact, in this setting it is impossible for the algorithm to
estimate which policy gives the highest (or even approximately highest) total
reward. Nevertheless, we design an algorithm that achieves expected regret that
is sublinear in T. Our algorithm is based
on a certain local repetition lemma that may be of independent interest. We
also show that no algorithm can guarantee expected regret better than
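The attribution difficulty described above can be seen in a toy simulation. Everything here (the `CyclePolicy` class and the reward table) is illustrative, not the paper's construction: the same stateful policy, started from two different internal states, earns very different totals on the same reward sequence, so a learner that has lost track of the state cannot attribute observed rewards to the policy.

```python
class CyclePolicy:
    """A stateful reference policy: it cycles through a fixed action
    sequence, and its internal pointer advances only on rounds in
    which the policy is actually executed."""
    def __init__(self, actions, start=0):
        self.actions = actions
        self.i = start

    def act(self):
        a = self.actions[self.i]
        self.i = (self.i + 1) % len(self.actions)
        return a

def total_reward(policy, rewards):
    """Run `policy` on every round; under bandit feedback only
    rewards[t][action_played] would ever be observed."""
    return sum(rewards[t][policy.act()] for t in range(len(rewards)))

# The paying action alternates between rounds.
rewards = [[1, 0], [0, 1], [1, 0], [0, 1]]

in_phase = CyclePolicy([0, 1], start=0)   # state in phase with the rewards
off_phase = CyclePolicy([0, 1], start=1)  # same policy, state off by one

r_in = total_reward(in_phase, rewards)    # plays 0,1,0,1 -> reward 4
r_off = total_reward(off_phase, rewards)  # plays 1,0,1,0 -> reward 0
```

Once the learner stops executing a policy for even one round, its mental copy of that policy's state becomes, roughly, this kind of off-by-one "ghost" — the phenomenon the title alludes to.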
Combining Multiple Strategies for Multiarmed Bandit Problems and Asymptotic Optimality
This brief paper provides a simple algorithm that, at each time step, selects a strategy from a given set of multiple strategies for stochastic multiarmed bandit problems, and then plays the arm chosen by the selected strategy. The algorithm follows the idea of the probabilistic ϵ_t-switching in the ϵ_t-greedy strategy and is asymptotically optimal in the sense that the selected strategy converges to the best strategy in the set, under some conditions on the strategies in the set and on the sequence {ϵ_t}.
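The switching idea can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the decaying schedule `eps_t`, the uniform exploration, and the empirical-mean exploitation rule are all assumptions made for the sketch.

```python
import random

def eps_t(t):
    # Illustrative decaying exploration schedule; the paper imposes
    # more specific conditions on the sequence {eps_t}.
    return min(1.0, 10.0 / (t + 1))

def select_strategy(t, n_strategies, avg_reward, rng):
    """Probabilistic eps_t-switching: with probability eps_t pick a
    strategy uniformly at random (explore); otherwise pick the strategy
    with the highest empirical average reward so far (exploit)."""
    if rng.random() < eps_t(t):
        return rng.randrange(n_strategies)
    return max(range(n_strategies), key=lambda i: avg_reward[i])

def run(strategies, pull, horizon, seed=0):
    """strategies: callables mapping the play history to an arm index.
    pull(arm) -> reward in [0, 1].  Returns per-strategy usage counts
    and empirical average rewards."""
    rng = random.Random(seed)
    counts = [0] * len(strategies)
    avg_reward = [0.0] * len(strategies)
    history = []
    for t in range(horizon):
        s = select_strategy(t, len(strategies), avg_reward, rng)
        arm = strategies[s](history)   # play the arm the chosen strategy picks
        r = pull(arm)
        history.append((arm, r))
        counts[s] += 1
        avg_reward[s] += (r - avg_reward[s]) / counts[s]  # running mean
    return counts, avg_reward
```

With a decaying ϵ_t, exploration of the weaker strategies vanishes over time, so the selection concentrates on the empirically best strategy in the set — the convergence the abstract refers to, under its stated conditions.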
Location Privacy Protection in the Mobile Era and Beyond
As interconnected devices become embedded in every aspect of our lives, they bring
many privacy risks. Location privacy is one notable case: consistently recording an individual’s
location might lead to his/her tracking, fingerprinting and profiling. An individual’s
location privacy can be compromised when tracked by smartphone apps, in indoor spaces,
and/or through Internet of Things (IoT) devices. Recent surveys have indicated that users
genuinely value their location privacy and would like to exercise control over who collects
and processes their location data. They, however, lack effective and practical tools to
protect their location privacy. An effective location privacy protection mechanism requires
a real understanding of the underlying threats, and a practical one requires as few changes to
the existing ecosystems as possible while ensuring psychological acceptability to the users.
This thesis addresses this problem by proposing a suite of effective and practical privacy
preserving mechanisms that address different aspects of real-world location privacy threats.
First, we present LP-Guardian, a comprehensive framework for location privacy protection
for Android smartphone users. LP-Guardian overcomes the shortcomings of existing
approaches by addressing the tracking, profiling, and fingerprinting threats posed by
different mobile apps while maintaining their functionality. LP-Guardian requires modifying
the underlying platform of the mobile operating system, but no changes to either the apps
or the service providers. We then propose LP-Doctor, a lightweight user-level tool that allows
Android users to effectively utilize the OS’s location access controls. As opposed to
LP-Guardian, LP-Doctor requires no platform changes. It builds on a two-year data collection
campaign in which we analyzed the location privacy threats posed by 1160 apps for
100 users. For the case of indoor location tracking, we present PR-LBS (Privacy vs. Reward
for Location-Based Service), a system that balances the users’ privacy concerns and
the benefits of sharing location data in indoor location tracking environments. PR-LBS
fits within the existing indoor localization ecosystem whether it is infrastructure-based
or device-based. Finally, we target the privacy threats originating from the IoT devices
that employ the emerging Bluetooth Low Energy (BLE) protocol through BLE-Guardian.
BLE-Guardian is a device agnostic system that prevents user tracking and profiling while
securing access to his/her BLE-powered devices. We evaluate BLE-Guardian in real-world
scenarios and demonstrate its effectiveness in protecting the user along with its low overhead
on the user’s devices.
PHD
Computer Science & Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/138563/1/kmfawaz_1.pd