Deterministic Graph Exploration with Advice
We consider the task of graph exploration. An $n$-node graph has unlabeled
nodes, and all ports at any node of degree $d$ are arbitrarily numbered
$0, \dots, d-1$. A mobile agent has to visit all nodes and stop. The exploration
time is the number of edge traversals. We consider the problem of how much
knowledge the agent has to have a priori, in order to explore the graph in a
given time, using a deterministic algorithm. This a priori information (advice)
is provided to the agent by an oracle, in the form of a binary string, whose
length is called the size of advice. We consider two types of oracles. The
instance oracle knows the entire instance of the exploration problem, i.e., the
port-numbered map of the graph and the starting node of the agent in this map.
The map oracle knows the port-numbered map of the graph but does not know the
starting node of the agent.
We first consider exploration in polynomial time, and determine the exact
minimum size of advice to achieve it. This size is $\log\log\log n - \Theta(1)$,
for both types of oracles.
When advice is large, there are two natural time thresholds: $\Theta(n^2)$
for a map oracle, and $\Theta(n)$ for an instance oracle, that can be achieved
with sufficiently large advice. We show that, with a map oracle, time
$\Theta(n^2)$ cannot be improved in general, regardless of the size of advice.
We also show that the smallest size of advice to achieve this time is larger
than $n^\delta$, for any $\delta < 1/3$.
For an instance oracle, advice of size $O(n \log n)$ is enough to achieve time
$O(n)$. We show that, with any advice of size $o(n \log n)$, the time of
exploration must be at least $n^\epsilon$, for any $\epsilon < 2$, and with any
advice of size $O(n)$, the time must be $\Omega(n^2)$.
We also investigate minimum advice sufficient for fast exploration of
Hamiltonian graphs.
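The model in this abstract (ports numbered locally at each node, time counted in edge traversals) can be made concrete with a small sketch. The code below is not the paper's algorithm: it assumes the agent is handed the full port-numbered map and its starting node, and simply follows a depth-first route. The names `PortGraph`, `dfs_route`, `walk`, and `star` are hypothetical, chosen for illustration only.

```python
from typing import Dict, List, Tuple

# graph[v][p] == (u, q): leaving node v by port p reaches node u,
# entering it through u's port q. Node labels exist only in the map;
# the agent itself sees nothing but port numbers.
PortGraph = Dict[int, List[Tuple[int, int]]]


def dfs_route(graph: PortGraph, start: int) -> List[int]:
    """Return the ports to take, in order, so that a walk from `start`
    visits every node. Assumes a connected graph small enough for recursion."""
    visited = {start}
    route: List[int] = []
    last_forward = 0  # length of the route right after the last new node

    def visit(v: int) -> None:
        nonlocal last_forward
        for p, (u, q) in enumerate(graph[v]):
            if u not in visited:
                visited.add(u)
                route.append(p)           # forward move through port p of v
                last_forward = len(route)
                visit(u)
                route.append(q)           # backtrack through port q of u

    visit(start)
    return route[:last_forward]           # stop once the last node is reached


def walk(graph: PortGraph, start: int, route: List[int]) -> List[int]:
    """Replay a route and return the sequence of nodes it passes through."""
    v = start
    seen = [v]
    for p in route:
        v, _ = graph[v][p]
        seen.append(v)
    return seen


# A star with center 0 and leaves 1, 2, 3.
star: PortGraph = {
    0: [(1, 0), (2, 0), (3, 0)],
    1: [(0, 0)],
    2: [(0, 1)],
    3: [(0, 2)],
}
route = dfs_route(star, 0)
print(len(route), walk(star, 0, route))
```

Each tree edge of the depth-first search is traversed at most twice, so the route has at most 2(n-1) edge traversals on an n-node connected graph, which is linear in n.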
Online Learning with Feedback Graphs: Beyond Bandits
We study a general class of online learning problems where the feedback is
specified by a graph. This class includes online prediction with expert advice
and the multi-armed bandit problem, but also several learning problems where
the online player does not necessarily observe his own loss. We analyze how the
structure of the feedback graph controls the inherent difficulty of the induced
$T$-round learning problem. Specifically, we show that any feedback graph
belongs to one of three classes: strongly observable graphs, weakly observable
graphs, and unobservable graphs. We prove that the first class induces learning
problems with $\widetilde\Theta(\alpha^{1/2} T^{1/2})$ minimax regret, where $\alpha$
is the independence number of the underlying graph; the second class
induces problems with $\widetilde\Theta(\delta^{1/3} T^{2/3})$ minimax regret,
where $\delta$ is the domination number of a certain portion of the graph; and
the third class induces problems with linear minimax regret. Our results
subsume much of the previous work on learning with feedback graphs and reveal
new connections to partial monitoring games. We also show how the regret is
affected if the graphs are allowed to vary with time.
Topology recognition with advice
In topology recognition, each node of an anonymous network has to
deterministically produce an isomorphic copy of the underlying graph, with all
ports correctly marked. This task is usually unfeasible without any a priori
information. Such information can be provided to nodes as advice. An oracle
knowing the network can give a (possibly different) string of bits to each
node, and all nodes must reconstruct the network using this advice, after a
given number of rounds of communication. During each round each node can
exchange arbitrary messages with all its neighbors and perform arbitrary local
computations. The time of completing topology recognition is the number of
rounds it takes, and the size of advice is the maximum length of a string given
to nodes.
We investigate tradeoffs between the time in which topology recognition is
accomplished and the minimum size of advice that has to be given to nodes. We
provide upper and lower bounds on the minimum size of advice that is sufficient
to perform topology recognition in a given time, in the class of all graphs of
size $n$ and diameter $D \le \alpha n$, for any constant $\alpha < 1$. In most
cases, our bounds are asymptotically tight.
From Bandits to Experts: On the Value of Side-Observations
We consider an adversarial online learning setting where a decision maker can
choose an action in every stage of the game. In addition to observing the
reward of the chosen action, the decision maker gets side observations on the
reward he would have obtained had he chosen some of the other actions. The
observation structure is encoded as a graph, where node i is linked to node j
if sampling i provides information on the reward of j. This setting naturally
interpolates between the well-known "experts" setting, where the decision maker
can view all rewards, and the multi-armed bandits setting, where the decision
maker can only view the reward of the chosen action. We develop practical
algorithms with provable regret guarantees, which depend on non-trivial
graph-theoretic properties of the information feedback structure. We also
provide partially-matching lower bounds.
Comment: Presented at the NIPS 2011 conference
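The observation structure described in this abstract can be illustrated with a minimal sketch: a graph maps each action to the set of actions whose rewards it reveals. The "experts" and bandit settings are then the two extreme graphs. The function names below (`observed_rewards`, `experts_graph`, `bandit_graph`) are hypothetical, and the sketch shows only the feedback model, not the paper's algorithms.

```python
from typing import Dict, List, Set

# feedback[i] is the set of actions j whose reward is revealed when i is played.
FeedbackGraph = Dict[int, Set[int]]


def observed_rewards(feedback: FeedbackGraph, action: int,
                     rewards: List[float]) -> Dict[int, float]:
    """Rewards the decision maker gets to see after playing `action`."""
    return {j: rewards[j] for j in sorted(feedback[action])}


def experts_graph(k: int) -> FeedbackGraph:
    """Every action reveals all k rewards (full information)."""
    return {i: set(range(k)) for i in range(k)}


def bandit_graph(k: int) -> FeedbackGraph:
    """Every action reveals only its own reward."""
    return {i: {i} for i in range(k)}


rewards = [0.1, 0.5, 0.9]
full = observed_rewards(experts_graph(3), 1, rewards)
own = observed_rewards(bandit_graph(3), 1, rewards)
print(full, own)
```

Any graph between these two extremes, for example one where action 0 also reveals the reward of action 2, gives an intermediate amount of side observation.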
Universal Learning of Repeated Matrix Games
We study and compare the learning dynamics of two universal learning
algorithms, one based on Bayesian learning and the other on prediction with
expert advice. Both approaches have strong asymptotic performance guarantees.
When confronted with the task of finding good long-term strategies in repeated
2x2 matrix games, they behave quite differently.
Comment: 16 LaTeX pages, 8 eps figures