Chasing Ghosts: Competing with Stateful Policies
We consider sequential decision making in a setting where regret is measured
with respect to a set of stateful reference policies, and feedback is limited
to observing the rewards of the actions performed (the so-called "bandit"
setting). If either the reference policies are stateless rather than stateful,
or the feedback includes the rewards of all actions (the so-called "expert"
setting), previous work shows that the optimal regret grows like
$\Theta(\sqrt{T})$ in terms of the number of decision rounds $T$.
The difficulty in our setting is that the decision maker unavoidably loses
track of the internal states of the reference policies, and thus cannot
reliably attribute rewards observed in a certain round to any of the reference
policies. In fact, in this setting it is impossible for the algorithm to
estimate which policy gives the highest (or even approximately highest) total
reward. Nevertheless, we design an algorithm that achieves expected regret that
is sublinear in $T$, of the form $O(T/\log^{1/4} T)$. Our algorithm is based
on a certain local repetition lemma that may be of independent interest. We
also show that no algorithm can guarantee expected regret better than
$O(T/\log^{3/2} T)$.
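To make the notion of regret against stateful reference policies concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formal model: a policy is assumed to be a small state machine whose next state depends on the reward its own action earned, and the names `StatefulPolicy`, `policy_total_reward`, and `regret` are hypothetical. The sketch computes regret in hindsight from a full reward table. A learner with bandit feedback never sees that table, so it cannot update a policy's state whenever it plays a different action than the policy would have.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class StatefulPolicy:
    """Hypothetical model: a state machine driven by its own rewards."""
    initial_state: int
    act: Callable[[int], int]          # state -> action
    step: Callable[[int, float], int]  # (state, reward of own action) -> next state


def policy_total_reward(policy: StatefulPolicy,
                        rewards: Sequence[Sequence[float]]) -> float:
    """Total reward the policy would collect if followed in every round."""
    state = policy.initial_state
    total = 0.0
    for round_rewards in rewards:
        action = policy.act(state)
        reward = round_rewards[action]
        total += reward
        # The transition needs the reward of the policy's own action,
        # which a bandit learner only observes if it played that action.
        state = policy.step(state, reward)
    return total


def regret(played: Sequence[int],
           rewards: Sequence[Sequence[float]],
           policies: List[StatefulPolicy]) -> float:
    """Best reference policy's total reward minus the learner's total reward."""
    earned = sum(rewards[t][a] for t, a in enumerate(played))
    best = max(policy_total_reward(p, rewards) for p in policies)
    return best - earned


# Two actions, four rounds; rewards[t][a] is the reward of action a in round t.
rewards = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

# "Win-stay, lose-shift": the state is the current action; switch after a reward below 0.5.
win_stay_lose_shift = StatefulPolicy(
    initial_state=0,
    act=lambda s: s,
    step=lambda s, r: s if r >= 0.5 else 1 - s,
)
# A stateless policy that always plays action 1.
always_one = StatefulPolicy(initial_state=1, act=lambda s: s, step=lambda s, r: s)

example_regret = regret([1, 1, 1, 1], rewards, [win_stay_lose_shift, always_one])
```

In this toy instance the stateful policy adapts after its first low reward and so outperforms both the constant policy and the learner's fixed action sequence.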
The phase transition in inhomogeneous random graphs
We introduce a very general model of an inhomogeneous random graph with
independence between the edges, which scales so that the number of edges is
linear in the number of vertices. This scaling corresponds to the p=c/n scaling
for G(n,p) used to study the phase transition; also, it seems to be a property
of many large real-world graphs. Our model includes as special cases many
models previously studied.
We show that under one very weak assumption (that the expected number of
edges is `what it should be'), many properties of the model can be determined,
in particular the critical point of the phase transition, and the size of the
giant component above the transition. We do this by relating our random graphs
to branching processes, which are much easier to analyze.
We also consider other properties of the model, showing, for example, that
when there is a giant component, it is `stable': for a typical random graph, no
matter how we add or delete o(n) edges, the size of the giant component does
not change by more than o(n).
Comment: 135 pages; revised and expanded slightly. To appear in Random
Structures and Algorithms
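The link between the p=c/n scaling and branching processes can be checked numerically in the simplest special case, the homogeneous graph G(n, c/n). The Python sketch below is an illustration only and does not implement the general inhomogeneous model. It samples G(n, c/n), measures the fraction of vertices in the largest component, and compares it with the survival probability rho of a Poisson(c) branching process, which solves rho = 1 - exp(-c rho). The function names are hypothetical, and the pairwise sampling loop is quadratic in n, so it is meant for small n.

```python
import math
import random
from collections import Counter


def largest_component_fraction(n, c, seed=0):
    """Sample G(n, p) with p = c/n; return (largest component size) / n."""
    rng = random.Random(seed)
    p = c / n
    parent = list(range(n))  # union-find forest over the vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Each of the n(n-1)/2 possible edges is present independently with probability p.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                root_u, root_v = find(u), find(v)
                if root_u != root_v:
                    parent[root_u] = root_v

    sizes = Counter(find(x) for x in range(n))
    return max(sizes.values()) / n


def poisson_survival_probability(c, iterations=200):
    """Largest solution of rho = 1 - exp(-c * rho), found by fixed-point iteration from rho = 1."""
    rho = 1.0
    for _ in range(iterations):
        rho = 1.0 - math.exp(-c * rho)
    return rho
```

For example, calling `largest_component_fraction(2000, 2.0)` and `poisson_survival_probability(2.0)` should give similar values above the critical point c = 1, while for c = 0.5 both should be close to zero.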
Bounding Bloat in Genetic Programming
While many optimization problems work with a fixed number of decision
variables and thus a fixed-length representation of possible solutions, genetic
programming (GP) works on variable-length representations. A naturally
occurring problem is that of bloat (unnecessary growth of solutions) slowing
down optimization. Theoretical analyses could so far not bound bloat and
required explicit assumptions on the magnitude of bloat. In this paper we
analyze bloat in mutation-based genetic programming for the two test functions
ORDER and MAJORITY. We overcome previous assumptions on the magnitude of bloat
and give matching or close-to-matching upper and lower bounds for the expected
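optimization time. In particular, we show that the (1+1) GP takes (i)
$\Theta(T_{\mathrm{init}} + n \log n)$ iterations with bloat control on ORDER as well as
MAJORITY; and (ii) $O(T_{\mathrm{init}} \log T_{\mathrm{init}} + n (\log n)^3)$ and
$\Omega(T_{\mathrm{init}} + n \log n)$ (and $\Omega(T_{\mathrm{init}} \log T_{\mathrm{init}})$ for $n = 1$)
iterations without bloat control on MAJORITY.
Comment: An extended abstract has been published at GECCO 201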
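The following Python sketch illustrates what a (1+1) GP with and without bloat control might look like on ORDER and MAJORITY. It rests on several assumptions not stated in the abstract. A solution is simplified to the left-to-right list of its leaves, with +i standing for x_i and -i for its negation. ORDER counts variables whose positive literal appears first. MAJORITY counts variables that have at least one positive literal and at least as many positive as negative literals. Each iteration applies one random insert, substitute, or delete. Bloat control is modeled as rejecting offspring of equal fitness but larger size. All function names are hypothetical.

```python
import random


def order_fitness(leaves):
    """ORDER (assumed definition): variables whose first occurrence is positive."""
    seen = set()
    expressed = 0
    for literal in leaves:
        var = abs(literal)
        if var not in seen:
            seen.add(var)
            if literal > 0:
                expressed += 1
    return expressed


def majority_fitness(leaves):
    """MAJORITY (assumed definition): variables with at least one positive
    literal and at least as many positive as negative literals."""
    balance = {}
    has_positive = set()
    for literal in leaves:
        var = abs(literal)
        balance[var] = balance.get(var, 0) + (1 if literal > 0 else -1)
        if literal > 0:
            has_positive.add(var)
    return sum(1 for var in has_positive if balance[var] >= 0)


def mutate(leaves, n, rng):
    """Apply one random insert, substitute, or delete (simplified mutation)."""
    child = list(leaves)
    literal = rng.choice([1, -1]) * rng.randint(1, n)
    op = rng.choice(["insert", "substitute", "delete"])
    if op == "insert" or not child:
        child.insert(rng.randint(0, len(child)), literal)
    elif op == "substitute":
        child[rng.randrange(len(child))] = literal
    else:
        del child[rng.randrange(len(child))]
    return child


def one_plus_one_gp(fitness, n, initial, bloat_control, max_iters, seed=0):
    """Run until all n variables are expressed or max_iters is reached.

    Returns (iterations used, final leaf list)."""
    rng = random.Random(seed)
    parent = list(initial)
    f_parent = fitness(parent)
    iterations = 0
    while f_parent < n and iterations < max_iters:
        child = mutate(parent, n, rng)
        f_child = fitness(child)
        tie = f_child == f_parent
        # With bloat control, ties are accepted only if the child is not larger.
        if f_child > f_parent or (
            tie and (not bloat_control or len(child) <= len(parent))
        ):
            parent, f_parent = child, f_child
        iterations += 1
    return iterations, parent
```

Comparing `len(parent)` at the end of runs with `bloat_control=True` and `bloat_control=False` gives a rough feel for how much unnecessary growth the size-based tie-breaking prevents in this simplified setting.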