Online Learning with Switching Costs and Other Adaptive Adversaries
We study the power of different types of adaptive (nonoblivious) adversaries
in the setting of prediction with expert advice, under both full-information
and bandit feedback. We measure the player's performance using a new notion of
regret, also known as policy regret, which better captures the adversary's
adaptiveness to the player's behavior. In a setting where losses are allowed to
drift, we characterize, in a nearly complete manner, the power of adaptive
adversaries with bounded memories and switching costs. In particular, we show
that with switching costs, the attainable rate with bandit feedback is
$\widetilde{\Theta}(T^{2/3})$. Interestingly, this rate is significantly worse
than the $\Theta(\sqrt{T})$ rate attainable with switching costs in the
full-information case. Via a novel reduction from experts to bandits, we also
show that a bounded memory adversary can force $\widetilde{\Theta}(T^{2/3})$
regret even in the full information case, proving that switching costs are
easier to control than bounded memory adversaries. Our lower bounds rely on a
new stochastic adversary strategy that generates loss processes with strong
dependencies.
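To make the switching-cost notion concrete, here is a minimal Python sketch (not from the paper; the function name and the unit switching cost are illustrative assumptions) that totals a learner's losses plus a cost for each action switch, and compares against the best fixed action in hindsight:

```python
# Illustrative sketch: regret with switching costs. The learner pays its
# per-round loss plus a unit cost each time it changes action; the
# benchmark is the best single fixed action in hindsight.

def switching_cost_regret(losses, plays):
    """losses[t][i] = loss of action i at round t; plays[t] = learner's action."""
    n_actions = len(losses[0])
    learner = sum(losses[t][plays[t]] for t in range(len(plays)))
    learner += sum(1 for t in range(1, len(plays)) if plays[t] != plays[t - 1])
    best_fixed = min(sum(row[i] for row in losses) for i in range(n_actions))
    return learner - best_fixed

losses = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
plays = [0, 1, 0]  # chases the momentarily best arm, paying 2 switches
print(switching_cost_regret(losses, plays))  # 1.0
```

The example shows the point of the switching cost: chasing the best arm each round incurs zero loss but two switches, so it still trails the best fixed action.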
Adaptive MCMC with online relabeling
When targeting a distribution that is artificially invariant under some
permutations, Markov chain Monte Carlo (MCMC) algorithms face the
label-switching problem, rendering marginal inference particularly cumbersome.
Such a situation arises, for example, in the Bayesian analysis of finite
mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM),
which self-calibrates its proposal distribution using an online estimate of the
covariance matrix of the target, are no exception. To address the
label-switching issue, relabeling algorithms associate a permutation to each
MCMC sample, trying to obtain reasonable marginals. In the case of adaptive
Metropolis (Bernoulli 7 (2001) 223-242), an online relabeling strategy is
required. This paper is devoted to the AMOR algorithm, a provably consistent
variant of AM that can cope with the label-switching problem. The idea is to
nest relabeling steps within the MCMC algorithm based on the estimation of a
single covariance matrix that is used both for adapting the covariance of the
proposal distribution in the Metropolis algorithm step and for online
relabeling. We compare the behavior of AMOR to similar relabeling methods. In
the case of compactly supported target distributions, we prove a strong law of
large numbers for AMOR and its ergodicity. These are the first results on the
consistency of an online relabeling algorithm to our knowledge. The proof
underlines latent relations between relabeling and vector quantization.
Comment: Published at http://dx.doi.org/10.3150/13-BEJ578 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
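As background for the abstract above, a minimal 1-D sketch of the adaptive-Metropolis idea that AMOR builds on (the AM algorithm of the cited Bernoulli paper) can be written as follows; this is an illustration, not the AMOR algorithm, and the regularization constant 0.1 is an assumption:

```python
import math
import random

# Hedged 1-D sketch of adaptive Metropolis: the random-walk proposal scale
# is adapted from an online estimate of the chain's variance.

def adaptive_metropolis(log_target, x0, n_steps, seed=0):
    rng = random.Random(seed)
    x, mean, var = x0, x0, 1.0
    samples = []
    for t in range(1, n_steps + 1):
        # 2.38 is the classical AM scaling in 1-D; 0.1 regularizes early steps
        prop = x + rng.gauss(0.0, 2.38 * math.sqrt(var + 0.1))
        if math.log(rng.random() + 1e-12) < log_target(prop) - log_target(x):
            x = prop
        # online (running-average) updates of the chain's mean and variance
        delta = x - mean
        mean += delta / t
        var += (delta * (x - mean) - var) / t
        samples.append(x)
    return samples

# target: standard normal density (up to a constant)
samples = adaptive_metropolis(lambda z: -0.5 * z * z, 0.0, 5000)
print(sum(samples) / len(samples))  # sample mean, near 0
```

The single adapted quantity here plays the role that AMOR's jointly estimated covariance matrix plays in higher dimensions, where it also drives the relabeling step.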
Adaptive Rational Equilibrium with Forward Looking Agents, forthcoming in International Journal of Economic Theory (IJET) 2006, special issue in honor of Jean-Michel Grandmont.
Brock and Hommes (1997) introduce the concept of adaptive rational equilibrium dynamics (ARED), where agents choose between a costly rational expectations forecast and a cheap naive forecast, and the fractions using each of the two strategies evolve over time and are endogenously coupled to the market equilibrium price dynamics. In their setting agents are backward looking, in the sense that strategy selection is based on experience measured by relative past realized profits. When the selection pressure to switch to the more profitable strategy is high, instability and complicated chaotic price fluctuations arise. In this paper we investigate the ARED with forward-looking agents, whose strategy selection is based upon expected profits. Our findings suggest that forward-looking behavior dampens the amplitude of price fluctuations, but local instability of the steady state remains. The global dynamics depends upon how sophisticated the forward-looking behavior is. With perfectly forward-looking agents prices converge to a stable 2-cycle, while with forward-looking agents who are boundedly rational concerning their estimate of expected profits, small-amplitude chaotic price fluctuations may arise. We also establish an equivalence relationship between a heterogeneous agent model with switching of strategies and a representative agent framework, where the representative agent optimally chooses between the benefits of a high-quality forecast and the associated information-gathering costs. To an outside observer it is impossible to distinguish between the two.
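The discrete-choice switching mechanism underlying the ARED can be sketched as a multinomial logit on strategy profits; the function below is an illustration with made-up parameter values, not the paper's model:

```python
import math

# Hedged sketch of discrete-choice strategy switching in the spirit of
# Brock and Hommes (1997): the fraction of agents using each forecast rule
# is a multinomial logit of realized (or expected) profits, governed by an
# intensity-of-choice parameter beta. Parameter values are illustrative.

def logit_fractions(profits, beta):
    """Multinomial-logit fractions from a list of strategy profits."""
    weights = [math.exp(beta * u) for u in profits]
    total = sum(weights)
    return [w / total for w in weights]

# equal profits -> equal fractions
print(logit_fractions([1.0, 1.0], beta=2.0))  # [0.5, 0.5]
# a larger beta (stronger selection pressure) concentrates agents
# on the more profitable rule
low = logit_fractions([1.0, 0.5], beta=1.0)[0]
high = logit_fractions([1.0, 0.5], beta=10.0)[0]
print(low < high)  # True
```

The paper's backward- vs. forward-looking variants differ only in whether `profits` holds past realized profits or expected profits; the logit map itself is the shared switching device.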
Adaptive Smith predictor for controlling an automotive electronic throttle over a network
The paper presents a control strategy for an automotive electronic throttle,
a device used to regulate the power produced by spark-ignition engines. Controlling
the electronic throttle body is a difficult task because the throttle exhibits strong
nonlinearities. The difficulty increases when the controller operates over communication
networks subject to random delay. In this paper, we revisit the Smith-predictor
control, and show how to adapt it for controlling the electronic throttle body over a
delay-driven network. Experiments were carried out in a laboratory, and the corresponding
data indicate the benefits of our approach for applications.
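As a generic illustration of the Smith-predictor idea the paper adapts (this sketch assumes a first-order plant, a PI controller, and a fixed known delay; the paper's network delay is random and the predictor adaptive):

```python
from collections import deque

# Hedged sketch of a discrete-time Smith predictor: the controller acts on
# the undelayed model output plus the mismatch between the measured plant
# output and a delayed copy of the model, which removes the delay from the
# effective feedback loop. All parameter values are illustrative.

def simulate_smith(a=0.9, b=0.1, d=5, kp=2.0, ki=0.5, ref=1.0, steps=200):
    y = ym = ymd = 0.0             # plant output, fast model, delayed model
    acc = 0.0                      # integral of the error (PI controller)
    buf_plant = deque([0.0] * d)   # input delay line of the real plant
    buf_model = deque([0.0] * d)   # matching delay line in the model copy
    for _ in range(steps):
        feedback = ym + (y - ymd)  # Smith predictor: fast model + mismatch
        e = ref - feedback
        acc += e
        u = kp * e + ki * acc
        ym = a * ym + b * u                      # undelayed model
        buf_model.append(u)
        ymd = a * ymd + b * buf_model.popleft()  # delayed model
        buf_plant.append(u)
        y = a * y + b * buf_plant.popleft()      # real delayed plant
    return y

print(abs(simulate_smith() - 1.0) < 0.05)  # True: settles at the reference
```

With a perfect model the mismatch term stays zero and the loop behaves as if there were no delay, which is exactly the property that breaks down, and must be compensated adaptively, when the delay is random.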
Adaptive cyclically dominating game on co-evolving networks: Numerical and analytic results
A co-evolving and adaptive Rock (R)-Paper (P)-Scissors (S) game (ARPS) in
which an agent uses one of three cyclically dominating strategies is proposed
and studied numerically and analytically. An agent takes adaptive actions to
achieve a neighborhood to his advantage by rewiring a dissatisfying link or by
switching strategy, each with a prescribed probability. Numerical
results revealed two phases in the steady state. An active phase, in one
parameter regime, has one connected network of agents using different
strategies who are continually interacting and taking adaptive actions. A
frozen phase, in the complementary regime, has three separate clusters of
agents using only R, P, and S, respectively, with adaptive actions terminated.
A mean-field theory of link densities in co-evolving networks is formulated in
a general way
that can be readily modified for other co-evolving network problems with
multiple strategies. The analytic results agree well with simulation results
on ARPS. We
point out the different probabilities of winning, losing, and drawing a game
among the agents as the origin of the small discrepancy between analytic and
simulation results. As a result of the adaptive actions, agents of higher
degrees are often those being taken advantage of. Agents with a smaller
(larger) degree than the mean degree have a higher (smaller) probability of
winning than losing. The results are useful for future attempts at
formulating more accurate theories.
Comment: 17 pages, 4 figures
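A toy version of the co-evolving rewire-or-switch dynamics can be sketched as follows; the network size, probabilities, and update order are illustrative assumptions, not the paper's exact model:

```python
import random

# Toy co-evolving RPS sketch: a dissatisfied agent (one beaten by a random
# neighbor) either rewires that link to a random non-neighbor or adopts the
# winner's strategy. Parameters are illustrative, not the paper's.

BEATS = {"R": "S", "P": "R", "S": "P"}   # key beats value

def step(adj, strat, p, rng):
    """One adaptive action with rewiring probability p."""
    i = rng.randrange(len(adj))
    if not adj[i]:
        return
    j = rng.choice(sorted(adj[i]))
    if BEATS[strat[j]] == strat[i]:      # i is beaten by j: dissatisfied
        if rng.random() < p:             # rewire the dissatisfying link
            candidates = [k for k in range(len(adj)) if k != i and k not in adj[i]]
            if candidates:
                adj[i].discard(j); adj[j].discard(i)
                k = rng.choice(candidates)
                adj[i].add(k); adj[k].add(i)
        else:                            # switch to the winning strategy
            strat[i] = strat[j]

rng = random.Random(1)
n = 30
strat = [rng.choice("RPS") for _ in range(n)]
adj = [set() for _ in range(n)]
while sum(len(s) for s in adj) // 2 < 45:        # random initial links
    a, b = rng.randrange(n), rng.randrange(n)
    if a != b:
        adj[a].add(b); adj[b].add(a)
before = sum(len(s) for s in adj) // 2
for _ in range(2000):
    step(adj, strat, p=0.9, rng=rng)
after = sum(len(s) for s in adj) // 2
print(before == after)   # True: rewiring conserves the number of links
```

Link conservation under rewiring is what makes a mean-field description in terms of link densities (fractions of R-R, R-P, etc. links) natural, since only the composition of links, not their number, evolves.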