
    Online Learning with Switching Costs and Other Adaptive Adversaries

    We study the power of different types of adaptive (nonoblivious) adversaries in the setting of prediction with expert advice, under both full-information and bandit feedback. We measure the player's performance using a new notion of regret, also known as policy regret, which better captures the adversary's adaptiveness to the player's behavior. In a setting where losses are allowed to drift, we characterize, in a nearly complete manner, the power of adaptive adversaries with bounded memories and switching costs. In particular, we show that with switching costs, the attainable rate with bandit feedback is $\widetilde{\Theta}(T^{2/3})$. Interestingly, this rate is significantly worse than the $\Theta(\sqrt{T})$ rate attainable with switching costs in the full-information case. Via a novel reduction from experts to bandits, we also show that a bounded-memory adversary can force $\widetilde{\Theta}(T^{2/3})$ regret even in the full-information case, proving that switching costs are easier to control than bounded-memory adversaries. Our lower bounds rely on a new stochastic adversary strategy that generates loss processes with strong dependencies.
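
    A rough illustration of the regret notion at stake (a sketch under simplifying assumptions, not the paper's construction): the hypothetical greedy player below pays a unit cost per action switch, while any fixed action pays none, which is what policy regret with switching costs measures.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 10_000
losses = rng.uniform(0.0, 1.0, size=(T, 2))   # oblivious base losses, 2 actions
switch_cost = 1.0

# A naive player that greedily repeats whichever action looked better
# in the previous round; it switches often and pays for it.
plays = np.zeros(T, dtype=int)
for t in range(1, T):
    plays[t] = int(losses[t - 1, 1] < losses[t - 1, 0])

player_loss = losses[np.arange(T), plays].sum()
player_loss += switch_cost * np.count_nonzero(np.diff(plays))

# A fixed action never switches, so its counterfactual loss is just the
# column sum; policy regret compares against the best such fixed policy.
best_fixed = losses.sum(axis=0).min()
print(f"policy regret of the greedy player: {player_loss - best_fixed:.1f}")
```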

    Adaptive MCMC with online relabeling

    When targeting a distribution that is artificially invariant under some permutations, Markov chain Monte Carlo (MCMC) algorithms face the label-switching problem, rendering marginal inference particularly cumbersome. Such a situation arises, for example, in the Bayesian analysis of finite mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM), which self-calibrates its proposal distribution using an online estimate of the covariance matrix of the target, are no exception. To address the label-switching issue, relabeling algorithms associate a permutation to each MCMC sample, trying to obtain reasonable marginals. In the case of adaptive Metropolis (Bernoulli 7 (2001) 223-242), an online relabeling strategy is required. This paper is devoted to the AMOR algorithm, a provably consistent variant of AM that can cope with the label-switching problem. The idea is to nest relabeling steps within the MCMC algorithm, based on the estimation of a single covariance matrix that is used both for adapting the covariance of the proposal distribution in the Metropolis step and for online relabeling. We compare the behavior of AMOR to that of similar relabeling methods. In the case of compactly supported target distributions, we prove a strong law of large numbers for AMOR and its ergodicity. To our knowledge, these are the first results on the consistency of an online relabeling algorithm. The proof underlines latent relations between relabeling and vector quantization. Published at http://dx.doi.org/10.3150/13-BEJ578 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
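
    A minimal sketch of the nesting idea, assuming a two-dimensional target invariant under swapping its coordinates (the target function and relabeling criterion here are illustrative, not the AMOR algorithm as published): a single running covariance estimate both scales the AM proposal and selects, for each state, the permutation closest to the running mean in the induced Mahalanobis metric.

```python
import itertools
import numpy as np

def log_target(x):
    # Hypothetical permutation-invariant target: a symmetric 2D Gaussian mixture.
    a = -0.5 * np.sum((x - np.array([ 2.0, -2.0])) ** 2)
    b = -0.5 * np.sum((x - np.array([-2.0,  2.0])) ** 2)
    return np.logaddexp(a, b)

rng = np.random.default_rng(1)
d, n_iter = 2, 20_000
perms = [np.array(q) for q in itertools.permutations(range(d))]

x = rng.normal(size=d)
mu, cov = np.zeros(d), np.eye(d)      # single running mean/covariance estimate
samples = []
for t in range(1, n_iter + 1):
    # AM step: random-walk proposal scaled by the running covariance.
    prop = x + rng.multivariate_normal(np.zeros(d),
                                       (2.38 ** 2 / d) * cov + 1e-6 * np.eye(d))
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop
    # Relabeling step: the target is permutation-invariant, so we may replace
    # x by the permutation minimizing the Mahalanobis distance to the mean.
    prec = np.linalg.inv(cov + 1e-6 * np.eye(d))
    x = min((x[q] for q in perms), key=lambda z: float((z - mu) @ prec @ (z - mu)))
    # The same relabeled sample updates the shared moment estimates.
    mu = mu + (x - mu) / t
    cov = cov + (np.outer(x - mu, x - mu) - cov) / t
    samples.append(x.copy())
```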

    Adaptive Rational Equilibrium with Forward Looking Agents, forthcoming in International Journal of Economic Theory (IJET) 2006, special issue in honor of Jean-Michel Grandmont.

    Brock and Hommes (1997) introduce the concept of adaptive rational equilibrium dynamics (ARED), where agents choose between a costly rational expectation forecast and a cheap naive forecast, and the fractions using each of the two strategies evolve over time and are endogenously coupled to the market equilibrium price dynamics. In their setting, agents are backward looking in the sense that strategy selection is based on experience measured by relative past realized profits. When the selection pressure to switch to the more profitable strategy is high, instability and complicated chaotic price fluctuations arise. In this paper we investigate the ARED with forward looking agents, whose strategy selection is based upon expected profits. Our findings suggest that forward looking behavior dampens the amplitude of price fluctuations, but local instability of the steady state remains. The global dynamics depends upon how sophisticated the forward looking behavior is. With perfectly forward looking agents prices converge to a stable 2-cycle, while with forward looking agents who are boundedly rational concerning their estimate of expected profits, small amplitude chaotic price fluctuations may arise. We also establish an equivalence relationship between a heterogeneous agent model with switching of strategies and a representative agent framework, where the representative agent optimally chooses between the benefits of a high quality forecast and the associated information gathering costs. To an outside observer it is impossible to distinguish between the two.
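
    To make the switching mechanism concrete, here is a minimal backward-looking simulation in the spirit of the Brock-Hommes cobweb setting (all parameter values are hypothetical, and squared forecast errors stand in for realized profits as the fitness measure); the paper's forward-looking variant would replace the realized fitness below with expected profits.

```python
import numpy as np

# Hypothetical parameters for a Brock-Hommes style cobweb model.
A, B, b = 1.0, 0.5, 1.35   # linear demand A - B*p, linear supply b*p_expected
C = 1.0                    # information cost of the rational forecast
beta = 4.0                 # intensity of choice (selection pressure)
T = 200

p = np.zeros(T); p[0], p[1] = 0.30, 0.25
n_rat = 0.5                # fraction of agents buying the rational forecast
for t in range(2, T):
    # Market clearing: rational agents forecast p[t] exactly,
    # naive agents forecast p[t-1]:
    #   A - B*p[t] = n_rat*b*p[t] + (1 - n_rat)*b*p[t-1]
    p[t] = (A - (1 - n_rat) * b * p[t - 1]) / (B + n_rat * b)
    # Backward-looking fitness from last period's realized forecast errors.
    u_rat = -C                              # the rational error is zero
    u_nai = -((p[t - 1] - p[t - 2]) ** 2)   # naive error, no cost
    # Discrete-choice (logit) switching between the two predictors.
    n_rat = 1.0 / (1.0 + np.exp(-beta * (u_rat - u_nai)))
```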

    Adaptive Smith predictor for controlling an automotive electronic throttle over a network

    The paper presents a control strategy for an automotive electronic throttle, a device used to regulate the power produced by spark-ignition engines. Controlling the electronic throttle body is a difficult task because the throttle exhibits strong nonlinearities. The difficulty increases when the controller operates over communication networks subject to random delays. In this paper, we revisit the Smith predictor control scheme and show how to adapt it for controlling the electronic throttle body over a delay-driven network. Experiments were carried out in a laboratory, and the corresponding data indicate the benefits of our approach for practical applications.
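
    For readers unfamiliar with the structure being adapted, the following is a minimal discrete-time Smith predictor sketch under strong assumptions (a known first-order plant, a fixed nominal delay, and illustrative PI gains); the paper's contribution, adapting the predictor to random network delays, is not reproduced here.

```python
# Hypothetical first-order throttle model: y[k+1] = a*y[k] + b*u[k-delay].
a_p, b_p, delay = 0.9, 0.1, 5
Kp, Ki = 2.0, 0.2                    # illustrative PI gains
ref, T = 1.0, 200

y, y_hat, integ = 0.0, 0.0, 0.0
u_buf = [0.0] * delay                # control inputs in transit over the network
model_buf = [0.0] * delay            # model outputs awaiting the same delay
for k in range(T):
    # Smith predictor feedback: delay-free model output plus the mismatch
    # between the measured plant output and the delayed model output.
    feedback = y_hat + (y - model_buf.pop(0))
    e = ref - feedback
    integ += e
    u = Kp * e + Ki * integ
    # Propagate the delay-free internal model and queue its delayed copy.
    y_hat = a_p * y_hat + b_p * u
    model_buf.append(y_hat)
    # The plant only receives the control input after the network delay.
    u_buf.append(u)
    y = a_p * y + b_p * u_buf.pop(0)
```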

    Adaptive cyclically dominating game on co-evolving networks: Numerical and analytic results

    A co-evolving and adaptive Rock (R)-Paper (P)-Scissors (S) game (ARPS), in which an agent uses one of three cyclically dominating strategies, is proposed and studied numerically and analytically. An agent takes adaptive actions to achieve a neighborhood to his advantage, by rewiring a dissatisfying link with a probability $p$ or switching strategy with a probability $1-p$. Numerical results revealed two phases in the steady state. An active phase for $p < p_{\text{cri}}$ has one connected network of agents using different strategies who are continually interacting and taking adaptive actions. A frozen phase for $p > p_{\text{cri}}$ has three separate clusters of agents using only R, P, and S, respectively, with terminated adaptive actions. A mean-field theory of link densities in co-evolving networks is formulated in a general way that can be readily modified to other co-evolving network problems with multiple strategies. The analytic results agree well with the simulation results on ARPS. We point out the different probabilities of winning, losing, and drawing a game among the agents as the origin of the small discrepancy between analytic and simulation results. As a result of the adaptive actions, agents of higher degrees are often those being taken advantage of. Agents with a smaller (larger) degree than the mean degree have a higher (smaller) probability of winning than losing. The results are useful in future attempts at formulating more accurate theories. (17 pages, 4 figures.)
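
    A small simulation sketch of the co-evolution rule as described (the dissatisfaction test, the rewiring target, and the strategy-switching rule are plausible readings, not the paper's exact protocol): with probability p a dissatisfied agent rewires the offending link, otherwise it switches strategy.

```python
import random

random.seed(2)
N, k_mean, p, steps = 200, 4, 0.4, 50_000
BEATS = {0: 2, 1: 0, 2: 1}    # Rock(0) beats Scissors(2), Paper(1) beats Rock(0), ...

strategy = [random.randrange(3) for _ in range(N)]
neigh = [set() for _ in range(N)]
while sum(len(s) for s in neigh) < N * k_mean:   # random graph, mean degree k_mean
    i, j = random.sample(range(N), 2)
    neigh[i].add(j); neigh[j].add(i)

for _ in range(steps):
    i = random.randrange(N)
    losing = [j for j in neigh[i] if BEATS[strategy[j]] == strategy[i]]
    if not losing:
        continue                  # i's neighborhood is already to its advantage
    j = random.choice(losing)
    if random.random() < p:
        # Rewire the dissatisfying link to a uniformly chosen non-neighbor.
        neigh[i].discard(j); neigh[j].discard(i)
        k = random.choice([v for v in range(N) if v != i and v not in neigh[i]])
        neigh[i].add(k); neigh[k].add(i)
    else:
        # Switch to the strategy that beats the dissatisfying neighbor.
        strategy[i] = next(s for s in range(3) if BEATS[s] == strategy[j])
```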