Strategies for prediction under imperfect monitoring
We propose simple randomized strategies for sequential prediction under
imperfect monitoring, that is, when the forecaster does not have access to the
past outcomes but rather to a feedback signal. The proposed strategies are
consistent in the sense that they achieve, asymptotically, the best possible
average reward. It was Rustichini (1999) who first proved the existence of such
consistent predictors. The forecasters presented here offer the first
constructive proof of consistency. Moreover, the proposed algorithms are
computationally efficient. We also establish upper bounds for the rates of
convergence. In the case of deterministic feedback, these rates are optimal up
to logarithmic terms.
Comment: Journal version of a COLT conference paper
Dynamic Non-Bayesian Decision Making
The model of a non-Bayesian agent who faces a repeated game with incomplete
information against Nature is an appropriate tool for modeling general
agent-environment interactions. In such a model the environment state
(controlled by Nature) may change arbitrarily, and the feedback/reward function
is initially unknown. The agent is not Bayesian, that is, he does not form a
prior probability either on the state selection strategy of Nature or on his
reward function. A policy for the agent is a function which assigns an action
to every history of observations and actions. Two basic feedback structures are
considered. In one of them -- the perfect monitoring case -- the agent is able
to observe the previous environment state as part of his feedback, while in the
other -- the imperfect monitoring case -- all that is available to the agent is
the reward obtained. Both of these settings refer to partially observable
processes, where the current environment state is unknown. Our main result
refers to the competitive ratio criterion in the perfect monitoring case. We
prove the existence of an efficient stochastic policy that ensures that the
competitive ratio is obtained at almost all stages with an arbitrarily high
probability, where efficiency is measured in terms of rate of convergence. It
is further shown that such an optimal policy does not exist in the imperfect
monitoring case. Moreover, it is proved that in the perfect monitoring case
there does not exist a deterministic policy that satisfies our long run
optimality criterion. In addition, we discuss the maxmin criterion and prove
that a deterministic efficient optimal strategy does exist in the imperfect
monitoring case under this criterion. Finally we show that our approach to
long-run optimality can be viewed as qualitative, which distinguishes it from
previous work in this area.
Comment: See http://www.jair.org/ for any accompanying file
Harnessing enforcement leverage at the border to minimize biological risk from international live species trade
Allocating inspection resources over a diverse set of imports to prevent entry of plant pests and pathogens presents a substantial policy design challenge. We model inspections of live plant imports and producer responses to inspections using a “state-dependent” monitoring and enforcement model. We capture exporter abatement response to a set of feasible inspection policies from the regulator. Conditional on this behavioral response, we solve the regulator’s problem of selecting the parameters for the state-dependent monitoring regime to minimize entry of infested shipments. We account for exporter heterogeneity, fixed penalties for noncompliance, imperfect abatement control and imperfect inspections at the border. Overall, we estimate that state-dependent targeting (based on historical interceptions) cuts the rate of infested shipments that are accepted by one-fifth, relative to uniformly allocated inspections
Learning about a Moving Target in Resource Management: Optimal Bayesian Disease Control
Resource managers must often make difficult choices in the face of imperfectly observed and dynamically changing systems (e.g., livestock, fisheries, water, and invasive species). A rich set of techniques exists for identifying optimal choices when that uncertainty is assumed to be understood and irreducible. Standard optimization approaches, however, cannot address situations in which reducible uncertainty applies to either system behavior or environmental states. The adaptive management literature overcomes this limitation with tools for optimal learning, but has been limited to highly simplified models with state and action spaces that are discrete and small. We overcome this problem by using a recently developed extension of the Partially Observable Markov Decision Process (POMDP) framework to allow for learning about a continuous state. We illustrate this methodology by exploring optimal control of bovine tuberculosis in New Zealand cattle. Disease testing—the control variable—serves to identify herds for treatment and provides information on prevalence, which is both imperfectly observed and subject to change due to controllable and uncontrollable factors. We find substantial efficiency losses from both ignoring learning (standard stochastic optimization) and from simplifying system dynamics (to facilitate a typical, simple learning model), though the latter effect dominates in our setting. We also find that under an adaptive management approach, simplifying dynamics can lead to a belief trap in which information gathering ceases, beliefs become increasingly inaccurate, and losses abound
Repeated Multimarket Contact with Private Monitoring: A Belief-Free Approach
This paper studies repeated games where two players play multiple duopolistic
games simultaneously (multimarket contact). A key assumption is that each
player receives a noisy and private signal about the other's actions (private
monitoring or observation errors). There has been no game-theoretic result
establishing whether multimarket contact facilitates collusion, in the sense
that equilibria more collusive in terms of per-market profits exist than under
a benchmark case of one market. One class of equilibrium candidates under the
benchmark case is belief-free strategies. We are the first to construct a non-trivial class of
strategies that exhibits the effect of multimarket contact from the
perspectives of simplicity and mild punishment. Strategies must be simple
because firms in a cartel must coordinate with each other without communication.
Punishment must be mild enough that it does not hurt even the minimum
required profits in the cartel. We thus focus on two-state automaton strategies
such that each player remains cooperative in at least one market even when he or
she punishes a traitor. Furthermore, we identify an additional condition
(partial indifference), under which the collusive equilibrium yields the
optimal payoff.
Comment: Accepted for the 9th Intl. Symp. on Algorithmic Game Theory; an
extended version was accepted at the Thirty-Fourth AAAI Conference on
Artificial Intelligence (AAAI-20)
Inflation scares and forecast-based monetary policy
Central banks pay close attention to inflation expectations. In standard models, however, inflation expectations are tied down by the assumption of rational expectations and should be of little independent interest to policy makers. In this paper, the authors relax the assumption of rational expectations with perfect knowledge and reexamine the role of inflation expectations in the economy and in the conduct of monetary policy. Agents are assumed to have imperfect knowledge of the precise structure of the economy and the policymakers' preferences. Expectations are governed by a perpetual learning technology. With learning, disturbances can give rise to endogenous inflation scares, that is, significant and persistent deviations of inflation expectations from those implied by rational expectations. The presence of learning increases the sensitivity of inflation expectations and the term structure of interest rates to economic shocks, in line with the empirical evidence. The authors also explore the role of private inflation expectations for the conduct of efficient monetary policy. Under rational expectations, inflation expectations equal a linear combination of macroeconomic variables and as such provide no additional information to the policy maker. In contrast, under learning, private inflation expectations follow a time-varying process and provide useful information for the conduct of monetary policy.
Equilibrium (Economics); Monetary policy; Macroeconomics; Inflation (Finance); Forecasting
Social Memory and Evidence from the Past
Examples of repeated destructive behavior abound throughout the history of human societies. This paper examines the role of social memory (a society's vicarious beliefs about the past) in creating and perpetuating destructive conflicts. We examine whether such behavior is consistent with the theory of rational strategic behavior. We analyze an infinite-horizon model in which two countries face off each period in an extended Prisoner's Dilemma game in which an additional possibility of mutually destructive "all-out war" yields catastrophic consequences for both sides. Each country is inhabited by a dynastic sequence of individuals who care about future individuals in the same country, and can communicate with the next generation of their countrymen using private messages. The two countries' actions in each period also produce physical evidence: a sequence of informative but imperfect public signals that can be observed by all current and future individuals. We find that, provided the future is sufficiently important for all individuals, regardless of the precision of physical evidence from the past there is an equilibrium of the model in which the two countries' social memory is systematically wrong, and in which the two countries engage in all-out war with arbitrarily high frequency. Surprisingly, we find that degrading the quality of information that individuals have about current decisions may "improve" social memory so that it can no longer be systematically wrong. This in turn ensures that arbitrarily frequent all-out wars cannot take place.
Social Memory, Private Communication, Dynastic Games, Physical Evidence