Checking sequences for distributed test architectures
Controllability and observability problems may arise when a checking sequence is applied in a test architecture with multiple remote testers. These problems are often resolved by exchanging external coordination messages among the testers during testing. However, coordination messages require an external network, which can increase the cost of testing and can be difficult to implement. They also introduce delays, which can cause problems when there are timing constraints. It is therefore sometimes desirable to construct, from the specification of the system under test, a checking sequence that is free from controllability and observability problems without requiring external coordination message exchanges. This paper gives conditions under which it is possible to produce such a checking sequence using multiple distinguishing sequences, and an algorithm that achieves this.
Deep Reinforcement Learning for Swarm Systems
Recently, deep reinforcement learning (RL) methods have been applied
successfully to multi-agent scenarios. Typically, these methods rely on a
concatenation of agent states to represent the information content required for
decentralized decision making. However, concatenation scales poorly to swarm
systems with a large number of homogeneous agents as it does not exploit the
fundamental properties inherent to these systems: (i) the agents in the swarm
are interchangeable and (ii) the exact number of agents in the swarm is
irrelevant. Therefore, we propose a new state representation for deep
multi-agent RL based on mean embeddings of distributions. We treat the agents
as samples of a distribution and use the empirical mean embedding as input for
a decentralized policy. We define different feature spaces of the mean
embedding using histograms, radial basis functions and a neural network learned
end-to-end. We evaluate the representation on two well-known problems from the
swarm literature (rendezvous and pursuit evasion), in globally and locally
observable setups. For the local setup we furthermore introduce simple
communication protocols. Of all approaches, the mean embedding representation
using neural network features enables the richest information exchange between
neighboring agents, facilitating the development of more complex collective
strategies. (31 pages, 12 figures, version 3; published in JMLR Volume 20.)
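The core idea of this abstract — treating neighboring agents as samples of a distribution and feeding their averaged feature map to a fixed-size policy input — can be sketched as below. The RBF centers, dimensions, and state values are illustrative assumptions, not taken from the paper's implementation:

```python
import math

def rbf_features(state, centers, gamma=1.0):
    # Radial basis function feature map for a single agent state.
    # `centers` are hypothetical fixed RBF centers chosen for illustration.
    return [math.exp(-gamma * sum((c - s) ** 2 for c, s in zip(center, state)))
            for center in centers]

def mean_embedding(neighbor_states, centers, gamma=1.0):
    # Empirical mean embedding: average the feature maps of all observed
    # neighbors. The output size depends only on the number of centers,
    # not on how many agents there are, and is invariant to their ordering,
    # which is exactly the interchangeability property the abstract exploits.
    feats = [rbf_features(s, centers, gamma) for s in neighbor_states]
    k = len(centers)
    return [sum(f[j] for f in feats) / len(feats) for j in range(k)]

# Five neighbors in 2-D collapse to a fixed 3-dimensional policy input.
states = [(0.2, 0.1), (1.0, 0.9), (-0.5, 0.3), (0.0, -0.2), (0.7, 0.4)]
centers = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]
print(len(mean_embedding(states, centers)))  # 3
```

Swapping `rbf_features` for a histogram binning or a learned network gives the other feature spaces the abstract mentions; the averaging step stays the same.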
Timing and Virtual Observability in Ultimatum Bargaining and "Weak Link" Coordination Games
Previous studies have shown that simply knowing one player moves first can affect behavior in games, even when the first-mover's moves are known to be unobservable. This observation violates the game-theoretic principle that timing of unobserved moves is irrelevant, but is consistent with virtual observability, a theory of how timing can matter without the ability to observe actions. However, this previous research only shows that timing matters in games where knowledge that one player moved first can help select that player's preferred equilibrium, presenting an alternative explanation to virtual observability. We extend this work by varying timing of unobservable moves in ultimatum bargaining games and "weak link" coordination games. In the latter, the equilibrium selection explanation does not predict any change in behavior due to timing differences. We find that timing without observability affects behavior in both games, but not substantially.
Drift effect and timing without observability: experimental evidence
We provide experimental evidence for Binmore and Samuelson's (1999) insights on modeling the learning process through which equilibrium is selected. They proposed the concept of drift to describe the effect of perturbations on the dynamic process leading to equilibrium in evolutionary games with boundedly rational agents. We test, within a randomly matched population, two different versions of the Dalek game in which the forward-induction equilibrium weakly iteratively dominates the other Nash equilibrium in pure strategies. We also assume that the first mover makes her decision first ("timing") but the second mover is not informed of the first mover's choice ("lack of observability"). Both players are informed of their position in the sequence and of the fact that the second player will decide without knowing the decision of the first player. If the only observed choices are those made by other players in previous interactions, the role played by forward induction is replaced by the learning process taking place within the population. Our results support Binmore and Samuelson's model because the frequency of the forward-induction outcome is payoff-sensitive: it strongly increases when we impose a slight change in the payoffs that does not change equilibrium predictions. This evidence reinforces the evolutionary nature of the drift effect.
Keywords: evolutionary games, experiments, drift, forward induction, order of play. JEL Classification: C72, C91
- …