24,732 research outputs found
Global adaptation in networks of selfish components: emergent associative memory at the system scale
In some circumstances, complex adaptive systems composed of numerous self-interested agents can self-organise into structures that enhance global adaptation, efficiency or function. However, the general conditions for such an outcome are poorly understood and present a fundamental open question for domains as varied as ecology, sociology, economics, organismic biology and technological infrastructure design. In contrast, sufficient conditions for artificial neural networks to form structures that perform collective computational processes such as associative memory/recall, classification, generalisation and optimisation are well understood. Such global functions within a single agent or organism are not wholly surprising, since the mechanisms (e.g. Hebbian learning) that create these neural organisations may be selected for this purpose, but agents in a multi-agent system have no obvious reason to adhere to such a structuring protocol or produce such global behaviours when acting from individual self-interest. However, Hebbian learning is actually a very simple and fully distributed habituation or positive feedback principle. Here we show that when self-interested agents can modify how they are affected by other agents (e.g. when they can influence which other agents they interact with) then, in adapting these inter-agent relationships to maximise their own utility, they will necessarily alter them in a manner homologous with Hebbian learning. Multi-agent systems with adaptable relationships will thereby exhibit the same system-level behaviours as neural networks under Hebbian learning. For example, improved global efficiency in multi-agent systems can be explained by the inherent ability of associative memory to generalise by idealising stored patterns and/or creating new combinations of sub-patterns. Thus, distributed multi-agent systems can spontaneously exhibit adaptive global behaviours in the same sense, and by the same mechanism, as the organisational principles familiar in connectionist models of organismic learning.
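As a rough illustration of the Hebbian associative memory this abstract refers to, the sketch below uses the textbook outer-product rule on a small Hopfield-style network. It is not the multi-agent model from the paper: the function names (`hebbian_weights`, `recall`), the two stored patterns and the number of sweeps are all assumptions chosen for the example.

```python
def hebbian_weights(patterns):
    """Outer-product Hebbian rule: units that agree across stored
    patterns get a positive link, units that disagree a negative one."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w


def recall(w, state, sweeps=5):
    """Asynchronous updates: each unit aligns with its weighted input."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s


# Two hypothetical, mutually orthogonal patterns of +1/-1 units.
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
w = hebbian_weights(stored)

noisy = list(stored[0])
noisy[0] = -noisy[0]  # corrupt one unit of the first pattern
recovered = recall(w, noisy)
print(recovered == stored[0])
```

Each update uses only the state of the units a given unit is connected to, which is the sense in which the rule is fully distributed.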
Q-CP: Learning Action Values for Cooperative Planning
Research on multi-robot systems has demonstrated promising results in manifold applications and domains. Still, efficiently learning effective robot behaviors is very difficult, due to unstructured scenarios, high uncertainties, and large state dimensionality (e.g. hyper-redundant robots and groups of robots). To alleviate this problem, we present Q-CP, a cooperative model-based reinforcement learning algorithm, which exploits action values to both (1) guide the exploration of the state space and (2) generate effective policies. Specifically, we exploit Q-learning to attack the curse of dimensionality in the iterations of a Monte-Carlo Tree Search. We implement and evaluate Q-CP on different stochastic cooperative (general-sum) games: (1) a simple cooperative navigation problem among 3 robots, (2) a cooperation scenario between a pair of KUKA YouBots performing hand-overs, and (3) a coordination task between two mobile robots entering a door. The obtained results show the effectiveness of Q-CP in the chosen applications, where action values drive the exploration and reduce the computational demand of the planning process while achieving good performance.
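For readers unfamiliar with the action values mentioned in this abstract, the following is a minimal sketch of the standard tabular Q-learning update. It is not the Q-CP algorithm and includes no Monte-Carlo Tree Search; the state and action names, the learning rate `alpha` and the discount `gamma` are assumptions made for the example.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the
    bootstrapped target r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Pick the action with the highest current value estimate."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical two-action problem; unseen entries default to 0.0.
actions = ["left", "right"]
q = defaultdict(float)

first = q_update(q, "s0", "right", 1.0, "s1", actions)
second = q_update(q, "s0", "right", 1.0, "s1", actions)
best = greedy_action(q, "s0", actions)
print(first, second, best)
```

A planner could in principle use estimates like these to rank which actions to expand first, which is one plausible reading of how action values "guide the exploration"; the abstract itself does not give those details.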
Resilient Learning-Based Control for Synchronization of Passive Multi-Agent Systems under Attack
In this paper, we show synchronization for a group of output passive agents
that communicate with each other according to an underlying communication graph
to achieve a common goal. We propose a distributed event-triggered control
framework that will guarantee synchronization and considerably decrease the
required communication load on the band-limited network. We define a general
Byzantine attack on the event-triggered multi-agent network system and
characterize its negative effects on synchronization. The Byzantine agents are
capable of intelligently falsifying their data and manipulating the underlying
communication graph by altering their respective control feedback weights. We
introduce a decentralized detection framework and analyze its steady-state and
transient performances. We propose a way of identifying individual Byzantine
neighbors and a learning-based method of estimating the attack parameters.
Lastly, we propose learning-based control approaches to mitigate the negative
effects of the adversarial attack.