Adaptation strategies for self-organising electronic institutions
For large-scale systems and networks embedded in highly dynamic, volatile, and unpredictable
environments, self-adaptive and self-organising (SASO) algorithms have been proposed as
solutions to the problems introduced by this dynamism, volatility, and unpredictability. In open
systems it cannot be guaranteed that an adaptive mechanism that works well in isolation will
work well — or at all — in combination with others.
In complexity science the emergence of systemic, or macro-level, properties from individual, or
micro-level, interactions is addressed through mathematical modelling and simulation. Intermediate
meso-level structuration has been proposed as a method for controlling the macro-level
system outcomes, through the study of how the application of certain policies, or norms, can
affect adaptation and organisation at various levels of the system.
In this context, this thesis describes the specification and implementation of an adaptive affective
anticipatory agent model for the individual micro level, and a self-organising distributed institutional
consensus algorithm for the group meso level. Situated in an intelligent transportation
system, the agent model represents an adaptive decision-making system for safe driving, and the
consensus algorithm allows the vehicles to self-organise agreement on values necessary for the
maintenance of “platoons” of vehicles travelling down a motorway. Experiments were performed
using each mechanism in isolation to demonstrate its effectiveness.
A computational testbed has been built on a multi-agent simulator to examine the interaction
between the two given adaptation mechanisms. Experiments involving various differing combinations
of the mechanisms are performed, and the effect of these combinations on the macro-level
system properties is measured. Both beneficial and pernicious interactions are observed; the
experimental results are analysed in an attempt to understand these interactions.
The analysis is performed through a formalism which enables the causes for the various interactions
to be understood. The formalism takes into account the methods by which the SASO
mechanisms are composed, at what level of the system they operate, on which parts of the
system they operate, and how they interact with the population of the system. It is suggested
that this formalism could serve as the starting point for an analytic method and experimental
tools for a future systems theory of adaptation.
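The thesis does not specify its institutional consensus algorithm in this abstract; as an illustrative stand-in, the kind of agreement the platooning vehicles need (a shared value such as a cruising speed) can be sketched with a generic distributed averaging consensus, in which each vehicle repeatedly nudges its proposal toward the average of its neighbours'. All names and parameters here are assumptions for illustration.

```python
# Illustrative sketch: vehicles in a platoon iteratively average their
# proposed cruising speeds with their neighbours' until they agree.
# This is generic distributed averaging consensus, NOT the thesis's
# specific self-organising institutional algorithm.

def platoon_consensus(speeds, neighbours, rounds=50, alpha=0.5):
    """speeds: list of each vehicle's proposed speed (m/s).
    neighbours: dict mapping vehicle index -> list of neighbour indices.
    alpha: step size toward the neighbourhood average."""
    s = list(speeds)
    for _ in range(rounds):
        nxt = []
        for i, v in enumerate(s):
            if neighbours[i]:
                avg = sum(s[j] for j in neighbours[i]) / len(neighbours[i])
                nxt.append(v + alpha * (avg - v))
            else:
                nxt.append(v)  # isolated vehicle keeps its proposal
        s = nxt
    return s

# A three-vehicle chain 0-1-2 travelling down a motorway
result = platoon_consensus([28.0, 31.0, 25.0],
                           {0: [1], 1: [0, 2], 2: [1]})
```

With a connected neighbourhood graph all vehicles converge to a single shared speed (here a degree-weighted average, since the middle vehicle has two neighbours).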
Signed Networks, Triadic Interactions and the Evolution of Cooperation
We outline a model to study the evolution of cooperation in a population of
agents playing the prisoner's dilemma in signed networks. We highlight that if
only dyadic interactions are taken into account, cooperation never evolves.
However, when triadic considerations are introduced, a window of opportunity
opens for cooperation to emerge as a stable behaviour.
Comment: In Proceedings Wivace 2013, arXiv:1309.712
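The abstract does not give the paper's exact payoff scheme. As one hedged illustration of what "triadic considerations" on a signed network could look like, the dyadic prisoner's-dilemma payoff can be augmented with a bonus for agents embedded in structurally balanced triads (the product of the three edge signs is positive). The payoff values, `beta` weight, and balance bonus are all assumptions for illustration.

```python
# Illustrative sketch (not the paper's model): each edge of a signed
# network carries a prisoner's-dilemma interaction, and a triadic term
# rewards agents sitting in structurally balanced triads.

import itertools

# Standard PD payoffs for the row player: (my_move, their_move) -> payoff
PD = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def payoff(i, moves, signs, beta=2.0):
    """Dyadic PD payoff plus a triadic structural-balance bonus.
    moves: dict node -> 'C'/'D'; signs: dict frozenset({a, b}) -> +1/-1."""
    nodes = set(moves)
    dyadic = sum(PD[(moves[i], moves[j])]
                 for j in nodes - {i} if frozenset({i, j}) in signs)
    triadic = 0.0
    for j, k in itertools.combinations(nodes - {i}, 2):
        edges = [frozenset({i, j}), frozenset({j, k}), frozenset({i, k})]
        if all(e in signs for e in edges):
            balance = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            triadic += beta * balance  # positive if the triad is balanced
    return dyadic + triadic

# An all-positive, all-cooperating triangle
moves = {0: 'C', 1: 'C', 2: 'C'}
signs = {frozenset(e): +1 for e in [(0, 1), (1, 2), (0, 2)]}
p = payoff(0, moves, signs)
```

Under purely dyadic payoffs defection dominates as usual; a triadic term of this kind is the sort of addition that can make mutual cooperation in balanced triads competitive.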
Modelling Social Structures and Hierarchies in Language Evolution
Language evolution might have preferred certain prior social configurations
over others. Experiments conducted with models of different social structures
(varying subgroup interactions and the role of a dominant interlocutor) suggest
that having isolated agent groups rather than interconnected agents is more
advantageous for the emergence of a social communication system. Distinctive
groups that are closely connected by communication yield systems less like
natural language than fully isolated groups inhabiting the same world.
Furthermore, the addition of a dominant male who is asymmetrically favoured as
a hearer, but equally likely to be a speaker, has no positive influence on the
disjoint groups.
Comment: 14 pages, 3 figures, 1 table. In proceedings of AI-2010, The
Thirtieth SGAI International Conference on Innovative Techniques and
Applications of Artificial Intelligence, Cambridge, England, UK, 14-16
December 201
Modeling Mutual Influence in Multi-Agent Reinforcement Learning
In multi-agent systems (MAS), agents rarely act in isolation but tend to achieve their goals through interactions with other agents. To achieve their ultimate goals, individual agents should actively evaluate the impact of other agents' behaviors on themselves before deciding which actions to take. These impacts are reciprocal, and it is of great interest to model the mutual influence agents exert on one another as they observe the environment or take actions in it. In this thesis, assuming that the agents are aware of each other's existence and their potential impact on themselves, I develop novel multi-agent reinforcement learning (MARL) methods that measure the mutual influence between agents to shape learning.

The first part of this thesis outlines the framework of recursive reasoning in deep multi-agent reinforcement learning. I hypothesize that it is beneficial for each agent to consider how other agents react to its behavior. I start from Probabilistic Recursive Reasoning (PR2), which uses level-1 reasoning and adopts variational Bayes methods to approximate the opponents' conditional policies. Each agent shapes its individual Q-value by marginalizing the conditional policies out of the joint Q-value and finding the best response to improve its policy. I further extend PR2 to Generalized Recursive Reasoning (GR2) with different hierarchical levels of rationality. GR2 enables agents to possess various levels of thinking ability, thereby allowing higher-level agents to best respond to less sophisticated learners. The first part of the thesis shows that reducing the joint Q-value to an individual Q-value via explicit recursive reasoning benefits learning.

In the second part of the thesis, in reverse, I measure the mutual influence by approximating the joint Q-value from the individual Q-values. I establish Q-DPP, an extension of the Determinantal Point Process (DPP) with partition constraints, and apply it to multi-agent learning as a function approximator for the centralized value function. An attractive property of Q-DPP is that, at its optimum, it offers a natural factorization of the centralized value function, representing both quality (maximizing reward) and diversity (different behaviors).

In the third part of the thesis, I depart from action-level mutual influence and build a policy-space meta-game to analyze the relationships between agents' adaptive policies. I present a Multi-Agent Trust Region Learning (MATRL) algorithm that augments single-agent trust region policy optimization with a weak stable fixed point approximated by the policy-space meta-game. The algorithm aims to find a game-theoretic mechanism to adjust the policy optimization steps, forcing the learning of all agents toward the stable point.
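The level-1 marginalization step described for PR2 can be sketched concretely: an agent forms its individual Q-value by weighting the joint Q-value with an approximate opponent conditional policy and summing out the opponent's action. The array shapes, names, and toy numbers below are assumptions for illustration, not the thesis's implementation.

```python
import numpy as np

# Illustrative sketch of level-1 marginalization in the spirit of PR2:
# the individual Q-value Q_i(s, a_i) is obtained by marginalizing an
# approximate opponent conditional policy rho(a_opp | s, a_self) out
# of the joint Q-value Q(s, a_self, a_opp), for a fixed state s.

def individual_q(joint_q, rho):
    """joint_q: array [n_self_actions, n_opp_actions] for one state.
    rho: opponent conditional policy, same shape, rows sum to 1."""
    return (joint_q * rho).sum(axis=1)

# Toy 2x2 game: joint values and an assumed opponent model
joint_q = np.array([[3.0, 0.0],
                    [5.0, 1.0]])
rho = np.array([[0.8, 0.2],    # opponent response if I take action 0
                [0.3, 0.7]])   # opponent response if I take action 1
q_i = individual_q(joint_q, rho)
best_response = int(np.argmax(q_i))  # act greedily on the marginal
```

Note how the opponent model changes the decision: action 1 has the higher raw joint values, but once the opponent's predicted reaction is marginalized in, action 0 becomes the best response.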
A model of multi-agent consensus for vague and uncertain beliefs
Consensus formation is investigated for multi-agent systems in which agents’ beliefs are both vague and uncertain. Vagueness is represented by a third truth state meaning borderline. This is combined with a probabilistic model of uncertainty. A belief combination operator is then proposed, which exploits borderline truth values to enable agents with conflicting beliefs to reach a compromise. A number of simulation experiments are carried out, in which agents apply this operator in pairwise interactions, under the bounded confidence restriction that the two agents’ beliefs must be sufficiently consistent with each other before agreement can be reached.

As well as studying the consensus operator in isolation, we also investigate scenarios in which agents are influenced either directly or indirectly by the state of the world. For the former, we conduct simulations that combine consensus formation with belief updating based on evidence. For the latter, we investigate the effect of assuming that the closer an agent’s beliefs are to the truth the more visible they are in the consensus building process.

In all cases, applying the consensus operators results in the population converging to a single shared belief that is both crisp and certain. Furthermore, simulations that combine consensus formation with evidential updating converge more quickly to a shared opinion, which is closer to the actual state of the world than those in which beliefs are only changed as a result of directly receiving new evidence. Finally, if agent interactions are guided by belief quality measured as similarity to the true state of the world, then applying the consensus operator alone results in the population converging to a high-quality shared belief.
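A combination operator of the kind described above can be sketched as follows: truth values are 0 (false), 0.5 (borderline), and 1 (true); a direct conflict compromises to borderline, a borderline value defers to a committed one, and a bounded-confidence check blocks combination when the two beliefs conflict too often. The exact operator, consistency measure, and threshold in the paper may differ; everything here is an illustrative assumption.

```python
# Illustrative sketch of a three-valued belief combination operator
# with a bounded-confidence restriction (not the paper's exact rules).

def combine(v1, v2):
    if {v1, v2} == {0.0, 1.0}:   # direct conflict -> borderline compromise
        return 0.5
    if v1 == 0.5:                # borderline defers to the other value
        return v2
    return v1                    # agreement, or v2 is borderline

def inconsistency(b1, b2):
    """Fraction of propositions on which the two beliefs directly conflict."""
    return sum({x, y} == {0.0, 1.0} for x, y in zip(b1, b2)) / len(b1)

def consensus(b1, b2, threshold=0.4):
    """Pairwise combination, allowed only if the beliefs are consistent enough."""
    if inconsistency(b1, b2) > threshold:
        return b1, b2            # too inconsistent: no agreement reached
    merged = [combine(x, y) for x, y in zip(b1, b2)]
    return merged, merged        # both agents adopt the compromise

a = [1.0, 0.5, 0.0, 1.0]
b = [1.0, 1.0, 0.5, 0.0]
new_a, new_b = consensus(a, b)
```

Here only one of the four propositions is in direct conflict (0.25, below the assumed 0.4 threshold), so the agents merge: agreed values survive, borderline values resolve toward the committed agent, and the conflicting proposition becomes borderline.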
Rule-based Modelling and Tunable Resolution
We investigate the use of an extension of rule-based modelling for cellular
signalling to create a structured space of model variants. This enables the
incremental development of rule sets that start from simple mechanisms and
which, by a gradual increase in agent and rule resolution, evolve into more
detailed descriptions.