Avoiding convergence in cooperative coevolution with novelty search
Cooperative coevolution is an approach for evolving solutions composed of coadapted components. Previous research
has shown, however, that cooperative coevolutionary algorithms are biased towards stability: they tend to converge
prematurely to equilibrium states, instead of converging to
optimal or near-optimal solutions. In single-population evolutionary algorithms, novelty search has been shown capable of avoiding premature convergence to local optima —
a pathology similar to convergence to equilibrium states.
In this study, we demonstrate how novelty search can be
applied to cooperative coevolution by proposing two new
algorithms. The first algorithm promotes behavioural novelty at the team level (NS-T), while the second promotes
novelty at the individual agent level (NS-I). The proposed
algorithms are evaluated in two popular multiagent tasks:
predator-prey pursuit and keepaway soccer. An analysis
of the explored collaboration space shows that (i) fitness-based evolution tends to quickly converge to poor equilibrium states, (ii) NS-I almost never reaches any equilibrium
state due to constant change in the individual populations,
while (iii) NS-T explores a variety of equilibrium states in
each evolutionary run and thus significantly outperforms
both fitness-based evolution and NS-I.
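The novelty computation at the heart of both NS-T and NS-I can be sketched as follows. This is a minimal illustration, not the paper's code: the function names, the 2-D behaviour characterisation, and the value of k are assumptions.

```python
import numpy as np

def novelty_score(behaviour, archive, k=3):
    """Novelty of a behaviour characterisation: the mean distance to its
    k nearest neighbours among previously seen behaviours."""
    dists = np.sort([np.linalg.norm(behaviour - other) for other in archive])
    return float(np.mean(dists[:k]))

# NS-T would score one characterisation per joint team behaviour;
# NS-I would score each agent's behaviour against its own population's archive.
archive = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
score = novelty_score(np.array([2.0, 2.0]), archive, k=2)
```

Selecting for a high `score` rather than task fitness is what keeps the search moving instead of settling into an equilibrium state.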
Novelty-driven cooperative coevolution
Cooperative coevolutionary algorithms (CCEAs) rely on multiple coevolving populations for the evolution of solutions composed of coadapted components. CCEAs enable, for instance, the evolution of cooperative multiagent systems composed of heterogeneous agents, where each agent is modelled as a component of the solution. Previous works have, however, shown that CCEAs are biased toward stability: the evolutionary process tends to converge prematurely to stable states instead of (near-)optimal solutions. In this study, we show how novelty search can be used to avoid the counterproductive attraction to stable states in coevolution. Novelty search is an evolutionary technique that drives evolution toward behavioural novelty and diversity rather than exclusively pursuing a static objective. We evaluate three novelty-based approaches that rely on, respectively, (1) the novelty of the team as a whole, (2) the novelty of the agents’ individual behaviour, and (3) the combination of the two. We compare the proposed approaches with traditional fitness-driven cooperative coevolution in three simulated multirobot tasks. Our results show that team-level novelty scoring is the most effective approach, significantly outperforming fitness-driven coevolution at multiple levels. Novelty-driven cooperative coevolution can substantially increase the potential of CCEAs while maintaining a computational complexity that scales well with the number of populations.
On the Combination of Game-Theoretic Learning and Multi Model Adaptive Filters
This paper casts coordination of a team of robots within the framework of game-theoretic learning algorithms. In particular, a novel variant of fictitious play is proposed, using multi-model adaptive filters as a method to estimate other players’ strategies. The proposed algorithm can be used as a coordination mechanism between players when they must make decisions under uncertainty. Each player chooses an action after taking into account the actions of the other players and also the uncertainty. Uncertainty can occur either in terms of noisy observations or various types of other players. In addition, in contrast to other game-theoretic and heuristic algorithms for distributed optimisation, it is not necessary to find the optimal parameters a priori. Various parameter values can be used initially as inputs to different models, so the resulting decisions are aggregate results across all the parameter values. Simulations are used to test the performance of the proposed methodology against other game-theoretic learning algorithms.
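The classical fictitious-play baseline that the proposed variant builds on can be sketched as follows. The two-action coordination game, the observation sequence, and all names are illustrative assumptions; the multi-model adaptive filter that replaces the empirical estimate is not shown.

```python
import numpy as np

def update_belief(counts, observed_action):
    """Classical fictitious-play estimate: the empirical frequency of an
    opponent's past actions is taken as their mixed strategy."""
    counts = counts.copy()
    counts[observed_action] += 1
    return counts, counts / counts.sum()

def best_response(payoff_matrix, opponent_strategy):
    """Action maximising expected payoff under the current belief."""
    return int(np.argmax(payoff_matrix @ opponent_strategy))

# Two-action coordination game: payoff 1 when both players match.
payoff = np.eye(2)
counts = np.ones(2)            # uniform prior over the opponent's actions
for observed in (1, 1, 0):     # hypothetical observation sequence
    counts, belief = update_belief(counts, observed)
action = best_response(payoff, belief)
```

In the paper's variant, `belief` would instead be produced by a bank of adaptive filters, each tuned with different parameter values, so no single optimal parameter set needs to be chosen a priori.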
A Multiagent Deep Reinforcement Learning Approach for Path Planning in Autonomous Surface Vehicles: The Ypacaraí Lake Patrolling Case
Article number 9330612. Autonomous surface vehicles (ASVs) excel at monitoring and measuring aquatic nutrients
due to their autonomy, mobility, and relatively low cost. When planning paths for such vehicles, the task
of patrolling with multiple agents is usually addressed with heuristic approaches, such as Reinforcement
Learning (RL), because of the complexity and high dimensionality of the problem. Not only do efficient paths
have to be designed, but addressing disturbances in movement or the battery’s performance is mandatory.
For this multiagent patrolling task, the proposed approach is based on a centralized Convolutional Deep
Q-Network, designed with a final independent dense layer for every agent to deal with scalability, with the
assumption that every agent has the same properties and capabilities. For this purpose, a tailored
reward function is created which penalizes illegal actions (such as collisions) and rewards visiting idle
cells (cells that remain unvisited for a long time). A comparison with various multiagent Reinforcement
Learning (MARL) algorithms has been done (Independent Q-Learning, Dueling Q-Network and multiagent
Double Deep Q-Learning) in a case-study scenario like the Ypacaraí lake in Asunción (Paraguay). The
trained multiagent policy yields an average improvement of 15% over lawn-mower trajectories and a 6% improvement over IDQL for the case study considered. In terms of training speed, the proposed approach runs three times faster than the independent algorithm.
Funding: Ministerio de Ciencia, Innovación y Universidades (España) RTI2018-098964-B-I00; Junta de Andalucía (España) PY18-RE000
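The shape of the tailored reward described above can be sketched as follows. The penalty value, the idleness weight, and the function and variable names are illustrative assumptions, not the paper's actual constants.

```python
def patrol_reward(idleness, action_is_illegal, visited_cell,
                  collision_penalty=-1.0, idleness_weight=0.1):
    """Illegal actions (e.g. collisions) are penalised; visiting a cell
    pays in proportion to how long it has gone unvisited."""
    if action_is_illegal:
        return collision_penalty
    return idleness_weight * idleness[visited_cell]

# Time steps since each cell was last visited (toy two-cell map).
idleness = {(0, 0): 10.0, (0, 1): 0.0}
r_crash = patrol_reward(idleness, True, (0, 0))   # collision: flat penalty
r_visit = patrol_reward(idleness, False, (0, 0))  # long-idle cell: rewarded
```

Rewarding idleness this way pushes every agent toward the least-recently-covered parts of the lake, which is what makes the single shared network suitable for patrolling.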
Multiagent Learning Through Indirect Encoding
Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. In this dissertation, a new approach to multiagent learning that seeks to address these issues is presented. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than individual agents. The main idea is that an agent’s location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach.
Interestingly, because the teams in multiagent HyperNEAT are represented as patterns, they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus, the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding.
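The idea of sampling arbitrarily many policies from one pattern can be illustrated with a toy stand-in for the evolved pattern-generating network. The sine function and the position scaling below are purely illustrative assumptions; multiagent HyperNEAT evolves this mapping rather than fixing it.

```python
import math

def policy_weight(position, connection_id):
    """Toy stand-in for the evolved pattern: a smooth function of an agent's
    position in the canonical team layout, so nearby agents receive similar
    but not identical controller weights (the policy geometry)."""
    return math.sin(math.pi * position + 0.5 * connection_id)

# The same pattern can be sampled at any team size without further learning:
team_of_3 = [policy_weight(i / 2, 0) for i in range(3)]
team_of_5 = [policy_weight(i / 4, 0) for i in range(5)]
```

Because each agent's weights are read off a continuous pattern over the team layout, growing the team just means querying the pattern at more positions.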
Computational intelligence approaches to robotics, automation, and control [Volume guest editors]
No abstract available