Genetic programming for the RoboCup Rescue Simulation System
The RoboCup Rescue Simulation System (RCRSS) is a dynamic multi-agent system simulating a large-scale urban disaster scenario. Teams of rescue agents are charged with minimizing civilian casualties and infrastructure damage while working against limitations on time, communication, and awareness. This thesis provides the first known attempt to apply Genetic Programming (GP) to the development of the behaviours needed to perform well in the RCRSS. Specifically, it studies the suitability of GP for evolving the operational behaviours required of each type of rescue agent in the RCRSS. The system developed is evaluated in terms of the consistency with which expected solutions are the target of convergence, as well as by comparison to previous competition results. The results indicate that GP is capable of converging to some forms of expected behaviour, but that additional evolution of strategizing behaviours is needed to become competitive. An enhancement to the standard GP algorithm is proposed and shown to simplify the initial search space, allowing evolution to proceed much more quickly. In addition, two forms of population are employed and compared in terms of their apparent effects on the evolution of control structures for intelligent rescue agents. The first is a single population in which each individual comprises three distinct trees, one controlling each of the three types of agent; the second is a set of three co-evolving subpopulations, one per agent type. Multiple populations of cooperating individuals appear to achieve higher proficiency in training, but testing on unseen instances raises the issue of overfitting.
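To make the multi-tree representation concrete, here is a minimal Python sketch. The function and terminal sets, agent-type names, and fitness stub are invented for illustration; the thesis's actual primitives are not shown here, and a real evaluation would score a full RCRSS simulation run:

```python
import random

FUNCTIONS = {"if_burning": 3, "if_blocked": 3, "seq": 2}   # name -> arity (illustrative)
TERMINALS = ["move_to_target", "extinguish", "rescue", "clear_road", "idle"]
AGENT_TYPES = ["ambulance", "police", "fire_brigade"]

def random_tree(depth=3):
    """Grow a random expression tree up to the given depth."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    fn = random.choice(list(FUNCTIONS))
    return [fn] + [random_tree(depth - 1) for _ in range(FUNCTIONS[fn])]

def random_individual():
    """One tree per agent type, evolved together as a single unit."""
    return {agent: random_tree() for agent in AGENT_TYPES}

def crossover(a, b):
    """Swap the whole tree for one agent type, so ambulance logic
    never migrates into the police controller."""
    child = dict(a)
    agent = random.choice(AGENT_TYPES)
    child[agent] = b[agent]
    return child

def fitness(ind):
    """Stub: the real fitness would run the simulator and score
    civilian survival and building damage."""
    return random.random()

def evolve(pop_size=50, generations=20):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(pop, key=fitness, reverse=True)[: pop_size // 5]
        pop = elite + [crossover(random.choice(elite), random.choice(elite))
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=fitness)
```

In the co-evolutionary variant, each agent type would instead live in its own subpopulation, with individuals from the three subpopulations grouped into teams for fitness evaluation.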
Multi-agent deep reinforcement learning for Robocup Rescue Simulator
Recent developments in the field of Artificial Intelligence have dealt with building winning strategies for video games, where agents learn how to complete their tasks successfully using Deep Reinforcement Learning (DRL). The first major breakthrough came when Mnih et al. [22] showed how a DRL algorithm, termed Deep Q-Networks (DQN), could be applied to a collection of Atari 2600 games to surpass the performance of all previous algorithms and reach a level comparable to a professional player. Their trained model received only raw pixels and the game score as inputs to learn successful policies for single agents, and was able to outperform professionals across a set of 49 Atari games. A few years later, focus shifted to training multiple agents using DRL, often known as multi-agent deep reinforcement learning (MADRL), for real-time strategy games. Brockman et al. [4] achieved superhuman performance in the game of Dota 2, which involves multi-agent collaboration, spatial and temporal reasoning, adversarial planning, and opponent modeling. Using the Proximal Policy Optimization (PPO) algorithm with an LSTM layer as the primary component of the neural network, their trained model was able to defeat the human champion team, Team OG, 2:0. Most recently, Vinyals et al. [38] showed how a MADRL model can achieve grandmaster level in the game of StarCraft II. In this work, we apply MADRL to the RoboCup Rescue Simulator (RCRS), part of the annual RoboCup Competition. RCRS is an open-source virtual environment that evaluates how effective multiple agents, such as ambulance teams, police officers, and fire brigades, are at rescuing civilians and extinguishing fires in a city that has just suffered an earthquake. RCRS is a challenging yet easy-to-use and customizable multi-agent scenario. To create an RCRS environment in which deep reinforcement learning algorithms can be tested, we developed RCRS-gym, an open-source OpenAI Gym environment. In this report, we focus on training multiple fire brigades to collaboratively accomplish the task of extinguishing fires in the city. The fire brigades were trained using two DRL algorithms, DQN and PPO, and their performance was compared with a greedy approach on two map settings, a "Small" map and a "Big" map, each with a different number of fire brigades and buildings. The agents were able to complete their fire-extinguishing task in both settings, demonstrating that RCRS is a suitable environment for developing deep reinforcement learning agents in a strategic multi-agent game scenario. DQN outperformed PPO in the "Small" map setting, while PPO outperformed a variant of DQN, H-DQN, in the "Big" map setting. However, neither algorithm was able to significantly outperform the greedy approach in either setting, which opens a promising avenue for future research.
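As a rough illustration of the per-agent training setup described above, the sketch below trains one learner per fire brigade against a Gym-style reset/step interface. The stub environment and its list-of-agents interface are invented, not the actual RCRS-gym API, and a tabular epsilon-greedy Q-learner stands in for the DQN and PPO networks used in the report:

```python
import random
from collections import defaultdict

class QLearner:
    """Tabular stand-in for a per-agent DQN/PPO learner."""
    def __init__(self, n_actions, alpha=0.1, gamma=0.99, epsilon=0.1):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.n_actions, self.alpha, self.gamma, self.epsilon = n_actions, alpha, gamma, epsilon

    def act(self, obs):
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)       # explore
        return max(range(self.n_actions), key=lambda a: self.q[obs][a])

    def learn(self, obs, action, reward, next_obs):
        target = reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])

class StubEnv:
    """Toy stand-in for a multi-agent Gym-style environment: one
    observation, action, and reward per fire brigade."""
    def __init__(self, n_agents):
        self.n_agents, self.t = n_agents, 0
    def reset(self):
        self.t = 0
        return [0] * self.n_agents
    def step(self, actions):
        self.t += 1
        rewards = [1.0 if a == 1 else 0.0 for a in actions]  # pretend action 1 fights fire
        return [self.t] * self.n_agents, rewards, self.t >= 10, {}

def train(env, n_agents, n_actions, episodes=100):
    agents = [QLearner(n_actions) for _ in range(n_agents)]  # one learner per brigade
    for _ in range(episodes):
        observations, done = env.reset(), False
        while not done:
            actions = [ag.act(o) for ag, o in zip(agents, observations)]
            next_obs, rewards, done, _ = env.step(actions)
            for ag, o, a, r, o2 in zip(agents, observations, actions, rewards, next_obs):
                ag.learn(o, a, r, o2)
            observations = next_obs
    return agents

agents = train(StubEnv(3), n_agents=3, n_actions=4)
```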
An historical based adaptation mechanism for BDI agents
One of the limitations of the BDI (Belief-Desire-Intention) model is the lack of any explicit mechanism within the architecture for learning. In particular, BDI agents do not possess the ability to adapt based on past experience. This matters in dynamic environments, which can change in ways that make previously successful methods for achieving goals inefficient or ineffective. We present a model in which learning, analogical reasoning, data pruning, and learner accuracy evaluation can be utilised by a BDI agent, and we verify this model experimentally using inductive and statistical learning. Intelligent agents are a new way of developing software applications. They are an amalgam of Artificial Intelligence (AI) and Software Engineering concepts, highly suited to domains that are inherently complex and dynamic. Agents are software entities that are autonomous, reactive, proactive, situated, and social. They are autonomous in that they are able to make decisions of their own volition. They are situated in some environment and are reactive to it, yet are also capable of proactive behaviour in which they actively pursue goals. They are capable of social behaviour, communicating with other agents. BDI (Belief-Desire-Intention) agents are one popular type of agent that supports complex behaviour in dynamic environments. Agent adaptation can be viewed as the process of changing the way in which an agent achieves its goals. We distinguish between 'reactive' or short-term adaptation, 'long-term' or historical adaptation, and 'very long term' or evolutionary adaptation. Short-term adaptation, an ability that current BDI agents already possess, involves reacting to changes in the environment and choosing alternative plans of action, including new plans if the current plan fails. 'Long-term' or historical adaptation entails the use of past cases during the reasoning process, enabling agents to avoid repeating past mistakes. 'Evolutionary adaptation' could involve the use of genetic programming or similar techniques to mutate plans, leading to altered behaviour. Our work aims to improve BDI agents by introducing a framework that allows them to alter their behaviour based on past experience, i.e. to learn.
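A minimal sketch of what 'long-term' or historical adaptation could look like in code, assuming an invented plan library in which plan selection is biased by recorded success rates; the paper's actual model also covers analogical reasoning, data pruning, and learner accuracy evaluation, none of which are shown here:

```python
from collections import defaultdict

class AdaptivePlanLibrary:
    """Hypothetical BDI plan library that remembers past outcomes."""
    def __init__(self):
        # (goal, context, plan) -> [successes, attempts]
        self.history = defaultdict(lambda: [0, 0])

    def select(self, goal, context, applicable_plans):
        """Prefer the applicable plan with the best observed success
        rate; unseen plans get an optimistic prior so they still get tried."""
        def rate(plan):
            s, n = self.history[(goal, context, plan)]
            return (s + 1) / (n + 2)          # Laplace-smoothed success rate
        return max(applicable_plans, key=rate)

    def record(self, goal, context, plan, succeeded):
        entry = self.history[(goal, context, plan)]
        entry[0] += int(succeeded)
        entry[1] += 1

# Illustrative usage: after a failure, the detour plan's rate drops,
# so future selections in the same context favour the alternative.
library = AdaptivePlanLibrary()
plan = library.select("reach_site", "road_blocked", ["drive_main_road", "detour"])
library.record("reach_site", "road_blocked", plan, succeeded=False)
```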
Artificial Intelligence Through the Eyes of the Public
Artificial Intelligence is becoming a popular field in computer science. In this report we explored its history, major accomplishments, and the visions of its creators. We looked at how Artificial Intelligence experts influence reporting and engineered a survey to gauge public opinion. We also examined expert predictions concerning the future of the field, as well as media coverage of its recent accomplishments. These results were then used to explore the links between expert opinion, public opinion, and media coverage.