30 research outputs found

    Multiagent Learning Through Indirect Encoding

    Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. This dissertation presents a new approach to multiagent learning that seeks to address these issues. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than individual agents. The main idea is that an agent's location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, a relationship called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach. 
Interestingly, because the teams in multiagent HyperNEAT are represented as patterns, they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding.
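The idea of sampling a team of any size from a single pattern can be illustrated with a minimal sketch. It assumes a fixed, hand-written function `pattern` standing in for the evolved network that multiagent HyperNEAT would actually produce; `pattern`, `spread`, and `sample_team` are hypothetical names and are not taken from the dissertation.

```python
import math


def spread(n):
    """Place n points evenly on [-1, 1] (a single point sits at 0)."""
    if n == 1:
        return [0.0]
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]


def pattern(agent_x, in_x, out_x):
    """Hypothetical stand-in for an evolved pattern-generating network.

    Maps an agent's position in the team layout, plus the coordinates of
    a connection's two endpoints, to a connection weight.
    """
    return math.sin(2.0 * agent_x) * in_x + math.cos(agent_x) * out_x


def sample_team(n_agents, n_in=3, n_out=2):
    """Return one weight matrix per agent, all drawn from the same pattern."""
    in_coords = spread(n_in)
    out_coords = spread(n_out)
    return [
        [[pattern(p, xi, xo) for xo in out_coords] for xi in in_coords]
        for p in spread(n_agents)
    ]


small_team = sample_team(3)
large_team = sample_team(5)  # larger team, no retraining in this sketch
```

Because every policy is a sample of one function of position, an agent placed at the same position receives the same weights regardless of team size; here the middle agent of the three-agent team and of the five-agent team both sit at position 0.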

    An intelligent peer-to-peer multi-agent system for collaborative management of bibliographic databases

    This paper describes the design of a peer-to-peer system for collaborative management of distributed bibliographical databases. The system has two goals. First, it helps users manage their local bibliographical databases. Second, it lets like-minded user groups exchange bibliographical data in an implicit and intelligent manner. Each user is assisted by a personal agent that helps with filling in bibliographical records, verifying the correctness of the information entered and, more importantly, recommending relevant bibliographical references. To obtain relevant recommendations, the personal agent collaborates with its peers. Each agent applies a case-based reasoning approach to provide peers with the recommendations they request. The paper focuses mainly on describing the recommendation computation approach.

    Towards Rapid Multi-robot Learning from Demonstration at the RoboCup Competition

    We describe our previous and current efforts towards achieving an unusual personal RoboCup goal: to train a full team of robots directly through demonstration, on the field of play at the RoboCup venue, how to collaboratively play soccer, and then use this trained team in the competition itself. Using our method, HiTAB, we can train teams of collaborative agents via demonstration to perform nontrivial joint behaviors in the form of hierarchical finite-state automata. We discuss HiTAB, our previous efforts in using it in RoboCup 2011 and 2012, recent experimental work, and our current efforts for 2014, then suggest a new RoboCup Technical Challenge problem in learning from demonstration. Imagine that you are at an unfamiliar disaster site with a team of robots, and are faced with a previously unseen task for them to do. The robots have only rudimentary but useful utility behaviors implemented. You are not a programmer. Without coding them, you have only a few hours to get your robots doing useful collaborative work in this new environment. How would you do this?

    Inference-Based Deterministic Messaging For Multi-Agent Communication

    Communication is essential for coordination among humans and animals. Therefore, with the introduction of intelligent agents into the world, agent-to-agent and agent-to-human communication becomes necessary. In this paper, we first study learning in matrix-based signaling games to empirically show that decentralized methods can converge to a suboptimal policy. We then propose a modification to the messaging policy, in which the sender deterministically chooses the best message that helps the receiver to infer the sender's observation. Using this modification, we see, empirically, that the agents converge to the optimal policy in nearly all the runs. We then apply this method to a partially observable gridworld environment which requires cooperation between two agents and show that, with appropriate approximation methods, the proposed sender modification can enhance existing decentralized training methods for more complex domains as well. Comment: 13 pages, 10 figures. Accepted at the 35th AAAI Conference on Artificial Intelligence, 202
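A rough sketch of the sender modification follows, under the assumption that the sender has access to a table `receiver_belief[m][o]` giving the probability the receiver assigns to observation `o` after hearing message `m`. The abstract does not say how this inference is actually computed, so the table and the function name `best_message` are illustrative only.

```python
def best_message(receiver_belief, observation):
    """Deterministically pick the message that best reveals the observation.

    receiver_belief[m][o] is assumed to be the receiver's probability of
    observation o given message m. Ties go to the lowest message index.
    """
    scores = [row[observation] for row in receiver_belief]
    return max(range(len(scores)), key=scores.__getitem__)


# Two messages, two observations (values chosen for illustration).
belief = [
    [0.9, 0.1],  # after message 0 the receiver mostly expects observation 0
    [0.2, 0.8],  # after message 1 the receiver mostly expects observation 1
]
message_for_obs0 = best_message(belief, 0)
message_for_obs1 = best_message(belief, 1)
```

The choice is an argmax instead of a sample from a stochastic messaging policy, which is the sense in which the sender acts deterministically in this sketch.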

    Coherent Behavior in Multiagent System Based on Reinforcement Learning

    This paper addresses collective reinforcement learning. We introduce and describe a simple new approach to collective reinforcement learning named Related Temporal Difference. This approach can support coherent agent behavior in a distributed and structurally complicated multi-agent system. We construct a decentralized multi-agent system that describes the behavior of a multi-joint robot. The experiments show that a system of local learning procedures in a complex system can learn much faster than a single learner applied to the system as a whole.

    Numerical solution of hydrodynamic problems using neural networks

    The use of a neural network approach in finding the solution of a hydrodynamic problem by numerical methods is proposed. Two areas of possible application of neural network technology are considered: the choice of an initial approximation to the solution and the search for the next approximation. To select the initial approximation, it is proposed to solve a combined classification and regression task based on an existing database of distribution samples and an existing database of space-transformation patterns. An architecture for a combined neural network that solves this problem is proposed. For finding the transformation of the problem space, the use of a radial-basis neural network is proposed. The mathematical apparatus for tuning a network with an arbitrary number of output-layer neurons is proposed. It is proposed to take fewer hidden-layer neurons than there are training transformation samples and to find an approximate solution. The task of finding the optimal weight values in this case can be considered as minimizing the target function that describes the network output error. It is proposed to construct a neural network for finding the next approximation of the solution of a hydrodynamic problem based on a generalization of a principle previously proposed for solving one-dimensional differential equations. Finding the next approximation when the task is solved on a multiprocessor system is presented as a game with multiple players, each of which must find a compromise between local and global search goals. It is proposed to replace one common neural network with a set of neural networks that interact with each other. The proposed approaches can reduce the amount of computation needed to find a solution.
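The step of using fewer hidden neurons than training samples and treating the output weights as the solution of an error-minimization problem can be sketched as follows. This is a one-dimensional toy under stated assumptions: Gaussian basis functions, hand-picked centers, and plain gradient descent on the mean squared error. None of these choices come from the paper, which does not specify its tuning procedure in the abstract.

```python
import math


def rbf_features(x, centers, width=1.0):
    """Gaussian radial-basis activations of the hidden layer for input x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def predict(weights, x, centers):
    """Network output: weighted sum of the hidden activations."""
    return sum(w * f for w, f in zip(weights, rbf_features(x, centers)))


def mse(weights, xs, ys, centers):
    """Mean squared output error, the target function being minimized."""
    return sum((predict(weights, x, centers) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def fit_output_weights(xs, ys, centers, lr=0.1, steps=500):
    """Gradient descent on the output weights only (centers stay fixed)."""
    weights = [0.0] * len(centers)
    for _ in range(steps):
        grads = [0.0] * len(centers)
        for x, y in zip(xs, ys):
            feats = rbf_features(x, centers)
            err = sum(w * f for w, f in zip(weights, feats)) - y
            for k, f in enumerate(feats):
                grads[k] += 2.0 * err * f / len(xs)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Six samples of y = x^2 but only three hidden neurons, so the fit is
# approximate by construction.
xs = [0.5 * i for i in range(6)]
ys = [x * x for x in xs]
centers = [0.0, 1.25, 2.5]
weights = fit_output_weights(xs, ys, centers)
```

With fewer neurons than samples the network cannot in general reproduce every sample exactly, which is why the weights are found by minimizing the error instead of by solving an exact interpolation system.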

    Theoretical advantages of lenient learners : an evolutionary game theoretic perspective

    This paper presents the dynamics of multiple learning agents from an evolutionary game theoretic perspective. We provide replicator dynamics models for cooperative coevolutionary algorithms and for traditional multiagent Q-learning, and we extend these differential equations to account for lenient learners: agents that forgive possible mismatched teammate actions that resulted in low rewards. We use these extended formal models to study the convergence guarantees for these algorithms, and also to visualize the basins of attraction to optimal and suboptimal solutions in two benchmark coordination problems. The paper demonstrates that lenience provides learners with more accurate information about the benefits of performing their actions, resulting in higher likelihood of convergence to the globally optimal solution. In addition, the analysis indicates that the choice of learning algorithm has an insignificant impact on the overall performance of multiagent learning algorithms; rather, the performance of these algorithms depends primarily on the level of lenience that the agents exhibit to one another. Finally, the research herein supports the strength and generality of evolutionary game theory as a backbone for multiagent learning.
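Lenience can be illustrated with a small stateless value update. The sketch assumes one common formulation, in which an agent collects several rewards for the same action and updates only towards the highest of them. The paper itself works with replicator-dynamics equations, not with this code, and the function names and reward values below are illustrative.

```python
def lenient_update(q_value, rewards, alpha=0.1):
    """Update towards the best of several rewards seen for one action.

    Lower rewards, possibly caused by a teammate's mismatched action,
    are ignored (forgiven).
    """
    target = max(rewards)
    return q_value + alpha * (target - q_value)


def standard_update(q_value, rewards, alpha=0.1):
    """Non-lenient baseline: apply every reward in turn."""
    for reward in rewards:
        q_value += alpha * (reward - q_value)
    return q_value


# Rewards for one action across three joint plays; the -30 stands for a
# miscoordination penalty.
rewards = [-30.0, 11.0, 0.0]
lenient_q = lenient_update(0.0, rewards, alpha=0.5)
standard_q = standard_update(0.0, rewards, alpha=0.5)
```

In this example the lenient estimate ends up positive while the standard one ends up negative, which shows how a single penalty from a mismatched teammate can push a non-lenient learner away from an action that is part of the best joint solution.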

    Integration of heterogeneous hypotheses in multiagent learning

    Integrated Master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201