1,081 research outputs found

Aprendizagem de coordenação em sistemas multi-agente (Learning coordination in multi-agent systems)

Author: Simões David João Apolinário
Publication date: 07/05/2020

The ability of an agent to coordinate with others within a system is a valuable property in multi-agent systems. Agents either cooperate as a team to accomplish a common goal, or adapt to opponents to complete different goals without being exploited. Research has shown that learning multi-agent coordination is significantly more complex than learning policies in single-agent environments, and requires a variety of techniques to deal with the properties of a system where agents learn concurrently. This thesis aims to determine how machine learning can be used to achieve coordination within a multi-agent system. It asks what techniques can be used to tackle the increased complexity of such systems and their credit assignment challenges, how to achieve coordination, and how to use communication to improve the behavior of a team. Many algorithms for competitive environments are tabular, which prevents their use with high-dimensional or continuous state spaces, and may be biased against specific equilibrium strategies. This thesis proposes multiple deep learning extensions for competitive environments, allowing algorithms to reach equilibrium strategies in complex and partially observable environments while relying only on local information. A tabular algorithm is also extended with a new update rule that eliminates its bias against deterministic strategies. Current state-of-the-art approaches for cooperative environments rely on deep learning to handle the environment's complexity and benefit from a centralized learning phase. Solutions that incorporate communication between agents often prevent agents from being executed in a distributed manner. This thesis proposes a multi-agent algorithm where agents learn communication protocols to compensate for local partial observability, and remain independently executed. A centralized learning phase can incorporate additional environment information to increase the robustness and speed with which a team converges to successful policies. The algorithm outperforms current state-of-the-art approaches in a wide variety of multi-agent environments. A permutation-invariant network architecture is also proposed to increase the scalability of the algorithm to large team sizes. Further research is needed to identify how the techniques proposed in this thesis, for cooperative and competitive environments, can be used in unison for mixed environments, and whether they are adequate for general artificial intelligence.

Financial support from FCT and FSE under the III Quadro Comunitário de Apoio (Third Community Support Framework). Doctoral Programme in Informatics.
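The abstract mentions a permutation-invariant network architecture but does not describe it. As a rough illustration of the general idea only (not the thesis's architecture), the sketch below encodes every teammate's observation with the same shared function and then mean-pools the results, so the output cannot depend on the order in which teammates are listed. The names `embed` and `pooled_team_features`, the weights, and the observations are all hypothetical.

```python
import math

def embed(obs, weights):
    """Shared per-agent encoder: one linear layer followed by ReLU."""
    return [max(0.0, math.fsum(w * x for w, x in zip(row, obs))) for row in weights]

def pooled_team_features(observations, weights):
    """Mean-pool the per-agent embeddings so that agent order cannot matter."""
    embedded = [embed(obs, weights) for obs in observations]
    n = len(embedded)
    # zip(*embedded) groups the k-th feature of every agent together.
    return [math.fsum(column) / n for column in zip(*embedded)]

# Hypothetical fixed weights (3 output features, 2 inputs) and a 3-agent team.
weights = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
team = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]

features = pooled_team_features(team, weights)
shuffled = pooled_team_features(list(reversed(team)), weights)
```

Because pooling collapses the team dimension, the same function accepts teams of any size, which is one common reason such architectures are considered when scaling to larger teams.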

Repositório Institucional da Universidade de Aveiro

Penerapan Algoritme Basic Theta* Pada Game Hexaconquest (Application of the Basic Theta* Algorithm to the Game Hexaconquest)

Author: Firmansyah Ilman Naafian
Publication date: 17/01/2018

Nowadays, almost all turn-based strategy games offer a single-player mode among the game types available to players. If there is only one human player, the other players must be controlled by the computer. This is where AI (artificial intelligence) comes in. Artificial intelligence is used in games so that human players feel as if they are playing against a human, allowing them to practice against the computer before facing other human players. The algorithm most often used by game AI to find the best path to a destination is A*. However, A* is not always the best solution for pathfinding. The author applies the Basic Theta* algorithm to a turn-based strategy game called Hexaconquest. The Basic Theta* pathfinding algorithm is compared with the game's baseline pathfinding algorithm, A*. The performance of the two algorithms is compared in terms of frames per second, execution time, and the total cost of the nodes traversed by the algorithm's agent. The results show that Basic Theta* finds a shorter route for each movement of its agent, but its performance is still worse than that of A*. Basic Theta* provides the solution with the shortest distance, whereas A* provides a solution faster and with less overhead.
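The trade-off described in this abstract can be sketched on a small square grid (the game itself uses hexagons, and its code is not shown in the listing, so everything below is an assumption). The hypothetical `find_path` function runs plain A* when `any_angle` is false and a simplified Basic Theta* when it is true: before linking a neighbor to the current node, it checks whether the current node's parent can see that neighbor and, if so, links to the parent directly. The line-of-sight test here is an approximate Bresenham walk over cells, and diagonal corner-cutting is not prevented.

```python
import heapq
import math

def line_of_sight(a, b, blocked):
    """Approximate visibility test: walk a Bresenham line between cells a and b."""
    (x0, y0), (x1, y1) = a, b
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 >= x0 else -1
    sy = 1 if y1 >= y0 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        if (x, y) in blocked:
            return False
        if (x, y) == (x1, y1):
            return True
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy

def find_path(width, height, blocked, start, goal, any_angle):
    """A* on an 8-connected grid; with any_angle=True, a simplified Basic Theta*."""
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    g = {start: 0.0}
    parent = {start: start}
    open_heap = [(dist(start, goal), start)]
    closed = set()
    while open_heap:
        _, s = heapq.heappop(open_heap)
        if s == goal:
            break
        if s in closed:
            continue
        closed.add(s)
        for ddx in (-1, 0, 1):
            for ddy in (-1, 0, 1):
                n = (s[0] + ddx, s[1] + ddy)
                if n == s or n in blocked or n in closed:
                    continue
                if not (0 <= n[0] < width and 0 <= n[1] < height):
                    continue
                p = parent[s]
                # Theta* step: skip s if its parent sees n (extra visibility check per edge).
                if any_angle and line_of_sight(p, n, blocked):
                    cand, via = g[p] + dist(p, n), p
                else:
                    cand, via = g[s] + dist(s, n), s
                if cand < g.get(n, math.inf):
                    g[n], parent[n] = cand, via
                    heapq.heappush(open_heap, (cand + dist(n, goal), n))
    if goal not in g:
        return None, math.inf
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], g[goal]

astar_path, astar_len = find_path(5, 3, set(), (0, 0), (4, 2), any_angle=False)
theta_path, theta_len = find_path(5, 3, set(), (0, 0), (4, 2), any_angle=True)
```

On this open grid the A* route is forced onto grid steps, while the any-angle variant can connect start and goal with a single straight segment. The price is one line-of-sight check per relaxed edge, which is consistent with the slower run time the abstract reports.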

bkg

Computação evolutiva aplicada a jogos no estilo presa-predador (Evolutionary computation applied to predator-prey style games)

Author: Silva Rafael Melo
Publication venue: Ciência da Computação
Publication date: 18/12/2019

UFU - Universidade Federal de Uberlândia. Undergraduate final project (Trabalho de Conclusão de Curso). The predator-prey model represents a classic problem in the literature. This model is characterized by the objectives of capture and escape, and can be found in many video games, from the oldest to the most modern. Players today want games to be more intelligent and adaptable. In this context, artificial intelligence has been widely applied, and the use of evolutionary computation, and especially of genetic algorithms, has grown. This work develops a genetic algorithm and a game with predator-prey modeling, with a food-chain relationship between the characters. The goal of the work is for the genetic algorithm, applied to improve the attributes of the non-player characters, to produce adaptive behavior in these characters during the game.
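As a loose illustration of how a genetic algorithm might tune non-player character attributes, the sketch below evolves triples of integer attributes with selection, one-point crossover, and mutation. The abstract does not give the encoding, the fitness function, or the parameters, so all of them are invented here: `fitness` simply scores closeness to a made-up target profile, whereas in a game the score would presumably come from how the character performed during play.

```python
import random

def fitness(attrs):
    """Hypothetical score: reward attributes close to a made-up target profile."""
    target = (7, 3, 5)  # assumed ideal (speed, vision, stamina); illustrative only
    return -sum((a - t) ** 2 for a, t in zip(attrs, target))

def evolve(pop_size=20, generations=30, seed=0):
    """Evolve attribute triples; returns the best individual and the best-fitness history."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 10) for _ in range(3)) for _ in range(pop_size)]
    history = [max(map(fitness, population))]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]            # selection: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            mum, dad = rng.sample(elite, 2)
            cut = rng.randint(1, 2)                # one-point crossover
            child = list(mum[:cut] + dad[cut:])
            if rng.random() < 0.3:                 # mutation: nudge one attribute by 1
                i = rng.randrange(3)
                child[i] = min(10, max(0, child[i] + rng.choice((-1, 1))))
            children.append(tuple(child))
        population = elite + children
        history.append(max(map(fitness, population)))
    return max(population, key=fitness), history

best, history = evolve()
```

Because the better half of each generation is carried over unchanged, the best fitness in `history` never decreases from one generation to the next.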

Repositório Institucional da Universidade Federal de Uberlândia

Sensorimotor neural systems for a predatory stealth behaviour camouflaging motion

Author: Anderson Andrew
Publication date: 30/12/2013

A thesis submitted to the University of London in partial fulfillment of the requirements for the admission to the degree of Doctor of Philosophy

Queen Mary Research Online

A Survey on Aerial Swarm Robotics

Author: Chung Soon-Jo
Dames Philip
Kumar Vijay
Paranjape Aditya
Shen Shaojie
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/08/2018

The use of aerial swarms to solve real-world problems has been increasing steadily, accompanied by falling prices and improving performance of communication, sensing, and processing hardware. The commoditization of hardware has reduced unit costs, thereby lowering the barriers to entry to the field of aerial swarm robotics. A key enabling technology for swarms is the family of algorithms that allow the individual members of the swarm to communicate and allocate tasks amongst themselves, plan their trajectories, and coordinate their flight in such a way that the overall objectives of the swarm are achieved efficiently. These algorithms, often organized in a hierarchical fashion, endow the swarm with autonomy at every level, and the role of a human operator can be reduced, in principle, to interactions at a higher level without direct intervention. This technology depends on the clever and innovative application of theoretical tools from control and estimation. This paper reviews the state of the art of these theoretical tools, specifically focusing on how they have been developed for, and applied to, aerial swarms. Aerial swarms differ from swarms of ground-based vehicles in two respects: they operate in a three-dimensional space and the dynamics of individual vehicles adds an extra layer of complexity. We review dynamic modeling and conditions for stability and controllability that are essential in order to achieve cooperative flight and distributed sensing. The main sections of this paper focus on major results covering trajectory generation, task allocation, adversarial control, distributed sensing, monitoring, and mapping. Wherever possible, we indicate how the physics and subsystem technologies of aerial robots are brought to bear on these individual areas

Caltech Authors