10,724 research outputs found
Game Theory Models for the Verification of the Collective Behaviour of Autonomous Cars
The collective of autonomous cars is expected to generate almost optimal
traffic. In this position paper we discuss the multi-agent models and the
verification results of the collective behaviour of autonomous cars. We argue
that non-cooperative autonomous adaptation cannot guarantee optimal behaviour.
The conjecture is that intention aware adaptation with a constraint on
simultaneous decision making has the potential to avoid unwanted behaviour. The
online routing game model is expected to be the basis to formally prove this
conjecture.Comment: In Proceedings FVAV 2017, arXiv:1709.0212
Model-based provisioning and management of adaptive distributed communication in mobile cooperative systems
Adaptation of communication is required to maintain the reliable connection and to ensure the minimum quality in collaborative activities. Within the framework of wireless environment, how can host entities be handled in the event of a sudden unexpected change in communication and reliable sources? This challenging issue is addressed in the context of Emergency rescue system carried out by mobile devices and robots during calamities or disaster. For this kind of scenario, this book proposes an adaptive middleware to support reconfigurable, reliable group communications. Here, the system structure has been viewed at two different states, a control center with high processing power and uninterrupted energy level is responsible for global task and entities like autonomous robots and firemen owning smart devices act locally in the mission. Adaptation at control center is handled by semantic modeling whereas at local entities, it is managed by a software module called communication agent (CA). Modeling follows the well-known SWRL instructions which establish the degree of importance of each communication link or component. Providing generic and scalable solutions for automated self-configuration is driven by rule-based reconfiguration policies. To perform dynamically in changing environment, a trigger mechanism should force this model to take an adaptive action in order to accomplish a certain task, for example, the group chosen in the beginning of a mission need not be the same one during the whole mission. Local entity adaptive mechanisms are handled by CA that manages internal service APIs to configure, set up, and monitors communication services and manages the internal resources to satisfy telecom service requirements
Aprendizagem de coordenação em sistemas multi-agente
The ability for an agent to coordinate with others within a system is a
valuable property in multi-agent systems. Agents either cooperate as a team
to accomplish a common goal, or adapt to opponents to complete different
goals without being exploited. Research has shown that learning multi-agent
coordination is significantly more complex than learning policies in singleagent
environments, and requires a variety of techniques to deal with the
properties of a system where agents learn concurrently. This thesis aims to
determine how can machine learning be used to achieve coordination within
a multi-agent system. It asks what techniques can be used to tackle the
increased complexity of such systems and their credit assignment challenges,
how to achieve coordination, and how to use communication to improve the
behavior of a team.
Many algorithms for competitive environments are tabular-based, preventing
their use with high-dimension or continuous state-spaces, and may be
biased against specific equilibrium strategies. This thesis proposes multiple
deep learning extensions for competitive environments, allowing algorithms
to reach equilibrium strategies in complex and partially-observable environments,
relying only on local information. A tabular algorithm is also extended
with a new update rule that eliminates its bias against deterministic strategies.
Current state-of-the-art approaches for cooperative environments rely
on deep learning to handle the environment’s complexity and benefit from a
centralized learning phase. Solutions that incorporate communication between
agents often prevent agents from being executed in a distributed
manner. This thesis proposes a multi-agent algorithm where agents learn
communication protocols to compensate for local partial-observability, and
remain independently executed. A centralized learning phase can incorporate
additional environment information to increase the robustness and speed with
which a team converges to successful policies. The algorithm outperforms
current state-of-the-art approaches in a wide variety of multi-agent environments.
A permutation invariant network architecture is also proposed
to increase the scalability of the algorithm to large team sizes. Further research
is needed to identify how can the techniques proposed in this thesis,
for cooperative and competitive environments, be used in unison for mixed
environments, and whether they are adequate for general artificial intelligence.A capacidade de um agente se coordenar com outros num sistema Ă© uma
propriedade valiosa em sistemas multi-agente. Agentes cooperam como
uma equipa para cumprir um objetivo comum, ou adaptam-se aos oponentes
de forma a completar objetivos egoĂstas sem serem explorados. Investigação
demonstra que aprender coordenação multi-agente é significativamente
mais complexo que aprender estratégias em ambientes com um
único agente, e requer uma variedade de técnicas para lidar com um ambiente
onde agentes aprendem simultaneamente. Esta tese procura determinar
como aprendizagem automática pode ser usada para encontrar coordenação
em sistemas multi-agente. O documento questiona que técnicas podem ser
usadas para enfrentar a superior complexidade destes sistemas e o seu desafio
de atribuição de crédito, como aprender coordenação, e como usar
comunicação para melhorar o comportamento duma equipa.
MĂşltiplos algoritmos para ambientes competitivos sĂŁo tabulares, o que impede
o seu uso com espaços de estado de alta-dimensĂŁo ou contĂnuos, e
podem ter tendĂŞncias contra estratĂ©gias de equilĂbrio especĂficas. Esta tese
propõe múltiplas extensões de aprendizagem profunda para ambientes competitivos,
permitindo a algoritmos atingir estratĂ©gias de equilĂbrio em ambientes
complexos e parcialmente-observáveis, com base em apenas informação
local. Um algoritmo tabular é também extendido com um novo critério de
atualização que elimina a sua tendĂŞncia contra estratĂ©gias determinĂsticas.
Atuais soluções de estado-da-arte para ambientes cooperativos têm base em
aprendizagem profunda para lidar com a complexidade do ambiente, e beneficiam
duma fase de aprendizagem centralizada. Soluções que incorporam
comunicação entre agentes frequentemente impedem os próprios de ser executados
de forma distribuĂda. Esta tese propõe um algoritmo multi-agente
onde os agentes aprendem protocolos de comunicação para compensarem
por observabilidade parcial local, e continuam a ser executados de forma
distribuĂda. Uma fase de aprendizagem centralizada pode incorporar informação
adicional sobre ambiente para aumentar a robustez e velocidade
com que uma equipa converge para estratégias bem-sucedidas. O algoritmo
ultrapassa abordagens estado-da-arte atuais numa grande variedade de ambientes
multi-agente. Uma arquitetura de rede invariante a permutações é
também proposta para aumentar a escalabilidade do algoritmo para grandes
equipas. Mais pesquisa é necessária para identificar como as técnicas propostas
nesta tese, para ambientes cooperativos e competitivos, podem ser
usadas em conjunto para ambientes mistos, e averiguar se sĂŁo adequadas a
inteligência artificial geral.Apoio financeiro da FCT e do FSE no âmbito do III Quadro Comunitário de ApoioPrograma Doutoral em Informátic
Governance of mobile complexity: co-evolutionary management towards a resilient mobility in Flanders
Improving Traffic Safety and Efficiency by Adaptive Signal Control Systems Based on Deep Reinforcement Learning
As one of the most important Active Traffic Management strategies, Adaptive Traffic Signal Control (ATSC) helps improve traffic operation of signalized arterials and urban roads by adjusting the signal timing to accommodate real-time traffic conditions. Recently, with the rapid development of artificial intelligence, many researchers have employed deep reinforcement learning (DRL) algorithms to develop ATSCs. However, most of them are not practice-ready. The reasons are two-fold: first, they are not developed based on real-world traffic dynamics and most of them require the complete information of the entire traffic system. Second, their impact on traffic safety is always a concern by researchers and practitioners but remains unclear. Aiming at making the DRL-based ATSC more implementable, existing traffic detection systems on arterials were reviewed and investigated to provide high-quality data feeds to ATSCs. Specifically, a machine-learning frameworks were developed to improve the quality of and pedestrian and bicyclist\u27s count data. Then, to evaluate the effectiveness of DRL-based ATSC on the real-world traffic dynamics, a decentralized network-level ATSC using multi-agent DRL was developed and evaluated in a simulated real-world network. The evaluation results confirmed that the proposed ATSC outperforms the actuated traffic signals in the field in terms of travel time reduction. To address the potential safety issue of DRL based ATSC, an ATSC algorithm optimizing simultaneously both traffic efficiency and safety was proposed based on multi-objective DRL. The developed ATSC was tested in a simulated real-world intersection and it successfully improved traffic safety without deteriorating efficiency. In conclusion, the proposed ATSCs are capable of effectively controlling real-world traffic and benefiting both traffic efficiency and safety
- …