598 research outputs found
Information-theoretic Reasoning in Distributed and Autonomous Systems
The increasing prevalence of distributed and autonomous systems is transforming decision making in industries as diverse as agriculture, environmental monitoring, and healthcare. Despite significant efforts, challenges remain in robustly planning under uncertainty. In this thesis, we present a number of information-theoretic decision rules for improving the analysis and control of complex adaptive systems. We begin with the problem of quantifying the data storage (memory) and transfer (communication) within information processing systems. We develop an information-theoretic framework to study nonlinear interactions within cooperative and adversarial scenarios, solely from observations of each agent's dynamics. This framework is applied to simulations of robotic soccer games, where the measures reveal insights into team performance, including correlations of the information dynamics to the scoreline. We then study the communication between processes with latent nonlinear dynamics that are observed only through a filter. By using methods from differential topology, we show that the information-theoretic measures commonly used to infer communication in observed systems can also be used in certain partially observed systems. For robotic environmental monitoring, the quality of data depends on the placement of sensors. These locations can be improved by either better estimating the quality of future viewpoints or by a team of robots operating concurrently. By robustly handling the uncertainty of sensor model measurements, we are able to present the first end-to-end robotic system for autonomously tracking small dynamic animals, with a performance comparable to human trackers. We then solve the issue of coordinating multi-robot systems through distributed optimisation techniques. These allow us to develop non-myopic robot trajectories for these tasks and, importantly, show that these algorithms provide guarantees for convergence rates to the optimal payoff sequence
Communication Efficiency in Information Gathering through Dynamic Information Flow
This thesis addresses the problem of how to improve the performance of multi-robot information gathering tasks by actively controlling the rate of communication between robots. Examples of such tasks include cooperative tracking and cooperative environmental monitoring. Communication is essential in such systems for both decentralised data fusion and decision making, but wireless networks impose capacity constraints that are frequently overlooked. While existing research has focussed on improving available communication throughput, the aim in this thesis is to develop algorithms that make more efficient use of the available communication capacity. Since information may be shared at various levels of abstraction, another challenge is the decision of where information should be processed based on limits of the computational resources available. Therefore, the flow of information needs to be controlled based on the trade-off between communication limits, computation limits and information value. In this thesis, we approach the trade-off by introducing the dynamic information flow (DIF) problem. We suggest variants of DIF that either consider data fusion communication independently or both data fusion and decision making communication simultaneously. For the data fusion case, we propose efficient decentralised solutions that dynamically adjust the flow of information. For the decision making case, we present an algorithm for communication efficiency based on local LQ approximations of information gathering problems. The algorithm is then integrated with our solution for the data fusion case to produce a complete communication efficiency solution for information gathering. We analyse our suggested algorithms and present important performance guarantees. The algorithms are validated in a custom-designed decentralised simulation framework and through field-robotic experimental demonstrations
Learning and Co-operation in Mobile Multi-Robot Systems
Merged with duplicate record 10026.1/1984 on 27.02.2017 by CS (TIS)This thesis addresses the problem of setting the balance between exploration and
exploitation in teams of learning robots who exchange information. Specifically it looks at
groups of robots whose tasks include moving between salient points in the environment.
To deal with unknown and dynamic environments,such robots need to be able to discover
and learn the routes between these points themselves. A natural extension of this scenario
is to allow the robots to exchange learned routes so that only one robot needs to learn a
route for the whole team to use that route. One contribution of this thesis is to identify a
dilemma created by this extension: that once one robot has learned a route between two
points, all other robots will follow that route without looking for shorter versions. This
trade-off will be labeled the Distributed Exploration vs. Exploitation Dilemma, since
increasing distributed exploitation (allowing robots to exchange more routes) means
decreasing distributed exploration (reducing robots ability to learn new versions of routes),
and vice-versa. At different times, teams may be required with different balances of
exploitation and exploration. The main contribution of this thesis is to present a system for
setting the balance between exploration and exploitation in a group of robots. This system
is demonstrated through experiments involving simulated robot teams. The experiments
show that increasing and decreasing the value of a parameter of the novel system will lead
to a significant increase and decrease respectively in average exploitation (and an
equivalent decrease and increase in average exploration) over a series of team missions. A
further set of experiments show that this holds true for a range of team sizes and numbers
of goals
Aprendizagem de coordenação em sistemas multi-agente
The ability for an agent to coordinate with others within a system is a
valuable property in multi-agent systems. Agents either cooperate as a team
to accomplish a common goal, or adapt to opponents to complete different
goals without being exploited. Research has shown that learning multi-agent
coordination is significantly more complex than learning policies in singleagent
environments, and requires a variety of techniques to deal with the
properties of a system where agents learn concurrently. This thesis aims to
determine how can machine learning be used to achieve coordination within
a multi-agent system. It asks what techniques can be used to tackle the
increased complexity of such systems and their credit assignment challenges,
how to achieve coordination, and how to use communication to improve the
behavior of a team.
Many algorithms for competitive environments are tabular-based, preventing
their use with high-dimension or continuous state-spaces, and may be
biased against specific equilibrium strategies. This thesis proposes multiple
deep learning extensions for competitive environments, allowing algorithms
to reach equilibrium strategies in complex and partially-observable environments,
relying only on local information. A tabular algorithm is also extended
with a new update rule that eliminates its bias against deterministic strategies.
Current state-of-the-art approaches for cooperative environments rely
on deep learning to handle the environment’s complexity and benefit from a
centralized learning phase. Solutions that incorporate communication between
agents often prevent agents from being executed in a distributed
manner. This thesis proposes a multi-agent algorithm where agents learn
communication protocols to compensate for local partial-observability, and
remain independently executed. A centralized learning phase can incorporate
additional environment information to increase the robustness and speed with
which a team converges to successful policies. The algorithm outperforms
current state-of-the-art approaches in a wide variety of multi-agent environments.
A permutation invariant network architecture is also proposed
to increase the scalability of the algorithm to large team sizes. Further research
is needed to identify how can the techniques proposed in this thesis,
for cooperative and competitive environments, be used in unison for mixed
environments, and whether they are adequate for general artificial intelligence.A capacidade de um agente se coordenar com outros num sistema é uma
propriedade valiosa em sistemas multi-agente. Agentes cooperam como
uma equipa para cumprir um objetivo comum, ou adaptam-se aos oponentes
de forma a completar objetivos egoístas sem serem explorados. Investigação
demonstra que aprender coordenação multi-agente é significativamente
mais complexo que aprender estratégias em ambientes com um
único agente, e requer uma variedade de técnicas para lidar com um ambiente
onde agentes aprendem simultaneamente. Esta tese procura determinar
como aprendizagem automática pode ser usada para encontrar coordenação
em sistemas multi-agente. O documento questiona que técnicas podem ser
usadas para enfrentar a superior complexidade destes sistemas e o seu desafio
de atribuição de crédito, como aprender coordenação, e como usar
comunicação para melhorar o comportamento duma equipa.
Múltiplos algoritmos para ambientes competitivos são tabulares, o que impede
o seu uso com espaços de estado de alta-dimensão ou contínuos, e
podem ter tendências contra estratégias de equilíbrio específicas. Esta tese
propõe múltiplas extensões de aprendizagem profunda para ambientes competitivos,
permitindo a algoritmos atingir estratégias de equilíbrio em ambientes
complexos e parcialmente-observáveis, com base em apenas informação
local. Um algoritmo tabular é também extendido com um novo critério de
atualização que elimina a sua tendência contra estratégias determinísticas.
Atuais soluções de estado-da-arte para ambientes cooperativos têm base em
aprendizagem profunda para lidar com a complexidade do ambiente, e beneficiam
duma fase de aprendizagem centralizada. Soluções que incorporam
comunicação entre agentes frequentemente impedem os próprios de ser executados
de forma distribuída. Esta tese propõe um algoritmo multi-agente
onde os agentes aprendem protocolos de comunicação para compensarem
por observabilidade parcial local, e continuam a ser executados de forma
distribuída. Uma fase de aprendizagem centralizada pode incorporar informação
adicional sobre ambiente para aumentar a robustez e velocidade
com que uma equipa converge para estratégias bem-sucedidas. O algoritmo
ultrapassa abordagens estado-da-arte atuais numa grande variedade de ambientes
multi-agente. Uma arquitetura de rede invariante a permutações é
também proposta para aumentar a escalabilidade do algoritmo para grandes
equipas. Mais pesquisa é necessária para identificar como as técnicas propostas
nesta tese, para ambientes cooperativos e competitivos, podem ser
usadas em conjunto para ambientes mistos, e averiguar se são adequadas a
inteligência artificial geral.Apoio financeiro da FCT e do FSE no âmbito do III Quadro Comunitário de ApoioPrograma Doutoral em Informátic
- …