399 research outputs found
Making friends on the fly : advances in ad hoc teamwork
textGiven the continuing improvements in design and manufacturing processes in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind the state-of-the-art protocols and robots will need to additionally reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, this thesis introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations.
The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols. While previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, that is the first to address all three challenges in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it learns about previous teammates and exploiting any expert knowledge available. Given this knowledge, PLASTIC selects which previous teammates are most similar to the current ones online and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second uses a policy-based approach, PLASTIC-Policy, in which it learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones.
We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios. We hypothesize that these dimensions are useful for analyzing similarities among domains and determining which can be tackled by similar algorithms in addition to identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.Computer Science
Formal Methods for Autonomous Systems
Formal methods refer to rigorous, mathematical approaches to system
development and have played a key role in establishing the correctness of
safety-critical systems. The main building blocks of formal methods are models
and specifications, which are analogous to behaviors and requirements in system
design and give us the means to verify and synthesize system behaviors with
formal guarantees.
This monograph provides a survey of the current state of the art on
applications of formal methods in the autonomous systems domain. We consider
correct-by-construction synthesis under various formulations, including closed
systems, reactive, and probabilistic settings. Beyond synthesizing systems in
known environments, we address the concept of uncertainty and bound the
behavior of systems that employ learning using formal methods. Further, we
examine the synthesis of systems with monitoring, a mitigation technique for
ensuring that once a system deviates from expected behavior, it knows a way of
returning to normalcy. We also show how to overcome some limitations of formal
methods themselves with learning. We conclude with future directions for formal
methods in reinforcement learning, uncertainty, privacy, explainability of
formal methods, and regulation and certification
Recommended from our members
Reliable Decision-Making with Imprecise Models
The rapid growth in the deployment of autonomous systems across various sectors has generated considerable interest in how these systems can operate reliably in large, stochastic, and unstructured environments. Despite recent advances in artificial intelligence and machine learning, it is challenging to assure that autonomous systems will operate reliably in the open world. One of the causes of unreliable behavior is the impreciseness of the model used for decision-making. Due to the practical challenges in data collection and precise model specification, autonomous systems often operate based on models that do not represent all the details in the environment. Even if the system has access to a comprehensive decision-making model that accounts for all the details in the environment and all possible scenarios the agent may encounter, it may be intractable to solve this complex model optimally. Consequently, this complex, high fidelity model may be simplified to accelerate planning, introducing imprecision. Reasoning with such imprecise models affects the reliability of autonomous systems. A system\u27s actions may sometimes produce unexpected, undesirable consequences, which are often identified after deployment. How can we design autonomous systems that can operate reliably in the presence of uncertainty and model imprecision?
This dissertation presents solutions to address three classes of model imprecision in a Markov decision process, along with an analysis of the conditions under which bounded-performance can be guaranteed. First, an adaptive outcome selection approach is introduced to devise risk-aware reduced models of the environment that efficiently balance the trade-off between model simplicity and fidelity, to accelerate planning in resource-constrained settings. Second, a framework that extends stochastic shortest path framework to problems with imperfect information about the goal state during planning is introduced, along with two solution approaches to solve this problem. Finally, two complementary solution approaches are presented to minimize the negative side effects of agent actions. The techniques presented in this dissertation enable an autonomous system to detect and mitigate undesirable behavior, without redesigning the model entirely
Formal methods for motion planning and control in dynamic and partially known environments
This thesis is motivated by time and safety critical applications involving the use of autonomous vehicles to accomplish complex tasks in dynamic and partially known environments. We use temporal logic to formally express such complex tasks. Temporal logic specifications generalize the classical notions of stability and reachability widely studied within the control and hybrid systems communities. Given a model describing the motion of a robotic system in an environment and a formal task specification, the aim is to automatically synthesize a control policy that guarantees the satisfaction of the specification. This thesis presents novel control synthesis algorithms
to tackle the problem of motion planning from temporal logic specifications in uncertain environments. For each one of the planning and control synthesis problems addressed in this dissertation, the proposed algorithms are implemented, evaluated, and validated thought experiments and/or simulations.
The first part of this thesis focuses on a mobile robot whose success is measured by the completion of temporal logic tasks within a given period of time. In addition to such time constraints, the planning algorithm must also deal with the uncertainty that arises from the changes in the robot's workspace during task execution. In particular, we consider a robot deployed in a partitioned environment subjected to structural changes such as doors that can open and close. The motion of the robot is modeled
as a continuous time Markov decision process and the robot's mission is expressed as a Continuous Stochastic Logic (CSL) formula. A complete framework to find a control strategy that satisfies a specification given as a CSL formula is introduced.
The second part of this thesis addresses the synthesis of controllers that guarantee the satisfaction of a task specification expressed as a syntactically co-safe Linear Temporal Logic (scLTL) formula. In this case, uncertainty is characterized by the partial knowledge of the robot's environment. Two scenarios are considered. First, a distributed team of robots required to satisfy the specification over a set of service requests occurring at the vertices of a known graph representing the environment is
examined. Second, a single agent motion planning problem from the specification over a set of properties known to be satised at the vertices of the known graph environment is studied. In both cases, we exploit the existence of o-the-shelf model checking and runtime verification tools, the efficiency of graph search algorithms, and the efficacy of exploration techniques to solve the motion planning problem constrained by
the absence of complete information about the environment.
The final part of this thesis extends uncertainty beyond the absence of a complete knowledge of the environment described above by considering a robot equipped with a noisy sensing system. In particular, the robot is tasked with satisfying a scLTL specification over a set of regions of interest known to be present in the environment. In such a case, although the robot is able to measure the properties characterizing such regions of interest, precisely determining the identity of these regions is not feasible. A mixed observability Markov decision process is used to represent the robot's actuation and sensing models. The control synthesis problem from scLTL
formulas is then formulated as a maximum probability reachability problem on this model. The integration of dynamic programming, formal methods, and frontier-based exploration tools allow us to derive an algorithm to solve such a reachability problem
Agents for educational games and simulations
This book consists mainly of revised papers that were presented at the Agents for Educational Games and Simulation (AEGS) workshop held on May 2, 2011, as part of the Autonomous Agents and MultiAgent Systems (AAMAS) conference in Taipei, Taiwan. The 12 full papers presented were carefully reviewed and selected from various submissions. The papers are organized topical sections on middleware applications, dialogues and learning, adaption and convergence, and agent applications
- …