Rule-Based Policy Interpretation and Shielding for Partially Observable Monte Carlo Planning
Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm that can generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding complete policy representation. However, the lack of an explicit representation of the policy hinders interpretability. In this thesis, we propose a methodology based on Maximum Satisfiability Modulo Theory (MAX-SMT) for analyzing POMCP policies by inspecting their traces, namely, sequences of belief-action pairs generated by the algorithm. The proposed method explores local properties of the policy to build a compact and informative summary of the policy behaviour. This representation exploits a high-level description encoded using logical formulas that domain experts can provide. The final formula can be used to identify unexpected decisions, namely, decisions that violate the expert indications. We show that this identification process can be used offline (to improve the explainability of the policy and to identify anomalous behaviours) or online (to shield the decisions of the POMCP algorithm). We also present an active methodology that can effectively query a POMCP policy to build more reliable descriptions quickly. We extensively evaluate our methodologies on two standard benchmarks for POMDPs, namely, tiger and rocksample, and on a problem related to velocity regulation in mobile robot navigation. Results show that our approach achieves good performance due to its capability to exploit experts' knowledge of the domains. Specifically, our approach can be used both to identify anomalous behaviours in faulty POMCPs and to improve the performance of the system by using the shielding mechanism. In the first case, we test the methodology against a state-of-the-art anomaly detection algorithm, while in the second, we compare the performance of shielded and unshielded POMCPs.
We implemented our methodology in C++, and the code is open-source and available at https://github.com/GiuMaz/XPOMCP.
10081 Abstracts Collection -- Cognitive Robotics
From 21.02. to 26.02.2010, the Dagstuhl Seminar 10081 "Cognitive Robotics" was held in Schloss Dagstuhl – Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Risk-aware shielding of Partially Observable Monte Carlo Planning policies
Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm that can generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding complete policy representation. However, the lack of an explicit policy representation hinders interpretability and a proper evaluation of the risks an agent may incur. In this work, we propose a methodology based on Maximum Satisfiability Modulo Theory (MAX-SMT) for analyzing POMCP policies by inspecting their traces, namely, sequences of belief-action pairs generated by the algorithm. The proposed method explores local properties of the policy to build a compact and informative summary of the policy behaviour. Moreover, we introduce a rich and formal language that a domain expert can use to describe the expected behaviour of a policy. In more detail, we present a formulation that directly computes the risk involved in taking actions by considering the high-level elements specified by the expert. The final formula can identify risky decisions taken by POMCP that violate the expert indications. We show that this identification process can be used offline (to improve the policy’s explainability and identify anomalous behaviours) or online (to shield the risky decisions of the POMCP algorithm). We present an extended evaluation of our approach on four domains: the well-known tiger and rocksample benchmarks, a problem of velocity regulation in mobile robots, and a problem of battery management in mobile robots. We test the methodology against a state-of-the-art anomaly detection algorithm to show that our approach can be used to identify anomalous behaviours in faulty POMCP. We also show, comparing the performance of shielded and unshielded POMCP, that the shielding mechanism can improve the system’s performance. We provide an open-source implementation of the proposed methodologies at https://github.com/GiuMaz/XPOMCP.
A prototype for a conversational companion for reminiscing about images
This work was funded by the COMPANIONS project sponsored by the European Commission as part of the Information Society Technologies (IST) programme under EC grant number IST-FP6-034434. Companions demonstrators can be seen at: http://www.dcs.shef.ac.uk/~roberta/companions/Web/. This paper describes an initial prototype of the Companions project (www.companions-project.org): the Senior Companion (SC), designed to be a platform to display novel approaches to: (1) The use of Information Extraction (IE) techniques to extract the content of incoming dialogue utterances after an ASR phase. (2) The conversion of the input to RDF form to allow the generation of new facts from existing ones, under the control of a Dialogue Manager (DM), that also has access to stored knowledge and knowledge accessed in real time from the web, all in RDF form. (3) A DM expressed as a stack and network virtual machine that models mixed initiative in dialogue control. (4) A tuned dialogue act detector based on corpus evidence. The prototype platform was evaluated, and we describe this; it is also designed to support more extensive forms of emotion detection carried by both speech and lexical content, as well as extended forms of machine learning. We describe preliminary studies and results for these, in particular a novel approach to enabling reinforcement learning for open dialogue systems through the detection of emotion in the speech signal and its deployment as a form of a learned DM, at a higher level than the DM virtual machine and able to direct the SC’s responses to a more emotionally appropriate part of its repertoire. © 2010 Elsevier Ltd. All rights reserved. Peer-reviewed.
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research. Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 pages.
Machine Learning-Aided Operations and Communications of Unmanned Aerial Vehicles: A Contemporary Survey
The ongoing amalgamation of UAV and ML techniques is creating a significant
synergy and empowering UAVs with unprecedented intelligence and autonomy. This
survey aims to provide a timely and comprehensive overview of ML techniques
used in UAV operations and communications and identify the potential growth
areas and research gaps. We emphasise the four key components of UAV operations
and communications to which ML can significantly contribute, namely, perception
and feature extraction, feature interpretation and regeneration, trajectory and
mission planning, and aerodynamic control and operation. We classify the latest
popular ML tools based on their applications to the four components and conduct
gap analyses. This survey also takes a step forward by pointing out significant
challenges in the upcoming realm of ML-aided automated UAV operations and
communications. The survey finds that different ML techniques dominate the
applications to the four key modules of UAV operations and communications.
While there is an increasing trend of cross-module designs, little effort has
been devoted to an end-to-end ML framework, from perception and feature
extraction to aerodynamic control and operation. The survey also finds that the
reliability and trust of ML in UAV operations and applications require
significant attention before full automation of UAVs and potential cooperation
between UAVs and humans come to fruition. Comment: 36 pages, 304 references, 19 figures.