Search CORE

889 research outputs found

Rule-Based Policy Interpretation and Shielding for Partially Observable Monte Carlo Planning

Author: Alberto Castellini
Alessandro Farinelli
Giulio Mazzi
Publication venue
Publication date: 01/01/2022
Field of study

Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm that can generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding complete policy representation. However, the lack of an explicit representation of the policy hinders interpretability. In this thesis, we propose a methodology based on Maximum Satisfiability Modulo Theory (MAX-SMT) for analyzing POMCP policies by inspecting their traces, namely, sequences of belief-action pairs generated by the algorithm. The proposed method explores local properties of the policy to build a compact and informative summary of the policy behaviour. This representation exploits a high-level description encoded using logical formulas that domain experts can provide. The final formula can be used to identify unexpected decisions, namely, decisions that violate the expert indications. We show that this identification process can be used offline (to improve the explainability of the policy and to identify anomalous behaviours) or online (to shield the decisions of the POMCP algorithm). We also present an active methodology that can effectively query a POMCP policy to build more reliable descriptions quickly. We extensively evaluate our methodologies on two standard benchmarks for POMDPs, namely, emph{tiger} and emph{rocksample}, and on a problem related to velocity regulation in mobile robot navigation. Results show that our approach achieves good performance due to its capability to exploit experts' knowledge of the domains. Specifically, our approach can be used both to identify anomalous behaviours in faulty POMCPs and to improve the performance of the system by using the shielding mechanism. In the first case, we test the methodology against a state-of-the-art anomaly detection algorithm, while in the second, we compared the performance of shielded and unshielded POMCPs. We implemented our methodology in CC, and the code is open-source and available at href{https://github.com/GiuMaz/XPOMCP}{https://github.com/GiuMaz/XPOMCP}

Catalogo dei prodotti della ricerca

10081 Abstracts Collection -- Cognitive Robotics

Author: Lakemeyer Gerhard
Levesque Hector J.
Pirri Fiora
Publication venue: Dagstuhl Seminar Proceedings. 10081 - Cognitive Robotics
Publication date: 01/01/2010
Field of study

From 21.02. to 26.02.2010, the Dagstuhl Seminar 10081 ``Cognitive Robotics \u27\u27 was held in Schloss Dagstuhl~--~Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

Dagstuhl Research Online Publication Server

Risk-aware shielding of Partially Observable Monte Carlo Planning policies

Author: Alberto Castellini
Alessandro Farinelli
Giulio Mazzi
Publication venue
Publication date: 01/01/2023
Field of study

Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm that can generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding complete policy representation. However, the lack of an explicit policy representation hinders interpretability and a proper evaluation of the risks an agent may incur. In this work, we propose a methodology based on Maximum Satisﬁability Modulo Theory (MAX-SMT) for analyzing POMCP policies by inspecting their traces, namely, sequences of belief- action pairs generated by the algorithm. The proposed method explores local properties of the policy to build a compact and informative summary of the policy behaviour. Moreover, we introduce a rich and formal language that a domain expert can use to describe the expected behaviour of a policy. In more detail, we present a formulation that directly computes the risk involved in taking actions by considering the high- level elements speciﬁed by the expert. The ﬁnal formula can identify risky decisions taken by POMCP that violate the expert indications. We show that this identiﬁcation process can be used oﬄine (to improve the policy’s explainability and identify anomalous behaviours) or online (to shield the risky decisions of the POMCP algorithm). We present an extended evaluation of our approach on four domains: the well-known tiger and rocksample benchmarks, a problem of velocity regulation in mobile robots, and a problem of battery management in mobile robots. We test the methodology against a state-of- the-art anomaly detection algorithm to show that our approach can be used to identify anomalous behaviours in faulty POMCP. We also show, comparing the performance of shielded and unshielded POMCP, that the shielding mechanism can improve the system’s performance. We provide an open-source implementation of the proposed methodologies at https://github.com/GiuMaz/XPOMCP

Catalogo dei prodotti della ricerca

Robotics deep reinforcement learning with loose prior knowledge

Author: Botteghi Nicolò
Publication venue: University of Twente
Publication date: 06/10/2021
Field of study

University of Twente Research Information

A prototype for a conversational companion for reminiscing about images

Author: Catizone Roberta
Cheng Weiwei
Dingli Alexiei
Field Debora
Moore Roger
Wilks Yorick
Worgan Simon
Publication venue: 'Elsevier BV'
Publication date: 01/04/2011
Field of study

This work was funded by the COMPANIONS project sponsored by the European Commission as part of the Information Society Technologies (IST) programme under EC grant number IST-FP6-034434. Companions demonstrators can be seen at: http://www.dcs.shef.ac.uk/∼roberta/companions/Web/.This paper describes an initial prototype of the Companions project (www.companions-project.org): the Senior Companion (SC), designed to be a platform to display novel approaches to: (1) The use of Information Extraction (IE) techniques to extract the content of incoming dialogue utterances after an ASR phase. (2) The conversion of the input to RDF form to allow the generation of new facts from existing ones, under the control of a Dialogue Manager (DM), that also has access to stored knowledge and knowledge accessed in real time from the web, all in RDF form. (3) A DM expressed as a stack and network virtual machine that models mixed initiative in dialogue control. (4) A tuned dialogue act detector based on corpus evidence. The prototype platform was evaluated, and we describe this; it is also designed to support more extensive forms of emotion detection carried by both speech and lexical content, as well as extended forms of machine learning. We describe preliminary studies and results for these, in particular a novel approach to enabling reinforcement learning for open dialogue systems through the detection of emotion in the speech signal and its deployment as a form of a learned DM, at a higher level than the DM virtual machine and able to direct the SC’s responses to a more emotionally appropriate part of its repertoire. © 2010 Elsevier Ltd. All rights reserved.peer-reviewe

OAR@UM

Human Motion Trajectory Prediction: A Survey

Author: Arras Kai O.
Gavrila Dariu M.
Herman Michael
Kitani Kris M.
Palmieri Luigi
Rudenko Andrey
Publication venue: 'SAGE Publications'
Publication date: 17/12/2019
Field of study

With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research.Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 page

arXiv.org e-Print Archive

Machine Learning-Aided Operations and Communications of Unmanned Aerial Vehicles: A Contemporary Survey

Author: Hossain Ekram
Huang Hailong
Kurunathan Harrison
Li Kai
Ni Wei
Publication venue
Publication date: 07/11/2022
Field of study

The ongoing amalgamation of UAV and ML techniques is creating a significant synergy and empowering UAVs with unprecedented intelligence and autonomy. This survey aims to provide a timely and comprehensive overview of ML techniques used in UAV operations and communications and identify the potential growth areas and research gaps. We emphasise the four key components of UAV operations and communications to which ML can significantly contribute, namely, perception and feature extraction, feature interpretation and regeneration, trajectory and mission planning, and aerodynamic control and operation. We classify the latest popular ML tools based on their applications to the four components and conduct gap analyses. This survey also takes a step forward by pointing out significant challenges in the upcoming realm of ML-aided automated UAV operations and communications. It is revealed that different ML techniques dominate the applications to the four key modules of UAV operations and communications. While there is an increasing trend of cross-module designs, little effort has been devoted to an end-to-end ML framework, from perception and feature extraction to aerodynamic control and operation. It is also unveiled that the reliability and trust of ML in UAV operations and applications require significant attention before full automation of UAVs and potential cooperation between UAVs and humans come to fruition.Comment: 36 pages, 304 references, 19 Figure

arXiv.org e-Print Archive