3 research outputs found

Assessing user specifications of robot behaviour for material transport tasks

Author: Blidaru Alexandru
Publication venue: 'University of Waterloo'
Publication date: 23/09/2019

Robots are an established component of many existing manufacturing processes. In the majority of cases, these robots operate in segregated areas, separated from human workers. However, as robots' capabilities in sensing and autonomy improve, they are expected to increasingly operate in human environments and interact with novice, untrained users. When robots operate in human or shared environments, their tasks and behaviours need to be specified. This is typically performed by a human operator or supervisor, who may specify constraints on robot behaviour to make the robot more predictable or to align its behaviour with user expectations. However, these constraints may impact robot task performance. This thesis develops a user interface for obtaining robot specifications and proposes metrics for quantifying specification quality, with the aim of investigating how users create robot specifications. The metrics relate the robot's performance in the specified environment to its performance in a fully unconstrained environment, and capture the trade-offs that users make in ensuring that the robot accomplishes its tasks while minimizing the loss of performance. The proposed approach is evaluated in a series of user studies. The first, a pilot study, sought to understand how novice users provide specifications for an autonomous robot operating in a shared warehouse environment, and to validate the metrics by applying them to user-created specifications. The metrics were then modified based on the results of the pilot study, and employed in a second, larger study. The second study trialed a modified interface and interaction scheme that implemented an interactive preference learning system, aimed at modifying specifications to improve robot performance. The modified metrics were then used to assess the quality of specifications following the preference learning process.
The two studies show that inexperienced users create a wide variety of behaviour-limiting specifications, and that they generally have difficulty creating efficient specifications or assessing their own performance. Furthermore, the preference learning process succeeds in improving specification quality by making specifications more efficient and more similar across users. Moreover, users who created specifications of worse initial quality benefit the most from the interactive learning process, as those specifications see a larger improvement.
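As a rough illustration of a metric that relates performance in the specified environment to performance in an unconstrained one, the sketch below compares shortest-path lengths on a small grid with and without user-forbidden cells. The grid model, the function names, and the choice of a simple length ratio are all assumptions made for illustration; the abstract does not state the exact form of the thesis's metrics.

```python
from collections import deque

def shortest_path_length(grid, start, goal):
    """Breadth-first search on a 4-connected grid.

    Cells equal to 1 are forbidden (hypothetical user constraint);
    returns the number of steps, or None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None

def performance_ratio(free_grid, specified_grid, start, goal):
    """Ratio of constrained to unconstrained path length (assumed metric).

    1.0 means the specification costs nothing; larger values mean a
    larger loss of performance. None means the task became infeasible.
    """
    base = shortest_path_length(free_grid, start, goal)
    constrained = shortest_path_length(specified_grid, start, goal)
    if base is None or constrained is None:
        return None
    if base == 0:
        return 1.0
    return constrained / base

# Unconstrained 3x3 environment, and the same environment with one
# cell declared an area of avoidance by the user.
free = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
specified = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
ratio = performance_ratio(free, specified, (0, 0), (0, 2))
print(ratio)
```

In this toy case the forbidden cell forces a detour, so the ratio exceeds 1; a specification that blocks every route yields None, which a real metric would need to treat as a failed task.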

University of Waterloo's Institutional Repository

Specifying User Preferences for Autonomous Robots through Interactive Learning

Author: Wilde Nils
Publication venue: 'University of Waterloo'
Publication date: 02/09/2020

This thesis studies a central problem in human-robot interaction (HRI): How can non-expert users specify complex behaviours for autonomous robots? A common technique for robot task specification that does not require expert knowledge is active preference learning. The desired behaviour of a robot is learned by iteratively presenting the user with alternative behaviours of the robot. The user then chooses the alternative they prefer. It is assumed that they make this decision based on an internal, hidden cost function. From the user's choice among the alternatives, the robot learns the hidden user cost function. We use an interactive framework allowing users to create robot task specifications. The behaviour of an autonomous robot can be specified by defining constraints on allowable robot states and actions. For instance, for a mobile robot a user can define traffic rules such as roads, slow zones or areas of avoidance. These constraints form the user-specified terms of the cost function. However, inexperienced users might be oblivious to the impact such constraints have on the robot task performance. Employing an active preference learning framework, we present users with the behaviour of the robot following their specification, i.e., the constraints, together with an alternative behaviour where some constraints might be violated. A user cost function trades off the importance of constraints and the performance of the robot. From the user feedback, the robot learns about the importance of constraints, i.e., parameters in the cost function. We first introduce an algorithm for specification revision that is based on a deterministic user model: We assume that the user always follows the proposed cost function. This allows for dividing the set of possible weights for the user constraints into infeasible and feasible weights whenever user feedback is obtained.
In each iteration we again present the path the user previously preferred, together with an alternative path that is optimal for a weight that is feasible with respect to all previous iterations. This path is found with a local search, iterating over the feasible weights until a new path is found. As the number of paths is finite for any discrete motion planner, the algorithm is guaranteed to find the optimal solution within a finite number of iterations. Simulation results show that this approach is suitable to effectively revise user specifications within a few iterations. The practicality of the framework is investigated in a user study. The algorithm is extended to learn about multiple tasks for the robot simultaneously, which allows for more realistic scenarios and another active learning component: The choice of task for which the user is presented with two alternative solutions. Through the study we show that nearly all users accept alternative solutions and thus obtain a revised specification through the learning process, leading to a substantial improvement in robot performance. Also, the users whose initial specifications had the largest impact on performance benefit the most from the interactive learning. Next, we weaken the assumptions about the user: In a probabilistic model we do not require the user to always follow our cost function. Based on the sensitivity of a motion planning problem, we show that different values in the user cost function, i.e., weights for the user constraints, do not necessarily lead to different robot behaviour. From the implied discretization of the space of possible parameters we derive an algorithm for efficiently learning a specification revision and demonstrate the performance and robustness in simulations. Building on the notion of sensitivity, we develop an active preference learning technique based on maximum regret, i.e., the maximum error ratio over all possible solutions.
We show that active preference learning based on regret substantially outperforms other state-of-the-art approaches. Further, regret-based preference learning can be used as a heuristic for both discrete and continuous state and action spaces. State lattice planners are an emerging technique for real-time motion planning; they are based on a regular discrete set of robot states and pre-computed motions connecting the states, called motion primitives. We study how learning from demonstrations can be used to learn global preferences for robot movement, such as the trade-off between time and jerkiness of the motions. We show how to compute a user-optimal set of motion primitives of given size, based on an estimate of the user preferences. We demonstrate that by learning about the motion primitives of a lattice planner, we can shape the robot's behaviour to follow the global user preferences while ensuring good computation time of the motion planner. Furthermore, we study how a robot can simultaneously learn about user preferences on both motions of a lattice planner and parts of the environment when a user is iteratively correcting the robot behaviour. We demonstrate in simulations that this approach is suitable to adapt to user preferences even when the features of the environment that a user considers are not given.
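The deterministic specification-revision loop described in the abstract above can be sketched in a few lines. This is a minimal toy under stated assumptions, not the thesis's implementation: a single scalar constraint weight `w`, a fixed list of candidate paths with hypothetical `time` and `violation` features, a cost of the form `time + w * violation`, and a uniform scan over the feasible weights standing in for the local search. All names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    name: str
    time: float       # task performance (lower is better)
    violation: float  # how much of the user constraint is violated

def cost(path, w):
    # Assumed cost form: performance plus weighted constraint violation.
    return path.time + w * path.violation

def best_path(paths, w):
    return min(paths, key=lambda p: cost(p, w))

def update_interval(lo, hi, chosen, other):
    """Shrink the feasible weights using cost(chosen, w) <= cost(other, w)."""
    dv = chosen.violation - other.violation
    dt = other.time - chosen.time
    if dv > 0:
        hi = min(hi, dt / dv)   # feedback implies w <= dt / dv
    elif dv < 0:
        lo = max(lo, dt / dv)   # feedback implies w >= dt / dv
    return lo, hi

def revise(paths, user_w, w_max=10.0, steps=200):
    """Deterministic user model: the user always picks the cheaper path
    under the hidden weight user_w (simulated here)."""
    lo, hi = 0.0, w_max
    current = best_path(paths, w_max)  # start from the strict specification
    shown = {current}
    while True:
        # Scan feasible weights until a path not yet shown is optimal.
        alt = None
        for i in range(steps + 1):
            cand = best_path(paths, lo + (hi - lo) * i / steps)
            if cand not in shown:
                alt = cand
                break
        if alt is None:
            return current, (lo, hi)   # no new path: revision finished
        shown.add(alt)
        if cost(current, user_w) <= cost(alt, user_w):
            chosen, other = current, alt
        else:
            chosen, other = alt, current
        lo, hi = update_interval(lo, hi, chosen, other)
        current = chosen

paths = [Path("A", 10.0, 0.0), Path("B", 6.0, 1.0), Path("C", 4.0, 3.0)]
result, feasible = revise(paths, user_w=2.5)
print(result.name, feasible)
```

Because every iteration shows a path that has not been shown before and the candidate set is finite, the loop terminates, mirroring the finiteness argument in the abstract. The scalar weight and the uniform scan are simplifications; with several constraints the feasible set becomes a polytope rather than an interval.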

University of Waterloo's Institutional Repository