Efficiently Finding Approximately-Optimal Queries for Improving Policies and Guaranteeing Safety
When a computational agent (called the “robot”) takes actions on behalf of a human user, it may be uncertain about the human’s preferences. The human may initially specify her preferences incompletely or inaccurately. In this case, the robot’s performance may be unsatisfactory, or the robot may even cause negative side effects to the environment. Several approaches in the literature address this problem. For example, the human can provide demonstrations that reduce the robot’s uncertainty. The human may also give real-time feedback on the robot’s behavior, or monitor the robot and stop it when it is about to do something dangerous. However, these methods typically require much of the human’s attention. Alternatively, the robot may estimate the human’s true preferences from the specified preferences, but this is error-prone and requires assumptions about how the human specifies her preferences.
In this thesis, I consider a querying approach. Before taking any actions, the robot has a chance to query the human about her preferences. For example, the robot may query the human about which trajectory in a set of trajectories she likes the most, or whether the human cares about some side effects to the domain. After the human responds to the query, the robot expects to improve its performance and/or guarantee that its behavior is considered safe by the human.
If we do not impose any constraint on the number of queries the robot can pose, the robot may keep posing queries until it is absolutely certain about the human’s preferences. This may place too much cognitive load on the human. The responses to some queries may only marginally improve the robot’s performance, which is not worth the human’s attention. So in the problems considered in this thesis, I constrain the number of queries that the robot can pose, or associate each query with a cost. The research question is how to efficiently find the most useful query under such constraints.
Finding a provably optimal query can be challenging since it is usually a combinatorial optimization problem. In this thesis, I contribute to providing efficient query selection algorithms under uncertainty. I first formulate the robot’s uncertainty as reward uncertainty and safety-constraint uncertainty. Under only reward uncertainty, I provide a query selection algorithm that finds approximately-optimal k-response queries. Under only safety-constraint uncertainty, I provide a query selection algorithm that finds an optimal k-element query to improve a known safe policy, and an algorithm that uses a set-cover-based query selection strategy to find an initial safe policy. Under both types of uncertainty simultaneously, I provide a batch-query-based querying method that empirically outperforms other baseline querying methods.
PHD
Computer Science & Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163125/1/shunzh_1.pd
Making and Keeping Probabilistic Commitments for Trustworthy Multiagent Coordination
In many real-world domains, such as the control of autonomous vehicles, team sports, and medical diagnosis and treatment, multiple autonomous agents need to take actions based on local observations, and are interdependent in the sense that they rely on each other to accomplish tasks. Thus, achieving desired outcomes in these domains requires interagent coordination. The form of coordination this thesis focuses on is commitments, where an agent, referred to as the commitment provider, specifies guarantees about its behavior to another, referred to as the commitment recipient, so that the recipient can plan and execute accordingly without taking into account the details of the provider's behavior. This thesis grounds the concept of commitments in decision-theoretic settings where the provider's guarantees might have to be probabilistic when its actions have stochastic outcomes and it expects to reduce its uncertainty about the environment during execution.
More concretely, this thesis presents a set of contributions that address three core issues for commitment-based coordination: probabilistic commitment adherence, interpretation, and formulation. The first contribution is a principled semantics for the provider to exercise maximal autonomy that responds to evolving knowledge about the environment without violating its probabilistic commitment, along with a family of algorithms for the provider to construct policies that provably respect the semantics and make explicit tradeoffs between computation cost and plan quality. The second contribution consists of theoretical analyses and empirical studies that improve our understanding of the recipient's interpretation of the partial information specified in a probabilistic commitment; the thesis shows that it is inherently easier for the recipient to robustly model a probabilistic commitment where the provider promises to enable preconditions that the recipient requires than where the provider instead promises to avoid changing already-enabled preconditions. The third contribution focuses on the problem of formulating probabilistic commitments for the fully cooperative provider and recipient; the thesis proves structural properties of the agents' values as functions of the parameters of the commitment specification that can be exploited to achieve orders of magnitude less computation for 1) formulating optimal commitments in a centralized manner, and 2) formulating (approximately) optimal queries that induce (approximately) optimal commitments for the decentralized setting in which information relevant to optimization is distributed among the agents.
PHD
Computer Science & Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/162948/1/qizhg_1.pd
Reliable Decision-Making with Imprecise Models
The rapid growth in the deployment of autonomous systems across various sectors has generated considerable interest in how these systems can operate reliably in large, stochastic, and unstructured environments. Despite recent advances in artificial intelligence and machine learning, it is challenging to assure that autonomous systems will operate reliably in the open world. One of the causes of unreliable behavior is the impreciseness of the model used for decision-making. Due to the practical challenges in data collection and precise model specification, autonomous systems often operate based on models that do not represent all the details in the environment. Even if the system has access to a comprehensive decision-making model that accounts for all the details in the environment and all possible scenarios the agent may encounter, it may be intractable to solve this complex model optimally. Consequently, this complex, high fidelity model may be simplified to accelerate planning, introducing imprecision. Reasoning with such imprecise models affects the reliability of autonomous systems. A system's actions may sometimes produce unexpected, undesirable consequences, which are often identified after deployment. How can we design autonomous systems that can operate reliably in the presence of uncertainty and model imprecision?
This dissertation presents solutions to address three classes of model imprecision in a Markov decision process, along with an analysis of the conditions under which bounded performance can be guaranteed. First, an adaptive outcome selection approach is introduced to devise risk-aware reduced models of the environment that efficiently balance the trade-off between model simplicity and fidelity, to accelerate planning in resource-constrained settings. Second, a framework that extends the stochastic shortest path framework to problems with imperfect information about the goal state during planning is introduced, along with two solution approaches for this problem. Finally, two complementary solution approaches are presented to minimize the negative side effects of agent actions. The techniques presented in this dissertation enable an autonomous system to detect and mitigate undesirable behavior, without redesigning the model entirely.