Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
"single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.

Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 tables.
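The pairwise ranking formulation described in the abstract can be illustrated with a minimal sketch: each observed expert decision yields training pairs in which the selected task is ranked above every unselected candidate, and a classifier is fit on feature differences. This is an assumed simplification (a plain perceptron on difference vectors, with made-up features), not the paper's actual model.

```python
# Hedged sketch of pairwise-ranking apprenticeship learning.
# For each observed decision, the task the expert picked should score
# higher than every task not picked, so we train on feature differences
# (f_picked - f_other) with a simple perceptron update. The classifier
# choice and feature encoding are illustrative assumptions.

def train_pairwise_ranker(demonstrations, epochs=50, lr=0.1):
    """demonstrations: list of (candidate_features, picked_index),
    where candidate_features is a list of equal-length feature vectors."""
    dim = len(demonstrations[0][0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for candidates, picked in demonstrations:
            fp = candidates[picked]
            for i, fo in enumerate(candidates):
                if i == picked:
                    continue
                diff = [a - b for a, b in zip(fp, fo)]
                score = sum(wi * d for wi, d in zip(w, diff))
                if score <= 0:  # picked task not yet ranked above the other
                    w = [wi + lr * d for wi, d in zip(w, diff)]
    return w

def select_task(w, candidates):
    """Greedy policy: pick the candidate with the highest learned score."""
    scores = [sum(wi * f for wi, f in zip(w, c)) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)
```

Because the model only ever compares candidates pairwise, it is model-free in the sense the abstract describes: no state space is enumerated, and the learned scorer can also be used to order nodes in a branch-and-bound search.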
Learning Temporal Dynamics of Human-Robot Interactions from Demonstrations
The presence of robots in society is becoming increasingly common, triggering the need to learn reliable policies to automate human-robot interactions (HRI). Manually developing policies for HRI is particularly challenging due to the complexity introduced by the human component. The aim of this thesis is to explore the benefits of leveraging temporal reasoning to learn policies for HRIs from demonstrations.

This thesis proposes and evaluates two distinct temporal reasoning approaches. The first is a temporal-reasoning-based learning from demonstration (TR-LfD) framework that employs a variant of an Interval Temporal Bayesian Network to learn the temporal dynamics of an interaction. TR-LfD exploits Allen's interval algebra (IA) and Bayesian networks to effectively learn complex temporal structures. The second is a novel temporal reasoning model, the Temporal Context Graph (TCG). TCGs combine IA, n-gram models, and directed graphs to model interactions with cyclical atomic actions and temporal structures exhibiting both sequential and parallel relationships.

The proposed temporal reasoning models are evaluated in two experiments consisting of autonomous robot-mediated behavioral interventions. Results indicate that leveraging temporal reasoning can improve policy generation and execution in LfD frameworks. Specifically, these models can be used to limit the action space of a robot during an interaction, simplifying policy selection and effectively addressing the issue of perceptual aliasing.
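The action-space-limiting idea at the end of the abstract can be sketched with a directed transition graph learned from demonstrated action sequences, loosely in the spirit of the TCG. This sketch keeps only bigram-style successors (interval-algebra relations are omitted), and the action names are illustrative assumptions, not taken from the thesis.

```python
# Hedged sketch: restricting a robot's next-action choices to successors
# observed in demonstrations. A defaultdict-of-sets stands in for the
# directed-graph component of a Temporal Context Graph; the n-gram here
# is just a bigram (n=2) for simplicity.

from collections import defaultdict

def build_transition_graph(demonstrations):
    """demonstrations: list of action-name sequences from HRI demos."""
    graph = defaultdict(set)
    for seq in demonstrations:
        for prev, nxt in zip(seq, seq[1:]):
            graph[prev].add(nxt)
    return graph

def allowed_actions(graph, current, full_action_set):
    """Limit the policy to successors seen after `current`; fall back to
    the full action set when the current action was never observed."""
    return graph.get(current, set(full_action_set)) or set(full_action_set)
```

Pruning the candidate set this way is one concrete mechanism by which temporal structure can simplify policy selection: two perceptually identical states can still yield different action sets if they were reached through different predecessor actions.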