Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
"single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.
Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 tables.
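The abstract's pairwise ranking formulation can be illustrated with a minimal sketch: each expert decision ranks the scheduled task above every task that was available but not chosen, and a linear scorer is fit on feature differences with logistic loss. All function and variable names here are hypothetical, and this is an assumption-laden illustration of the general technique, not the paper's actual implementation.

```python
import numpy as np

# Illustrative sketch: learn a scheduling heuristic via pairwise ranking.
# Each demonstration pairs the features of the task the expert scheduled
# next against the features of each task that was available but not chosen.

def pairwise_examples(demonstrations):
    """demonstrations: iterable of (chosen_features, [unchosen_features, ...])."""
    X, y = [], []
    for chosen, unchosen in demonstrations:
        for other in unchosen:
            X.append(chosen - other)   # chosen ranked above other -> label 1
            y.append(1.0)
            X.append(other - chosen)   # symmetric negative example -> label 0
            y.append(0.0)
    return np.array(X), np.array(y)

def fit_ranker(X, y, lr=0.1, steps=500):
    """Logistic regression by gradient descent on difference features."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))       # predicted preference probability
        w -= lr * X.T @ (p - y) / len(y)       # mean logistic-loss gradient
    return w

def pick_next(w, candidate_features):
    """At scheduling time, choose the candidate task with the highest score."""
    return int(np.argmax(candidate_features @ w))
```

Because the model operates only on feature differences between candidate actions, it is model-free in the sense the abstract describes: it never enumerates or iterates over the scheduling state space.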
Robotic Assistance in Coordination of Patient Care
We conducted a study to investigate trust in and dependence upon robotic
decision support among nurses and doctors on a labor and delivery floor.
There is evidence that suggestions provided by embodied agents engender
inappropriate degrees of trust and reliance among humans. This concern is a
critical barrier that must be addressed before fielding intelligent hospital
service robots that take initiative to coordinate patient care. Our
experiment was conducted with nurses and physicians, and evaluated the
subjects’ levels of trust in and dependence on high- and low-quality
recommendations issued by robotic versus computer-based decision support.
The support, generated through action-driven learning from expert
demonstration, was shown to produce high-quality recommendations that were
accepted by nurses and physicians at a compliance rate of 90%. Rates of
Type I and Type II errors were comparable between robotic and computer-based
decision support. Furthermore, embodiment appeared to benefit performance,
as indicated by a higher degree of appropriate dependence after the quality
of recommendations changed over the course of the experiment. These results
support the notion that a robotic assistant may be able to safely and
effectively assist in patient care. Finally, we conducted a pilot
demonstration in which a robot assisted resource nurses on a labor and
delivery floor at a tertiary care center.
National Science Foundation (U.S.) (Grant 2388357)
Impaired learning to dissociate advantageous and disadvantageous risky choices in adolescents
Adolescence is characterized by a surge in maladaptive risk-taking behaviors, but whether and how this relates to developmental changes in experience-based learning is largely unknown. In this preregistered study, we addressed this issue using a novel task that allowed us to separate the learning-driven optimization of risky choice behavior over time from overall risk-taking tendencies. Adolescents (12–17 years old) learned to dissociate advantageous from disadvantageous risky choices less well than adults (20–35 years old), and this impairment was stronger in early than mid-late adolescents. Computational modeling revealed that adolescents’ suboptimal performance was largely due to an inefficiency in core learning and choice processes. Specifically, adolescents used a simpler, suboptimal, expectation-updating process and a more stochastic choice policy. In addition, the modeling results suggested that adolescents, but not adults, overvalued the highest rewards. Finally, an exploratory latent-mixture model analysis indicated that a substantial proportion of the participants in each age group did not engage in experience-based learning but used a gambler’s fallacy strategy, stressing the importance of analyzing individual differences. Our results help understand why adolescents tend to make more, and more persistent, maladaptive risky decisions than adults when the values of these decisions have to be learned from experience.
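The expectation-updating process and stochastic choice policy that the abstract's computational modeling refers to can be sketched with the standard building blocks of such models: a delta-rule value update and a softmax choice rule whose inverse temperature controls choice stochasticity. Parameter names are hypothetical, and this is a generic textbook formulation, not the study's fitted model.

```python
import math
import random

# Illustrative sketch of experience-based risky-choice learning: each
# option's expected value is nudged toward observed outcomes by a delta
# rule, and choices are sampled from a softmax policy. A lower inverse
# temperature beta yields a noisier (more stochastic) choice policy.

def delta_update(q, reward, alpha):
    """Move the expectation q toward the observed reward at learning rate alpha."""
    return q + alpha * (reward - q)

def softmax_choice(values, beta, rng=random):
    """Sample an option index with probability proportional to exp(beta * Q)."""
    weights = [math.exp(beta * v) for v in values]
    r = rng.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(values) - 1
```

In this framing, the abstract's finding maps onto parameters: a "simpler, suboptimal, expectation-updating process" corresponds to a degraded update rule, and a "more stochastic choice policy" corresponds to a lower beta.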
Offline Contextual Multi-armed Bandits for Mobile Health Interventions: A Case Study on Emotion Regulation
Delivering treatment recommendations via pervasive electronic devices such as
mobile phones has the potential to be a viable and scalable treatment medium
for long-term health behavior management. But active experimentation of
treatment options can be time-consuming, expensive and altogether unethical in
some cases. There is a growing interest in methodological approaches that allow
an experimenter to learn and evaluate the usefulness of a new treatment
strategy before deployment. We present the first development of a treatment
recommender system for emotion regulation using real-world historical mobile
digital data from n = 114 high socially anxious participants to test the
usefulness of new emotion regulation strategies. We explore a number of offline
contextual bandits estimators for learning and propose a general framework for
learning algorithms. Our experimentation shows that the proposed doubly robust
offline learning algorithms performed significantly better than baseline
approaches, suggesting that this type of recommender algorithm could improve
emotion regulation. Given that emotion regulation is impaired across many
mental illnesses and such a recommender algorithm could be scaled up easily,
this approach holds potential to increase access to treatment for many people.
We also share some insights that allow us to translate contextual bandit models
to this complex real-world data, including which contextual features appear to
be most important for predicting emotion regulation strategy effectiveness.
Comment: Accepted at RecSys 202
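The doubly robust estimator the abstract builds on can be sketched as follows: it combines a regression model of rewards (the direct method) with an inverse-propensity-weighted correction on the logged action, so the estimate remains unbiased if either the reward model or the logging propensities are accurate. Function names and the data layout here are hypothetical, not the paper's implementation.

```python
# Illustrative sketch of the doubly robust (DR) off-policy value estimator
# for contextual bandits with K discrete actions (0 .. K-1).

def doubly_robust_value(data, K, reward_model, target_policy):
    """Estimate the value of target_policy from logged bandit data.

    data            : iterable of (context, logged_action, reward, propensity),
                      where propensity is the logging policy's probability of
                      the logged action in that context
    reward_model(x, a)  -> predicted reward for action a in context x
    target_policy(x, a) -> probability the target policy takes a in context x
    """
    total, n = 0.0, 0
    for x, a, r, p in data:
        # Direct-method term: model-based expected reward under the target policy.
        dm = sum(target_policy(x, b) * reward_model(x, b) for b in range(K))
        # Importance-weighted correction for the reward-model error on the
        # logged action; it vanishes when the model is exactly right.
        correction = target_policy(x, a) / p * (r - reward_model(x, a))
        total += dm + correction
        n += 1
    return total / n
```

For example, with a perfect reward model the correction term is zero and the estimate reduces to the direct method; with a perfect propensity model, the correction removes the reward model's bias in expectation.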