    Human-Machine Collaborative Optimization via Apprenticeship Scheduling

    Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator. Comment: Portions of this paper were published in the Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper consists of 50 pages with 11 figures and 4 tables.
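The pairwise ranking idea mentioned in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the feature layout, the function names, and the use of a simple perceptron as the pairwise classifier are hypothetical and are not taken from the paper, which may use a different learner and feature set. The sketch turns each expert decision into "chosen task minus alternative" difference vectors, learns a linear scorer from them, and then picks the highest-scoring candidate.

```python
from typing import List, Sequence, Tuple

Features = Sequence[float]


def pairwise_examples(candidates: Sequence[Features], chosen_index: int) -> List[Tuple[List[float], int]]:
    """Build (difference vector, label) pairs from one expert decision."""
    chosen = candidates[chosen_index]
    examples = []
    for i, other in enumerate(candidates):
        if i == chosen_index:
            continue
        diff = [c - o for c, o in zip(chosen, other)]
        examples.append((diff, 1))                    # chosen preferred over other
        examples.append(([-d for d in diff], -1))     # mirrored negative example
    return examples


def train_perceptron(examples: Sequence[Tuple[Sequence[float], int]], epochs: int = 20) -> List[float]:
    """Fit a linear scorer with the perceptron rule (stand-in classifier, an assumption)."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, label in examples:
            score = sum(w * f for w, f in zip(weights, features))
            if label * score <= 0:                    # misclassified or on the boundary
                weights = [w + label * f for w, f in zip(weights, features)]
    return weights


def pick_task(weights: Sequence[float], candidates: Sequence[Features]) -> int:
    """Return the index of the candidate with the highest learned score."""
    scores = [sum(w * f for w, f in zip(weights, c)) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


if __name__ == "__main__":
    # Hypothetical demonstrations: each task is (deadline, duration); the index is the expert's pick.
    demos = [([[5.0, 2.0], [3.0, 4.0], [9.0, 1.0]], 1), ([[2.0, 3.0], [7.0, 1.0]], 0)]
    data = [ex for cands, idx in demos for ex in pairwise_examples(cands, idx)]
    w = train_perceptron(data)
    print(pick_task(w, [[8.0, 1.0], [4.0, 5.0], [6.0, 2.0]]))
```

With a linear scorer, summing pairwise preferences reduces to taking the argmax of the per-task score, which is why `pick_task` does not loop over pairs explicitly.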

    Copyright and Non-Print Media

    Robotic Assistance in Coordination of Patient Care

    We conducted a study to investigate trust in and dependence upon robotic decision support among nurses and doctors on a labor and delivery floor. There is evidence that suggestions provided by embodied agents engender inappropriate degrees of trust and reliance among humans. This concern is a critical barrier that must be addressed before fielding intelligent hospital service robots that take initiative to coordinate patient care. Our experiment was conducted with nurses and physicians, and evaluated the subjects’ levels of trust in and dependence on high- and low-quality recommendations issued by robotic versus computer-based decision support. The support, generated through action-driven learning from expert demonstration, was shown to produce high-quality recommendations that were accepted by nurses and physicians at a compliance rate of 90%. Rates of Type I and Type II errors were comparable between robotic and computer-based decision support. Furthermore, embodiment appeared to benefit performance, as indicated by a higher degree of appropriate dependence after the quality of recommendations changed over the course of the experiment. These results support the notion that a robotic assistant may be able to safely and effectively assist in patient care. Finally, we conducted a pilot demonstration in which a robot assisted resource nurses on a labor and delivery floor at a tertiary care center. National Science Foundation (U.S.) (Grant 2388357

    Impaired learning to dissociate advantageous and disadvantageous risky choices in adolescents

    Adolescence is characterized by a surge in maladaptive risk-taking behaviors, but whether and how this relates to developmental changes in experience-based learning is largely unknown. In this preregistered study, we addressed this issue using a novel task that allowed us to separate the learning-driven optimization of risky choice behavior over time from overall risk-taking tendencies. Adolescents (12–17 years old) learned to dissociate advantageous from disadvantageous risky choices less well than adults (20–35 years old), and this impairment was stronger in early than mid-late adolescents. Computational modeling revealed that adolescents’ suboptimal performance was largely due to an inefficiency in core learning and choice processes. Specifically, adolescents used a simpler, suboptimal, expectation-updating process and a more stochastic choice policy. In addition, the modeling results suggested that adolescents, but not adults, overvalued the highest rewards. Finally, an exploratory latent-mixture model analysis indicated that a substantial proportion of the participants in each age group did not engage in experience-based learning but used a gambler’s fallacy strategy, stressing the importance of analyzing individual differences. Our results help understand why adolescents tend to make more, and more persistent, maladaptive risky decisions than adults when the values of these decisions have to be learned from experience.

    Offline Contextual Multi-armed Bandits for Mobile Health Interventions: A Case Study on Emotion Regulation

    Delivering treatment recommendations via pervasive electronic devices such as mobile phones has the potential to be a viable and scalable treatment medium for long-term health behavior management. But active experimentation of treatment options can be time-consuming, expensive and altogether unethical in some cases. There is a growing interest in methodological approaches that allow an experimenter to learn and evaluate the usefulness of a new treatment strategy before deployment. We present the first development of a treatment recommender system for emotion regulation using real-world historical mobile digital data from n = 114 high socially anxious participants to test the usefulness of new emotion regulation strategies. We explore a number of offline contextual bandits estimators for learning and propose a general framework for learning algorithms. Our experimentation shows that the proposed doubly robust offline learning algorithms performed significantly better than baseline approaches, suggesting that this type of recommender algorithm could improve emotion regulation. Given that emotion regulation is impaired across many mental illnesses and such a recommender algorithm could be scaled up easily, this approach holds potential to increase access to treatment for many people. We also share some insights that allow us to translate contextual bandit models to this complex real-world data, including which contextual features appear to be most important for predicting emotion regulation strategy effectiveness. Comment: Accepted at RecSys 202
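For readers unfamiliar with the term, the following is a minimal sketch of the standard doubly robust off-policy value estimate for a deterministic target policy. It is not the authors' algorithm; the log format, the function names, and the toy reward model are hypothetical and chosen only to show how a reward-model prediction is corrected with an importance-weighted residual when the logged action matches the target policy's action.

```python
from typing import Callable, Hashable, Sequence, Tuple

# One logged interaction: (context, action taken, observed reward, logging propensity).
LoggedStep = Tuple[Hashable, Hashable, float, float]


def doubly_robust_value(
    logs: Sequence[LoggedStep],
    target_policy: Callable[[Hashable], Hashable],
    reward_model: Callable[[Hashable, Hashable], float],
) -> float:
    """Estimate the average reward the target policy would earn on the logged contexts."""
    total = 0.0
    for context, action, reward, propensity in logs:
        chosen = target_policy(context)
        # Start from the model's prediction for the action the target policy would take.
        estimate = reward_model(context, chosen)
        # If the log happens to contain that action, correct the prediction with
        # the importance-weighted residual between observed and predicted reward.
        if chosen == action:
            estimate += (reward - reward_model(context, action)) / propensity
        total += estimate
    return total / len(logs)


if __name__ == "__main__":
    # Hypothetical logged data and models, for illustration only.
    logs = [("anxious", "reappraise", 1.0, 0.5), ("calm", "distract", 0.0, 0.5)]
    policy = lambda context: "reappraise"
    model = lambda context, action: 0.6 if action == "reappraise" else 0.2
    print(doubly_robust_value(logs, policy, model))
```

The estimate remains consistent if either the reward model or the logged propensities are accurate, which is the usual motivation for preferring it over a pure model-based or pure importance-weighted estimate.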