
    Receding Horizon Planning with Rule Hierarchies for Autonomous Vehicles

    Autonomous vehicles must often contend with conflicting planning requirements; e.g., safety and comfort could be at odds with each other if avoiding a collision calls for slamming the brakes. To resolve such conflicts, assigning an importance ranking to rules (i.e., imposing a rule hierarchy) has been proposed, which, in turn, induces a ranking on trajectories based on the importance of the rules they satisfy. On the one hand, imposing rule hierarchies can enhance interpretability, but it introduces combinatorial complexity into planning; on the other hand, differentiable reward structures can be leveraged by modern gradient-based optimization tools, but they are less interpretable and unintuitive to tune. In this paper, we present an approach to equivalently express rule hierarchies as differentiable reward structures amenable to modern gradient-based optimizers, thereby achieving the best of both worlds. We achieve this by formulating rank-preserving reward functions that are monotonic in the rank of the trajectories induced by the rule hierarchy, i.e., higher-ranked trajectories receive higher reward. Equipped with a rule hierarchy and its corresponding rank-preserving reward function, we develop a two-stage planner that can efficiently resolve conflicting planning requirements. We demonstrate that our approach can generate motion plans at ~7-10 Hz for various challenging road navigation and intersection negotiation scenarios.
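
    As a rough illustration of the rank-preserving idea (a minimal sketch, not the paper's exact formulation, assuming binary per-rule satisfaction flags), one can weight each rule so that a higher-priority rule outweighs all lower-priority rules combined:

```python
import numpy as np

def rank_preserving_reward(rule_satisfaction):
    """Scalar reward from binary rule-satisfaction flags, ordered from the
    most to the least important rule in the hierarchy.

    Weighting rule i by 2**(N - 1 - i) makes satisfying a higher-priority
    rule outweigh satisfying every lower-priority rule combined, so the
    reward is monotonic in the rank the hierarchy assigns to a trajectory.
    """
    s = np.asarray(rule_satisfaction, dtype=float)
    weights = 2.0 ** np.arange(len(s) - 1, -1, -1)  # [2^(N-1), ..., 2, 1]
    return float(np.dot(weights, s))

# A trajectory that satisfies only the top rule (say, collision avoidance)
# outranks one that satisfies all lower rules but violates the top one.
assert rank_preserving_reward([1, 0, 0]) > rank_preserving_reward([0, 1, 1])
```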

    Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies

    We use single-agent and multi-agent Reinforcement Learning (RL) for learning dialogue policies in a resource allocation negotiation scenario. Two agents learn concurrently by interacting with each other, without any need for simulated users (SUs) to train against or corpora to learn from. In particular, we compare the Q-learning, Policy Hill-Climbing (PHC) and Win or Learn Fast Policy Hill-Climbing (PHC-WoLF) algorithms, varying the scenario complexity (state space size), the number of training episodes, the learning rate, and the exploration rate. Our results show that generally Q-learning fails to converge, whereas PHC and PHC-WoLF always converge and perform similarly. We also show that very high, gradually decreasing exploration rates are required for convergence. We conclude that multi-agent RL of dialogue policies is a promising alternative to using single-agent RL with SUs or learning directly from corpora.
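
    For context, the sketch below gives a minimal tabular version of the PHC update the abstract compares; the class name and hyperparameter values are illustrative assumptions, and WoLF-PHC differs mainly in using a smaller policy step when the agent is "winning":

```python
import numpy as np

class PHCAgent:
    """Minimal Policy Hill-Climbing (PHC) sketch: tabular Q-learning plus a
    stochastic policy nudged toward the greedy action by a step delta.
    Sizes and hyperparameters are illustrative, not the paper's settings."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, delta=0.01):
        self.Q = np.zeros((n_states, n_actions))
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)
        self.alpha, self.gamma, self.delta = alpha, gamma, delta

    def act(self, state, epsilon=0.1):
        # epsilon-greedy exploration over the learned stochastic policy
        n_actions = self.pi.shape[1]
        if np.random.rand() < epsilon:
            return np.random.randint(n_actions)
        return np.random.choice(n_actions, p=self.pi[state])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning backup
        td_target = reward + self.gamma * self.Q[next_state].max()
        self.Q[state, action] += self.alpha * (td_target - self.Q[state, action])
        # Hill-climb: shift probability mass toward the greedy action by delta
        n_actions = self.pi.shape[1]
        greedy = self.Q[state].argmax()
        for a in range(n_actions):
            step = self.delta if a == greedy else -self.delta / (n_actions - 1)
            self.pi[state, a] = np.clip(self.pi[state, a] + step, 0.0, 1.0)
        self.pi[state] /= self.pi[state].sum()  # keep a valid distribution
```

    In the concurrent-learning setting described above, two such agents would interact directly, each treating the other as part of its environment, which is what removes the need for simulated users or corpora.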

    New ways of working in acute inpatient care: a case for change

    This position paper focuses on the current tensions and challenges of aligning inpatient care with innovations in mental health services. It argues that a cultural shift is required within inpatient services. Obstacles to change, including traditional perceptions of the role and responsibilities of the psychiatrist, are discussed. The paper urges all staff working in acute care to reflect on the service that they provide, and to consider how the adoption of new ways of working might revolutionise the organisational culture. This cultural shift offers inpatient staff the opportunity to fully utilise their expertise. New ways of working may be perceived as a threat to existing roles and responsibilities, or as an exciting opportunity for professional development with increased job satisfaction. Above all, the move to new ways of working, which is gathering pace throughout the UK, could offer service users a quality of care that meets their needs and expectations.
