17,720 research outputs found

    Model-based Bayesian Reinforcement Learning for Dialogue Management

    Get PDF
    Reinforcement learning methods are increasingly used to optimise dialogue policies from experience. Most current techniques are model-free: they directly estimate the utility of various actions, without explicit model of the interaction dynamics. In this paper, we investigate an alternative strategy grounded in model-based Bayesian reinforcement learning. Bayesian inference is used to maintain a posterior distribution over the model parameters, reflecting the model uncertainty. This parameter distribution is gradually refined as more data is collected and simultaneously used to plan the agent's actions. Within this learning framework, we carried out experiments with two alternative formalisations of the transition model, one encoded with standard multinomial distributions, and one structured with probabilistic rules. We demonstrate the potential of our approach with empirical results on a user simulator constructed from Wizard-of-Oz data in a human-robot interaction scenario. The results illustrate in particular the benefits of capturing prior domain knowledge with high-level rules

    Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users

    Full text link
    Static recommendation methods like collaborative filtering suffer from the inherent limitation of performing real-time personalization for cold-start users. Online recommendation, e.g., multi-armed bandit approach, addresses this limitation by interactively exploring user preference online and pursuing the exploration-exploitation (EE) trade-off. However, existing bandit-based methods model recommendation actions homogeneously. Specifically, they only consider the items as the arms, being incapable of handling the item attributes, which naturally provide interpretable information of user's current demands and can effectively filter out undesired items. In this work, we consider the conversational recommendation for cold-start users, where a system can both ask the attributes from and recommend items to a user interactively. This important scenario was studied in a recent work. However, it employs a hand-crafted function to decide when to ask attributes or make recommendations. Such separate modeling of attributes and items makes the effectiveness of the system highly rely on the choice of the hand-crafted function, thus introducing fragility to the system. To address this limitation, we seamlessly unify attributes and items in the same arm space and achieve their EE trade-offs automatically using the framework of Thompson Sampling. Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play. Extensive experiments on three benchmark datasets show that ConTS outperforms the state-of-the-art methods Conversational UCB (ConUCB) and Estimation-Action-Reflection model in both metrics of success rate and average number of conversation turns.Comment: TOIS 202

    Agents for educational games and simulations

    Get PDF
    This book consists mainly of revised papers that were presented at the Agents for Educational Games and Simulation (AEGS) workshop held on May 2, 2011, as part of the Autonomous Agents and MultiAgent Systems (AAMAS) conference in Taipei, Taiwan. The 12 full papers presented were carefully reviewed and selected from various submissions. The papers are organized topical sections on middleware applications, dialogues and learning, adaption and convergence, and agent applications

    Models of everywhere revisited: a technological perspective

    Get PDF
    The concept ‘models of everywhere’ was first introduced in the mid 2000s as a means of reasoning about the environmental science of a place, changing the nature of the underlying modelling process, from one in which general model structures are used to one in which modelling becomes a learning process about specific places, in particular capturing the idiosyncrasies of that place. At one level, this is a straightforward concept, but at another it is a rich multi-dimensional conceptual framework involving the following key dimensions: models of everywhere, models of everything and models at all times, being constantly re-evaluated against the most current evidence. This is a compelling approach with the potential to deal with epistemic uncertainties and nonlinearities. However, the approach has, as yet, not been fully utilised or explored. This paper examines the concept of models of everywhere in the light of recent advances in technology. The paper argues that, when first proposed, technology was a limiting factor but now, with advances in areas such as Internet of Things, cloud computing and data analytics, many of the barriers have been alleviated. Consequently, it is timely to look again at the concept of models of everywhere in practical conditions as part of a trans-disciplinary effort to tackle the remaining research questions. The paper concludes by identifying the key elements of a research agenda that should underpin such experimentation and deployment
    • …
    corecore