17,720 research outputs found
Model-based Bayesian Reinforcement Learning for Dialogue Management
Reinforcement learning methods are increasingly used to optimise dialogue
policies from experience. Most current techniques are model-free: they directly
estimate the utility of various actions, without explicit model of the
interaction dynamics. In this paper, we investigate an alternative strategy
grounded in model-based Bayesian reinforcement learning. Bayesian inference is
used to maintain a posterior distribution over the model parameters, reflecting
the model uncertainty. This parameter distribution is gradually refined as more
data is collected and simultaneously used to plan the agent's actions. Within
this learning framework, we carried out experiments with two alternative
formalisations of the transition model, one encoded with standard multinomial
distributions, and one structured with probabilistic rules. We demonstrate the
potential of our approach with empirical results on a user simulator
constructed from Wizard-of-Oz data in a human-robot interaction scenario. The
results illustrate in particular the benefits of capturing prior domain
knowledge with high-level rules
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users
Static recommendation methods like collaborative filtering suffer from the
inherent limitation of performing real-time personalization for cold-start
users. Online recommendation, e.g., multi-armed bandit approach, addresses this
limitation by interactively exploring user preference online and pursuing the
exploration-exploitation (EE) trade-off. However, existing bandit-based methods
model recommendation actions homogeneously. Specifically, they only consider
the items as the arms, being incapable of handling the item attributes, which
naturally provide interpretable information of user's current demands and can
effectively filter out undesired items. In this work, we consider the
conversational recommendation for cold-start users, where a system can both ask
the attributes from and recommend items to a user interactively. This important
scenario was studied in a recent work. However, it employs a hand-crafted
function to decide when to ask attributes or make recommendations. Such
separate modeling of attributes and items makes the effectiveness of the system
highly rely on the choice of the hand-crafted function, thus introducing
fragility to the system. To address this limitation, we seamlessly unify
attributes and items in the same arm space and achieve their EE trade-offs
automatically using the framework of Thompson Sampling. Our Conversational
Thompson Sampling (ConTS) model holistically solves all questions in
conversational recommendation by choosing the arm with the maximal reward to
play. Extensive experiments on three benchmark datasets show that ConTS
outperforms the state-of-the-art methods Conversational UCB (ConUCB) and
Estimation-Action-Reflection model in both metrics of success rate and average
number of conversation turns.Comment: TOIS 202
Agents for educational games and simulations
This book consists mainly of revised papers that were presented at the Agents for Educational Games and Simulation (AEGS) workshop held on May 2, 2011, as part of the Autonomous Agents and MultiAgent Systems (AAMAS) conference in Taipei, Taiwan. The 12 full papers presented were carefully reviewed and selected from various submissions. The papers are organized topical sections on middleware applications, dialogues and learning, adaption and convergence, and agent applications
Models of everywhere revisited: a technological perspective
The concept ‘models of everywhere’ was first introduced in the mid 2000s as a means of reasoning about the
environmental science of a place, changing the nature of the underlying modelling process, from one in which
general model structures are used to one in which modelling becomes a learning process about specific places, in
particular capturing the idiosyncrasies of that place. At one level, this is a straightforward concept, but at another
it is a rich multi-dimensional conceptual framework involving the following key dimensions: models of everywhere,
models of everything and models at all times, being constantly re-evaluated against the most current
evidence. This is a compelling approach with the potential to deal with epistemic uncertainties and nonlinearities.
However, the approach has, as yet, not been fully utilised or explored. This paper examines the
concept of models of everywhere in the light of recent advances in technology. The paper argues that, when first
proposed, technology was a limiting factor but now, with advances in areas such as Internet of Things, cloud
computing and data analytics, many of the barriers have been alleviated. Consequently, it is timely to look again
at the concept of models of everywhere in practical conditions as part of a trans-disciplinary effort to tackle the
remaining research questions. The paper concludes by identifying the key elements of a research agenda that
should underpin such experimentation and deployment
- …