706 research outputs found
Multi-Objective Intrinsic Reward Learning for Conversational Recommender Systems
Conversational Recommender Systems (CRS) actively elicit user preferences to
generate adaptive recommendations. Mainstream reinforcement learning-based CRS
solutions heavily rely on handcrafted reward functions, which may not be
aligned with user intent in CRS tasks. Therefore, the design of task-specific
rewards is critical to facilitate CRS policy learning, which remains largely
under-explored in the literature. In this work, we propose a novel approach to
address this challenge by learning intrinsic rewards from interactions with
users. Specifically, we formulate intrinsic reward learning as a
multi-objective bi-level optimization problem. The inner level optimizes the
CRS policy augmented by the learned intrinsic rewards, while the outer level
drives the intrinsic rewards to optimize two CRS-specific objectives:
maximizing the success rate and minimizing the number of turns to reach a
successful recommendation in conversations. To evaluate the effectiveness of
our approach, we conduct extensive experiments on three public CRS benchmarks.
The results show that our algorithm significantly improves CRS performance by
exploiting informative learned intrinsic rewards.
Comment: 11 pages
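The bi-level structure described in this abstract can be written out roughly as follows. This is only a sketch under assumed notation, not the paper's own formulation: $\theta$ (policy parameters), $\phi$ (intrinsic reward parameters), $r^{\mathrm{ex}}$ (the handcrafted reward), $r^{\mathrm{in}}_{\phi}$ (the learned intrinsic reward), $\mathrm{SR}$ (success rate) and $\mathrm{AT}$ (average number of turns to a successful recommendation) are all symbols introduced here for illustration.

```latex
\begin{aligned}
\text{(inner)}\quad & \theta^{*}(\phi) \;=\; \arg\max_{\theta}\;
  \mathbb{E}_{\tau \sim \pi_{\theta}}
  \Big[\sum_{t=0}^{T-1} \gamma^{t}\big(r^{\mathrm{ex}}_{t} + r^{\mathrm{in}}_{\phi}(s_t, a_t)\big)\Big] \\
\text{(outer)}\quad & \max_{\phi}\;
  \big(\,\mathrm{SR}(\pi_{\theta^{*}(\phi)}),\; -\,\mathrm{AT}(\pi_{\theta^{*}(\phi)})\,\big)
\end{aligned}
```

The outer problem is vector-valued because the two objectives (higher success rate, fewer turns) can conflict. How the paper trades them off, and whether the intrinsic reward is added to the extrinsic reward exactly as above, is not stated in the abstract.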
A Review of Reinforcement Learning for Natural Language Processing, and Applications in Healthcare
Reinforcement learning (RL) has emerged as a powerful approach for tackling
complex medical decision-making problems such as treatment planning,
personalized medicine, and optimizing the scheduling of surgeries and
appointments. It has gained significant attention in the field of Natural
Language Processing (NLP) due to its ability to learn optimal strategies for
tasks such as dialogue systems, machine translation, and question-answering.
This paper presents a review of the RL techniques in NLP, highlighting key
advancements, challenges, and applications in healthcare. The review begins by
visualizing a roadmap of machine learning and its applications in healthcare.
It then explores the integration of RL with NLP tasks. We examine dialogue
systems where RL enables the learning of conversational strategies, RL-based
machine translation models, question-answering systems, text summarization, and
information extraction. Additionally, ethical considerations and biases in
RL-NLP systems are addressed.
On Transforming Reinforcement Learning by Transformer: The Development Trajectory
The transformer, originally devised for natural language processing, has also
achieved significant success in computer vision. Thanks to its strong expressive
power, researchers are investigating ways to deploy transformers to
reinforcement learning (RL), and transformer-based models have shown
their potential in representative RL benchmarks. In this paper, we collect and
dissect recent advances on transforming RL by transformer (transformer-based RL
or TRL), in order to explore its development trajectory and future trend. We
group existing developments in two categories: architecture enhancement and
trajectory optimization, and examine the main applications of TRL in robotic
manipulation, text-based games, navigation and autonomous driving. For
architecture enhancement, these methods consider how to apply the powerful
transformer structure to RL problems under the traditional RL framework, which
model agents and environments much more precisely than deep RL methods, but
they are still limited by the inherent defects of traditional RL algorithms,
such as bootstrapping and the "deadly triad". For trajectory optimization, these
methods treat RL problems as sequence modeling and train a joint state-action
model over entire trajectories under the behavior cloning framework, which are
able to extract policies from static datasets and fully use the long-sequence
modeling capability of the transformer. Given these advancements, extensions
and challenges in TRL are reviewed and proposals about future directions are
discussed. We hope that this survey can provide a detailed introduction to TRL
and motivate future research in this rapidly developing field.
Comment: 26 pages
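To make the "RL as sequence modeling" idea in this abstract more concrete, here is a minimal sketch of how one logged trajectory might be flattened into a token sequence for a sequence model. It assumes a return-conditioned layout (return-to-go, state, action per step), which is one common choice in this family of methods and not necessarily the one the survey emphasizes. The function names and the `("rtg", ...)` tagging scheme are hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of rewards from each step to the end of the episode."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each entry accumulates all later rewards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_token_sequence(
    states: Sequence[Any], actions: Sequence[Any], rewards: Sequence[float]
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, state, action) for every step of one trajectory.

    The tagged tuples stand in for embeddings; a real model would map each
    tag to its own embedding layer (an assumption of this sketch).
    """
    if not (len(states) == len(actions) == len(rewards)):
        raise ValueError("states, actions and rewards must have equal length")
    sequence: List[Tuple[str, Any]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("rtg", g), ("state", s), ("action", a)])
    return sequence


if __name__ == "__main__":
    # A two-step logged episode from a static dataset.
    print(to_token_sequence(["s0", "s1"], ["a0", "a1"], [1.0, 2.0]))
```

A model trained to predict each action token from the preceding tokens is then fit with an ordinary supervised loss on the static dataset, which is why the abstract places these methods under the behavior cloning framework.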
Reinforcement Learning Approaches in Social Robotics
This article surveys reinforcement learning approaches in social robotics.
Reinforcement learning is a framework for decision-making problems in which an
agent interacts through trial-and-error with its environment to discover an
optimal behavior. Since interaction is a key component in both reinforcement
learning and social robotics, it can be a well-suited approach for real-world
interactions with physically embodied social robots. The scope of the paper is
focused particularly on studies that include social physical robots and
real-world human-robot interactions with users. We present a thorough analysis
of reinforcement learning approaches in social robotics. In addition to a
survey, we categorize existing reinforcement learning approaches based on the
method used and the design of the reward mechanisms. Moreover, since
communication capability is a prominent feature of social robots, we discuss
and group the papers based on the communication medium used for reward
formulation. Considering the importance of designing the reward function, we
also provide a categorization of the papers based on the nature of the reward.
This categorization includes three major themes: interactive reinforcement
learning, intrinsically motivated methods, and task performance-driven methods.
The paper also discusses the benefits and challenges of reinforcement learning
in social robotics, whether the surveyed papers use subjective and algorithmic
evaluation measures, real-world reinforcement learning challenges and proposed
solutions, and the points that remain to be explored, including approaches that
have thus far received less attention. Thus, this paper aims to become a starting point
for researchers interested in using and applying reinforcement learning methods
in this particular research field.
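As a small illustration of the trial-and-error loop this abstract describes, the sketch below shows a single tabular Q-learning update. Q-learning is used here only as a familiar example of learning from interaction; the abstract does not say which algorithms the surveyed papers use. The state and action names (`"greet"`, `"wave"`, `"engaged"`) and the reward value are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable

QTable = DefaultDict[Hashable, DefaultDict[Hashable, float]]


def make_q_table() -> QTable:
    """Q-values default to 0.0 for unseen state-action pairs."""
    return defaultdict(lambda: defaultdict(float))


def q_update(
    q: QTable,
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    alpha: float = 0.5,
    gamma: float = 0.9,
) -> float:
    """Apply one Q-learning step and return the updated Q(state, action)."""
    # Value of the best known action in the next state (0.0 if none seen yet).
    best_next = max(q[next_state].values(), default=0.0)
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


if __name__ == "__main__":
    q = make_q_table()
    # Hypothetical interaction: the robot waves, the user responds positively.
    print(q_update(q, "greet", "wave", reward=1.0, next_state="engaged"))
```

In a social-robotics setting the `reward` argument is where the design choices surveyed in the paper would enter, for example a signal derived from human feedback or from task performance.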
Proceedings of the 2012 Workshop on Ambient Intelligence Infrastructures (WAmIi)
This is a technical report including the papers presented at the Workshop on Ambient Intelligence Infrastructures (WAmIi), which took place in conjunction with the International Joint Conference on Ambient Intelligence (AmI) in Pisa, Italy, on November 13, 2012. The motivation for organizing the workshop was the wish to learn from past experience with Ambient Intelligence systems, and in particular from the lessons learned about the system architecture of such systems. A significant number of European projects and other research efforts have been carried out, often with the goal of developing AmI technology to showcase AmI scenarios. We believe that the system architecture is essential for AmI to gain wider acceptance.