1,926 research outputs found

Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems

Author: de Rijke Maarten
Kiseleva Julia
Li Ziming
Publication date: 01/01/2020

Dialogue policy learning for task-oriented dialogue systems has enjoyed great progress recently mostly through employing reinforcement learning methods. However, these approaches have become very sophisticated. It is time to re-evaluate it. Are we really making progress developing dialogue agents only based on reinforcement learning? We demonstrate how (1) traditional supervised learning together with (2) a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods. First, we introduce a simple dialogue action decoder to predict the appropriate actions. Then, the traditional multi-label classification solution for dialogue policy learning is extended by adding dense layers to improve the dialogue agent performance. Finally, we employ the Gumbel-Softmax estimator to alternatively train the dialogue agent and the dialogue reward model without using reinforcement learning. Based on our extensive experimentation, we can conclude the proposed methods can achieve more stable and higher performance with fewer efforts, such as the domain knowledge required to design a user simulator and the intractable parameter tuning in reinforcement learning. Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
Comment: 10 page

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Reinforcement Learning for Generative AI: A Survey

Author: Cao Yuanjiang
McAuley Julian
Sheng Quan Z.
Yao Lina
Publication date: 28/08/2023

Deep Generative AI has been a long-standing essential topic in the machine learning community, which can impact a number of application areas like text generation and computer vision. The major paradigm to train a generative model is maximum likelihood estimation, which pushes the learner to capture and approximate the target data distribution by decreasing the divergence between the model distribution and the target distribution. This formulation successfully establishes the objective of generative tasks, while it is incapable of satisfying all the requirements that a user might expect from a generative model. Reinforcement learning, serving as a competitive option to inject new training signals by creating new objectives that exploit novel signals, has demonstrated its power and flexibility to incorporate human inductive bias from multiple angles, such as adversarial learning, hand-designed rules and learned reward model to build a performant model. Thereby, reinforcement learning has become a trending research field and has stretched the limits of generative AI in both model design and application. It is reasonable to summarize and conclude advances in recent years with a comprehensive review. Although there are surveys in different application areas recently, this survey aims to shed light on a high-level review that spans a range of application areas. We provide a rigorous taxonomy in this area and make sufficient coverage on various models and applications. Notably, we also surveyed the fast-developing large language model area. We conclude this survey by showing the potential directions that might tackle the limit of current models and expand the frontiers for generative AI.

arXiv.org e-Print Archive

Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems

Author: de Rijke M.
Kiseleva J.
Li Z.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/11/2020

International Migration, Integration and Social Cohesion online publications

Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges

Author: Franceschelli Giorgio
Musolesi Mirco
Publication date: 08/02/2024

Generative Artificial Intelligence (AI) is one of the most exciting developments in Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has emerged as a very successful paradigm for a variety of machine learning tasks. In this survey, we discuss the state of the art, opportunities and open research questions in applying RL to generative AI. In particular, we will discuss three types of applications, namely, RL as an alternative way for generation without specified objectives; as a way for generating outputs while concurrently maximizing an objective function; and, finally, as a way of embedding desired characteristics, which cannot be easily captured by means of an objective function, into the generative process. We conclude the survey with an in-depth discussion of the opportunities and challenges in this fascinating emerging area.
Comment: Published in JAIR at https://www.jair.org/index.php/jair/article/view/1527

arXiv.org e-Print Archive

Guided Dialogue Policy Learning without Adversarial Learning in the Loop

Author: de Rijke M.
Gao J.
Kiseleva J.
Lee S.
Li J.
Li Z.
Peng B.
Shayandeh S.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/11/2020

International Migration, Integration and Social Cohesion online publications

Guided Dialogue Policy Learning without Adversarial Learning in the Loop

Author: de Rijke M.
Gao J.
Kiseleva J.
Lee S.
Li J.
Li Z.
Peng B.
Shayandeh S.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE