Search CORE

375 research outputs found

Reinforcement Learning for Generative AI: A Survey

Author: Cao Yuanjiang
McAuley Julian
Sheng Quan Z.
Yao Lina
Publication venue
Publication date: 28/08/2023
Field of study

Deep Generative AI has been a long-standing essential topic in the machine learning community, which can impact a number of application areas like text generation and computer vision. The major paradigm to train a generative model is maximum likelihood estimation, which pushes the learner to capture and approximate the target data distribution by decreasing the divergence between the model distribution and the target distribution. This formulation successfully establishes the objective of generative tasks, while it is incapable of satisfying all the requirements that a user might expect from a generative model. Reinforcement learning, serving as a competitive option to inject new training signals by creating new objectives that exploit novel signals, has demonstrated its power and flexibility to incorporate human inductive bias from multiple angles, such as adversarial learning, hand-designed rules and learned reward model to build a performant model. Thereby, reinforcement learning has become a trending research field and has stretched the limits of generative AI in both model design and application. It is reasonable to summarize and conclude advances in recent years with a comprehensive review. Although there are surveys in different application areas recently, this survey aims to shed light on a high-level review that spans a range of application areas. We provide a rigorous taxonomy in this area and make sufficient coverage on various models and applications. Notably, we also surveyed the fast-developing large language model area. We conclude this survey by showing the potential directions that might tackle the limit of current models and expand the frontiers for generative AI

arXiv.org e-Print Archive

Recommended from our members

Data-Driven Policy Optimisation for Multi-Domain Task-Oriented Dialogue

Author: Budzianowski Pawel
Publication venue: University of Cambridge
Publication date: 21/03/2021
Field of study

Recent developments in machine learning along with a general shift in the public attitude towards digital personal assistants has opened new frontiers for conversational systems. Nevertheless, building data-driven multi-domain conversational agents that act optimally given a dialogue context is an open challenge. The first step towards that goal is developing an efficient way of learning a dialogue policy in new domains. Secondly, it is important to have the ability to collect and utilise human-human conversational data to bootstrap an agent's knowledge. The work presented in this thesis demonstrates how a neural dialogue manager fine-tuned with reinforcement learning presents a viable approach for learning a dialogue policy efficiently and across many domains. The thesis starts by introducing a dialogue management module that learns through interactions to act optimally given a current context of a conversation. The current shift towards neural, parameter-rich systems does not fully address the problem of error noise coming from speech recognition or natural language understanding components. A Bayesian approach is therefore proposed to learn more robust and effective policy management in direct interactions without any prior data. By putting a distribution over model weights, the learning agent is less prone to overfit to particular dialogue realizations and a more efficient exploration policy can be therefore employed. The results show that deep reinforcement learning performs on par with non-parametric models even in a low data regime while significantly reducing the computational complexity compared with the previous state-of-the-art. The deployment of a dialogue manager without any pre-training on human conversations is not a viable option from an industry perspective. However, the progress in building statistical systems, particularly dialogue managers, is hindered by the scale of data available. To address this fundamental obstacle, a novel data-collection pipeline entirely based on crowdsourcing without the need for hiring professional annotators is introduced. The validation of the approach results in the collection of the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully labeled collection of human-human written conversations spanning over multiple domains and topics. The proposed dataset creates a set of new benchmarks (belief tracking, policy optimisation, and response generation) significantly raising the complexity of analysed dialogues. The collected dataset serves as a foundation for a novel reinforcement learning (RL)-based approach for training a multi-domain dialogue manager. A Multi-Action and Slot Dialogue Agent (MASDA) is proposed to combat some limitations: 1) handling complex multi-domain dialogues with multiple concurrent actions present in a single turn; and 2) lack of interpretability, which consequently impedes the use of intermediate signals (e.g., dialogue turn annotations) if such signals are available. MASDA explicitly models system acts and slots using intermediate signals, resulting in an improved task-based end-to-end framework. The model can also select concurrent actions in a single turn, thus enriching the representation of the generated responses. The proposed framework allows for RL training of dialogue task completion metrics when dealing with concurrent actions. The results demonstrate the advantages of both 1) handling concurrent actions and 2) exploiting intermediate signals: MASDA outperforms previous end-to-end frameworks while also offering improved scalability.EPSR

Apollo (Cambridge)

Investigation of object oriented programming techniques for embedded generation switchgear design

Author: McCabe Paul Robert
Publication venue: The University of Edinburgh
Publication date: 01/01/2000
Field of study

Edinburgh Research Archive

An artificial intelligence-based collaboration approach in industrial IoT manufacturing : key concepts, architectural extensions and potential applications

Author: Angelopoulos Angelos
Calvo Daniel
Chintamani Keshav
Fernandez Izaskun
Gkonis Panagiotis
Irigaray Aitor Arnaiz
Karkazis Panagiotis
Leligou Nelly
Pariente Tomas
Parreira Josiane Xavier
Petrali Pierluigi
Ramallo-Gonzalez Alfonso P.
Sarakis Lambros
Simoens Pieter
Skarmeta Antonio
Trakadas Panagiotis
Trochoutsos Christos
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
Field of study

The digitization of manufacturing industry has led to leaner and more efficient production, under the Industry 4.0 concept. Nowadays, datasets collected from shop floor assets and information technology (IT) systems are used in data-driven analytics efforts to support more informed business intelligence decisions. However, these results are currently only used in isolated and dispersed parts of the production process. At the same time, full integration of artificial intelligence (AI) in all parts of manufacturing systems is currently lacking. In this context, the goal of this manuscript is to present a more holistic integration of AI by promoting collaboration. To this end, collaboration is understood as a multi-dimensional conceptual term that covers all important enablers for AI adoption in manufacturing contexts and is promoted in terms of business intelligence optimization, human-in-the-loop and secure federation across manufacturing sites. To address these challenges, the proposed architectural approach builds on three technical pillars: (1) components that extend the functionality of the existing layers in the Reference Architectural Model for Industry 4.0; (2) definition of new layers for collaboration by means of human-in-the-loop and federation; (3) security concerns with AI-powered mechanisms. In addition, system implementation aspects are discussed and potential applications in industrial environments, as well as business impacts, are presented

Ghent University Academic Bibliography

Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review

Author: Božić Bojan
Hamilton Kyle
Longo Luca
Nayak Aparna
Publication venue
Publication date: 30/06/2022
Field of study

Advocates for Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP), would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering the question of whether NeSy is indeed meeting its promises: reasoning, out-of-distribution generalization, interpretability, learning and reasoning from small data, and transferability to new domains. We examine the impact of knowledge representation, such as rules and semantic networks, language structure and relational structure, and whether implicit or explicit reasoning contributes to higher promise scores. We find that systems where logic is compiled into the neural network lead to the most NeSy goals being satisfied, while other factors such as knowledge representation, or type of neural architecture do not exhibit a clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human level reasoning, which impact decisions about model architectures and drive conclusions which are not always consistent across studies. Hence we advocate for a more methodical approach to the application of theories of human reasoning as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We make our data and code available on github for further analysis.Comment: Surve

arXiv.org e-Print Archive