
    Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards

    Training chatbots with reinforcement learning is challenging due to high-dimensional states, infinite action spaces, and the difficulty of specifying a reward function. We address these problems using clustered actions instead of infinite actions, and a simple but promising reward function based on human-likeness scores derived from human-human dialogue data. We train Deep Reinforcement Learning (DRL) agents on chitchat data in raw text, without any manual annotations. Experimental results using different splits of the training data show the following. First, our agents learn reasonable policies in the environments they are familiarised with, but their performance drops substantially when they are exposed to a test set of unseen dialogues. Second, sentence embedding sizes of 100 and 300 dimensions show no significant difference on test data. Third, our proposed human-likeness rewards are reasonable for training chatbots as long as lengthy dialogue histories of ≥10 sentences are used.
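    As a hedged illustration of the two ideas named in this abstract, the Python sketch below clusters sentence embeddings into a finite action set and scores a dialogue history by its similarity to human-human data. The `embed` encoder, the example sentences, and the reward formulation are assumptions for illustration, not the paper's exact method.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.metrics.pairwise import cosine_similarity

        def embed(sentences, dim=100):
            """Stand-in for a real sentence encoder (100 or 300 dims, as compared in the paper)."""
            out = np.zeros((len(sentences), dim))
            for i, s in enumerate(sentences):
                rng = np.random.default_rng(abs(hash(s)) % (2**32))  # deterministic per sentence
                out[i] = rng.normal(size=dim)
            return out

        candidates = ["hi there!", "how are you doing?", "lovely weather today",
                      "tell me about yourself", "what do you do for fun?"]

        # Clustered actions: each cluster id becomes one discrete action, so a
        # value-based agent ranks a small finite set rather than all sentences.
        actions = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embed(candidates))

        # Human-likeness reward (assumed formulation): score a dialogue history
        # by how close its embedding lies to real human-human dialogue histories.
        human_histories = embed(["hi there! how are you doing? great, thanks."])

        def human_likeness_reward(history_text):
            h = embed([history_text])
            return float(cosine_similarity(h, human_histories).max())

        print(dict(zip(candidates, actions)))
        print(human_likeness_reward("hi there! how are you doing?"))

    With a real encoder in place of `embed`, the same pipeline would reduce an unbounded response space to a handful of cluster ids, which is what makes value-based DRL tractable here.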

    Ensemble-Based Deep Reinforcement Learning for Chatbots

    Trainable chatbots that exhibit fluent and human-like conversations remain a major challenge in artificial intelligence. Deep Reinforcement Learning (DRL) is promising for addressing this challenge, but its successful application remains an open question. This article describes a novel ensemble-based approach applied to value-based DRL chatbots, which use finite action sets as a form of meaning representation. In our approach, dialogue actions are derived from sentence clustering, while the training datasets in our ensemble are derived from dialogue clustering; the latter aims to induce specialised agents that each learn to interact in a particular style. To facilitate neural chatbot training with our proposed approach, we assume dialogue data in raw text only, without any manually labelled data. Experimental results using chitchat data reveal that (1) near human-like dialogue policies can be induced, (2) generalisation to unseen data is a difficult problem, and (3) training an ensemble of chatbot agents is essential for improved performance over a single agent. In addition to evaluations on held-out data, our results are supported by a human evaluation that rated dialogues in terms of fluency, engagingness and consistency, and which revealed that our proposed dialogue rewards strongly correlate with human judgements.
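    The sketch below, under stated assumptions, illustrates the ensemble mechanism described above: raw-text dialogues are clustered into styles, one specialised agent is notionally trained per cluster, and at test time the agent with the highest value estimate for the current state responds. The `Agent` class and its `q_values` stub are hypothetical placeholders, not the authors' implementation.

        import numpy as np
        from sklearn.cluster import KMeans

        def split_by_dialogue_cluster(dialogue_embeddings, dialogues, k=3):
            """Group raw-text dialogues into k style clusters, one per specialised agent."""
            labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(dialogue_embeddings)
            return [[d for d, lab in zip(dialogues, labels) if lab == c] for c in range(k)]

        class Agent:
            """Placeholder for a value-based DRL chatbot (e.g. a DQN over clustered actions)."""
            def __init__(self, cluster_data):
                self.cluster_data = cluster_data  # dialogues this agent specialises on
            def q_values(self, state):
                return np.zeros(10)               # stub: per-action value estimates
            def respond(self, state):
                return int(np.argmax(self.q_values(state)))

        def ensemble_respond(agents, state):
            """Let the specialised agent with the highest value estimate answer."""
            best = max(agents, key=lambda a: float(a.q_values(state).max()))
            return best.respond(state)

    The design choice worth noting is that specialisation happens at the data level (dialogue clusters), while the shared action space (sentence clusters) lets any agent's chosen action be decoded the same way.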