

Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems

Author: Gasic Milica
Mrksic Nikola
Su Pei-Hao
Vandyke David
Wen Tsung-Hsien
Young Steve
Publication date: 01/01/2015

Natural language generation (NLG) is a critical component of spoken dialogue and it has a significant impact both on usability and perceived quality. Most NLG systems in common use employ rules and heuristics and tend to generate rigid and stylised responses without the natural variation of human language. They are also not easily scaled to systems covering multiple domains and languages. This paper presents a statistical language generator based on a semantically controlled Long Short-term Memory (LSTM) structure. The LSTM generator can learn from unaligned data by jointly optimising sentence planning and surface realisation using a simple cross entropy training criterion, and language variation can be easily achieved by sampling from output candidates. With fewer heuristics, an objective evaluation in two differing test domains showed the proposed method improved performance compared to previous methods. Human judges scored the LSTM system higher on informativeness and naturalness and overall preferred it to the other systems. Comment: To appear in EMNLP 201

arXiv.org e-Print Archive

Crossref

Optimising Spoken Dialogue Strategies within the Reinforcement Learning Paradigm

Author: Pietquin Olivier
Publication venue: 'IntechOpen'
Publication date: 01/01/2008


Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking

Author: Gašić M
Kim D
Mrkšić N
Su PH
Vandyke D
Wen TH
Young S
Publication date: 01/01/2015

The natural language generation (NLG) component of a spoken dialogue system (SDS) usually needs a substantial amount of handcrafting or a well-labeled dataset to be trained on. These limitations add significantly to development costs and make cross-domain, multi-lingual dialogue systems intractable. Moreover, human languages are context-aware. The most natural response should be directly learned from data rather than depending on predefined syntaxes or rules. This paper presents a statistical language generator based on a joint recurrent and convolutional neural network structure which can be trained on dialogue act-utterance pairs without any semantic alignments or predefined grammar trees. Objective metrics suggest that this new model outperforms previous methods under the same experimental conditions. Results of an evaluation by human judges indicate that it produces not only high-quality but also linguistically varied utterances, which are preferred over those of n-gram and rule-based systems. Comment: To appear in SigDial 201

arXiv.org e-Print Archive

Crossref

CUED - Cambridge University Engineering Department

Crowd-sourcing NLG Data: Pictures Elicit Better Data

Author: Lemon Oliver
Novikova Jekaterina
Rieser Verena
Publication date: 01/01/2016

Recent advances in corpus-based Natural Language Generation (NLG) hold the promise of being easily portable across domains, but require costly training data, consisting of meaning representations (MRs) paired with Natural Language (NL) utterances. In this work, we propose a novel framework for crowdsourcing high quality NLG training data, using automatic quality control measures and evaluating different MRs with which to elicit data. We show that pictorial MRs result in better NL data being collected than logic-based MRs: utterances elicited by pictorial MRs are judged as significantly more natural, more informative, and better phrased, with a significant increase in average quality ratings (around 0.5 points on a 6-point scale), compared to using the logical MRs. As the MR becomes more complex, the benefits of pictorial stimuli increase. The collected data will be released as part of this submission. Comment: The 9th International Natural Language Generation conference INLG, 2016. 10 pages, 2 figures, 3 tables

arXiv.org e-Print Archive

Crossref

Heriot Watt Pure

Using multimedia to enhance the accessibility of the learning environment for disabled students: reflections from the Skills for Access project

Author: Gregor Peter
Sloan David
Stratford John
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2006

As educators' awareness of their responsibilities towards ensuring the accessibility of the learning environment to disabled students increases, significant debate surrounds the implications of accessibility requirements on educational multimedia. There would appear to be widespread concern that the fundamental principles of creating accessible web‐based materials seem at odds with the creative and innovative use of multimedia to support learning and teaching, as well as concerns over the time and cost of providing accessibility features that can hold back resource development and application. Yet, effective use of multimedia offers a way of enhancing the accessibility of the learning environment for many groups of disabled students. Using the development of ‘Skills for Access’, a web resource supporting the dual aims of creating optimally accessible multimedia for learning, as an example, the attitudinal, practical and technical challenges facing the effective use of multimedia as an accessibility aid in a learning environment will be explored. Reasons why a holistic approach to accessibility may be the most effective in ensuring that multimedia reaches its full potential in enabling and supporting students in learning, regardless of any disability they may have, will be outlined and discussed.

Crossref

ALT Open Access Repository

Directory of Open Access Journals

Scaling up deep reinforcement learning for multi-domain dialogue systems

Author: Carse Jacob
Cuayahuitl Heriberto
Williamson Ashley
Yu Seunghak
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2017

Standard deep reinforcement learning methods such as Deep Q-Networks (DQN) for multiple tasks (domains) face scalability problems due to large search spaces. This paper proposes a three-stage method for multi-domain dialogue policy learning, termed NDQN, and applies it to an information-seeking spoken dialogue system in the domains of restaurants and hotels. In this method, the first stage does multi-policy learning via a network of DQN agents; the second makes use of compact state representations by compressing raw inputs; and the third stage applies a pre-training phase for bootstrapping the behaviour of agents in the network. Experimental results comparing DQN (baseline) versus NDQN (proposed) using simulations report that the proposed method exhibits better scalability and is promising for optimising the behaviour of multi-domain dialogue systems. An additional evaluation reports that the NDQN agents outperformed a K-Nearest Neighbour baseline in task success and dialogue length, yielding more efficient and successful dialogues.

University of Lincoln Institutional Repository

Introduction for speech and language for interactive robots

Author: Argentieri
Athanasopoulos
Bordes
Chen
Cuayáhuitl
Ferreira
Gabriel Skantze
Heriberto Cuayáhuitl
Kazunori Komatani
Lee
Lison
Lorenzo-Trueba
Mavridis
Misu
Nose
Yoshino
Zukerman
Publication venue: 'Elsevier BV'
Publication date: 01/11/2015

This special issue includes research articles which apply spoken language processing to robots that interact with human users through speech, possibly combined with other modalities. Robots that can listen to human speech, understand it, interact according to the conveyed meaning, and respond represent major research and technological challenges. Their common aim is to equip robots with natural interaction abilities. However, robotics and spoken language processing are areas that are typically studied within their respective communities with limited communication across disciplinary boundaries. The articles in this special issue represent examples that address the need for an increased multidisciplinary exchange of ideas.

University of Lincoln Institutional Repository

Crossref

Heriot Watt Pure

Conversational natural language interaction for place-related knowledge acquisition

Author: Bartie Phil
Dalmas Tiphaine
Goetze Jana
Janarthanam Srini
Lemon Oliver
Liu Xingkun
Mackaness William
Publication venue: Kloster Seeon, Germany
Publication date: 01/01/2012

We focus on the problems of using Natural Language interaction to support pedestrians in their place-related knowledge acquisition. Our case study for this discussion is a smartphone-based Natural Language interface that allows users to acquire spatial and cultural knowledge of a city. The framework consists of a spoken dialogue-based information system and a smartphone client. The system is novel in combining geographic information system (GIS) modules such as a visibility engine with a question-answering (QA) system. Users can use the smartphone client to engage in a variety of interleaved conversations such as navigating from A to B, using the QA functionality to learn more about points of interest (PoI) nearby, and searching for amenities and tourist attractions. This system explores a variety of research questions involving Natural Language interaction for acquisition of knowledge about space and place.

Heriot Watt Pure

Stirling Online Research Repository (RIOXX)

Stirling Online Research Repository

Generating multimedia presentations: from plain text to screenplay

Author: A. Jameson
A. Rubinstein
B. Krenn
C. Hartshorne
E. Reiter
E. H. Hovy
G. Nunberg
H. Kamp
H. Levesque
J. Bateman
J. Kuppevelt Van
K. Deemter Van
N. Bouayad-Agha
P. Grice
Paraboni
R. Dale
W. Wahlster
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

In many Natural Language Generation (NLG) applications, the output is limited to plain text – i.e., a string of words with punctuation and paragraph breaks, but no indications for layout, or pictures, or dialogue. In several projects, we have begun to explore NLG applications in which these extra media are brought into play. This paper gives an informal account of what we have learned. For coherence, we focus on the domain of patient information leaflets, and follow an example in which the same content is expressed first in plain text, then in formatted text, then in text with pictures, and finally in a dialogue script that can be performed by two animated agents. We show how the same meaning can be mapped to realisation patterns in different media, and how the expanded options for expressing meaning are related to the perceived style and tone of the presentation. Throughout, we stress that the extra media are not simply added to plain text, but integrated with it: thus the use of formatting, or pictures, or dialogue, may require radical rewording of the text itself.

Crossref

Open Research Online (The Open University)