13,437 research outputs found
Learning Correlations between Linguistic Indicators and Semantic Constraints: Reuse of Context-Dependent Descriptions of Entities
This paper presents the results of a study on the semantic constraints
imposed on lexical choice by certain contextual indicators. We show how such
indicators are computed and how correlations between them and the choice of a
noun phrase description of a named entity can be automatically established
using supervised learning. Based on this correlation, we have developed a
technique for automatic lexical choice of descriptions of entities in text
generation. We discuss the underlying relationship between the pragmatics of
choosing an appropriate description that serves a specific purpose in the
automatically generated text and the semantics of the description itself. We
present our work in the framework of the more general concept of reuse of
linguistic structures that are automatically extracted from large corpora. We
present a formal evaluation of our approach and we conclude with some thoughts
on potential applications of our method.
Comment: 7 pages, uses colacl.sty and acl.bst, uses epsfig. To appear in the
Proceedings of the Joint 17th International Conference on Computational
Linguistics and 36th Annual Meeting of the Association for Computational
Linguistics (COLING-ACL'98)
Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation
Natural language generation (NLG) is a critical component of spoken dialogue
systems. It can be divided into two phases: (1) sentence planning, which decides
the overall sentence structure, and (2) surface realization, which determines specific
word forms and flattens the sentence structure into a string. With the rise
of deep learning, most modern NLG models are based on a sequence-to-sequence
(seq2seq) model, which basically contains an encoder-decoder structure; these
NLG models generate sentences from scratch by jointly optimizing sentence
planning and surface realization. However, such a simple encoder-decoder
architecture usually fails to generate complex and long sentences, because the
decoder has difficulty learning all grammar and diction knowledge well. This
paper introduces an NLG model with a hierarchical attentional decoder, where
the hierarchy focuses on leveraging linguistic knowledge in a specific order.
The experiments show that the proposed method significantly outperforms the
traditional seq2seq model with a smaller model size, and the design of the
hierarchical attentional decoder can be applied to various NLG systems.
Furthermore, different generation strategies based on linguistic patterns are
investigated and analyzed in order to guide future NLG research work.
Comment: accepted by the 7th IEEE Workshop on Spoken Language Technology (SLT
2018). arXiv admin note: text overlap with arXiv:1808.0274
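Estudios acerca del establecimiento de conexiones entre enunciados hablados: ¿qué pueden contribuir a la promoción de la construcción de una representación coherente del discurso por parte de los estudiantes? [Studies on the establishment of connections between spoken statements: what can they contribute to promoting students' construction of a coherent representation of discourse?]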
The aim of this article is to provide an overview of how the establishment of discourse connections among spoken statements has been studied in discourse analysis and in psycholinguistics, in order to highlight which variables appear to be important for facilitating the comprehension of spoken discourse. Discourse analysis approaches allow us to consider the role of establishing discourse connections among speech acts in the classroom, the uses of contextualization cues by bilingual students, the identification of social and cultural notions in teachers' discourse, and the interactional effects of teachers' interventions. Preliminary psycholinguistic studies contribute to our understanding of the role of establishing causal connections and integrating adjacent statements through the presence of discourse markers in the comprehension of spoken discourse by college students. The results of these approaches and studies provide insight into students' comprehension of classroom discourse and have potential implications for instruction.
Affiliation: Yomha Cevasco, Jazmin. Universidad de Buenos Aires; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Broek, Paul van den. Leiden University; Netherlands
Learning to generate one-sentence biographies from Wikidata
We investigate the generation of one-sentence Wikipedia biographies from
facts derived from Wikidata slot-value pairs. We train a recurrent neural
network sequence-to-sequence model with attention to select facts and generate
textual summaries. Our model incorporates a novel secondary objective that
helps ensure it generates sentences that contain the input facts. The model
achieves a BLEU score of 41, improving significantly upon the vanilla
sequence-to-sequence model and scoring roughly twice that of a simple template
baseline. Human preference evaluation suggests the model is nearly as good as
the Wikipedia reference. Manual analysis explores content selection, suggesting
the model can trade the ability to infer knowledge against the risk of
hallucinating incorrect information.
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118
pages, 8 figures, 1 table