20,785 research outputs found

    Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation

    Full text link
    Natural language generation (NLG) is a critical component in spoken dialogue systems, and it can be divided into two phases: (1) sentence planning, deciding the overall sentence structure, and (2) surface realization, determining specific word forms and flattening the sentence structure into a string. With the rise of deep learning, most modern NLG models are based on a sequence-to-sequence (seq2seq) model, which basically contains an encoder-decoder structure; these NLG models generate sentences from scratch by jointly optimizing sentence planning and surface realization. However, such a simple encoder-decoder architecture usually fails to generate complex and long sentences, because the decoder has difficulty learning all grammar and diction knowledge well. This paper introduces an NLG model with a hierarchical attentional decoder, where the hierarchy focuses on leveraging linguistic knowledge in a specific order. The experiments show that the proposed method significantly outperforms the traditional seq2seq model with a smaller model size, and the design of the hierarchical attentional decoder can be applied to various NLG systems. Furthermore, different generation strategies based on linguistic patterns are investigated and analyzed in order to guide future NLG research work. Comment: accepted by the 7th IEEE Workshop on Spoken Language Technology (SLT 2018). arXiv admin note: text overlap with arXiv:1808.0274
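    The encoder-decoder and hierarchical-decoder description above can be pictured with a minimal sketch. The PyTorch model below is an illustrative assumption, not the paper's implementation: a GRU encoder over the meaning representation plus two stacked decoder cells ("plan" then "realize"), each with its own attention over the encoder states, loosely mirroring the sentence-planning / surface-realization split. All names and sizes are invented for the example.

    # Minimal illustrative sketch (PyTorch) of a seq2seq NLG model with a
    # two-level "hierarchical" attentional decoder. Hypothetical layer names,
    # sizes, and decoding order; not the paper's architecture.
    import torch
    import torch.nn as nn

    class HierarchicalAttnDecoderNLG(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, emb=64, hid=128):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, emb)
            self.tgt_emb = nn.Embedding(tgt_vocab, emb)
            self.encoder = nn.GRU(emb, hid, batch_first=True)
            # Two stacked decoder cells: "plan" level, then "realize" level.
            self.plan_cell = nn.GRUCell(emb + hid, hid)
            self.realize_cell = nn.GRUCell(hid + hid, hid)
            self.attn = nn.Linear(hid, hid, bias=False)  # bilinear-style attention
            self.out = nn.Linear(hid, tgt_vocab)

        def attend(self, query, enc_states):
            # query: (B, hid); enc_states: (B, S, hid) -> context: (B, hid)
            scores = torch.bmm(enc_states, self.attn(query).unsqueeze(2))  # (B, S, 1)
            weights = torch.softmax(scores, dim=1)
            return (weights * enc_states).sum(dim=1)

        def forward(self, src_ids, tgt_ids):
            # src_ids: (B, S) meaning-representation tokens; tgt_ids: (B, T) words.
            enc_states, _ = self.encoder(self.src_emb(src_ids))  # (B, S, hid)
            B, hid = enc_states.size(0), enc_states.size(2)
            h_plan = enc_states.new_zeros(B, hid)
            h_real = enc_states.new_zeros(B, hid)
            logits = []
            for t in range(tgt_ids.size(1)):
                word = self.tgt_emb(tgt_ids[:, t])                  # teacher forcing
                ctx_plan = self.attend(h_plan, enc_states)          # plan-level attention
                h_plan = self.plan_cell(torch.cat([word, ctx_plan], dim=1), h_plan)
                ctx_real = self.attend(h_real, enc_states)          # realization-level attention
                h_real = self.realize_cell(torch.cat([h_plan, ctx_real], dim=1), h_real)
                logits.append(self.out(h_real))
            return torch.stack(logits, dim=1)                       # (B, T, tgt_vocab)

    # Toy usage with random ids, just to show the shapes flow through.
    model = HierarchicalAttnDecoderNLG(src_vocab=50, tgt_vocab=200)
    src = torch.randint(0, 50, (2, 7))
    tgt = torch.randint(0, 200, (2, 11))
    print(model(src, tgt).shape)  # torch.Size([2, 11, 200])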

    A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Get PDF
    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor in its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orientate the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature on advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistics
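    One concrete way to make "measuring the amount of reordering" tangible is Kendall's tau distance over the permutation induced by a word alignment, a metric commonly used in this literature. The sketch below is a simplified illustration under the assumption of a one-to-one alignment; it is not the survey's own formulation.

    # Kendall's tau distance over an alignment permutation:
    # 0.0 = monotone (no reordering), 1.0 = fully inverted order.
    from itertools import combinations

    def kendall_tau_distance(perm):
        """perm[i] = target position aligned to source word i (one-to-one assumed)."""
        n = len(perm)
        if n < 2:
            return 0.0
        # Count discordant pairs, i.e. source pairs whose order is swapped on the target side.
        swaps = sum(1 for i, j in combinations(range(n), 2) if perm[i] > perm[j])
        return swaps / (n * (n - 1) / 2)

    print(kendall_tau_distance([0, 1, 2, 3]))  # 0.0  -> monotone translation
    print(kendall_tau_distance([3, 2, 1, 0]))  # 1.0  -> fully inverted
    print(kendall_tau_distance([0, 2, 1, 3]))  # 0.17 -> one local swap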

    Ordering adjectives in referential communication

    Get PDF
    We contrasted two hypotheses concerning how speakers determine adjective order during referential communication. The discriminatory efficiency hypothesis claims that speakers place the most discriminating adjective early to facilitate referent identification. By contrast, the availability-based ordering hypothesis assumes that speakers produce the most available adjectives early to ease production. Experiment 1 showed that speakers use more pattern-before-color modifier orders (than the reverse) when pattern, not color, distinguished the referent from alternatives, providing support for the discriminatory efficiency hypothesis. Participants also overspecified color more often than pattern, and they generally favored color-before-pattern orders, in support of the availability-based ordering hypothesis. Experiments 2 and 3 replicated both effects in a dialogue setting, where speakers’ adjective ordering was also primed by their partner’s ordering, using conjoined and non-conjoined constructions. We propose a novel model (PASS) that explains how discriminability and availability simultaneously influence adjective selection and ordering via competition in the speaker’s message representation.

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Get PDF
    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table

    THE CHILD AND THE WORLD: How Children acquire Language

    Get PDF
    HOW CHILDREN ACQUIRE LANGUAGE Over the last few decades, research into child language acquisition has been revolutionized by the use of ingenious new techniques which allow one to investigate what infants (that is, children not yet able to speak) can in fact perceive when exposed to a stream of speech sound, and the discriminations they can make between different speech sounds, different speech sound sequences and different words. However, on the central features of the mystery, the extraordinarily rapid acquisition of lexicon and complex syntactic structures, little solid progress has been made. The questions being researched are how infants acquire and produce the speech sounds (phonemes) of the community language; how infants find words in the stream of speech; and how they link words to perceived objects or actions, that is, discover meanings. In a recent general review in Nature of children's language acquisition, Patricia Kuhl also asked why we do not learn new languages as easily at 50 as at 5, and why computers have not cracked the human linguistic code. The motor theory of language function and origin makes possible a plausible account of child language acquisition generally, from which answers can also be derived to these further questions. Why computers have so far been unable to 'crack' the language problem becomes apparent in the light of the motor theory account: computers can have no natural relation between words and their meanings; they have no conceptual store to which the network of words is linked, nor do they have the innate aspects of language functioning - represented by function words; computers have no direct links between speech sounds and movement patterns, and they do not have the instantly integrated neural patterning underlying thought - they necessarily operate serially and hierarchically. Adults find the acquisition of a new language much more difficult than children do because they are already neurally committed to the link between the words of their first language and the elements in their conceptual store. A second language being acquired by an adult is in direct competition for neural space with the network structures established for the first language.