
    Evaluating the Representational Hub of Language and Vision Models

    The multimodal models used in the emerging field at the intersection of computational linguistics and computer vision implement the bottom-up processing of the 'Hub and Spoke' architecture proposed in cognitive science to represent how the brain processes and combines multi-sensory inputs. In particular, the Hub is implemented as a neural network encoder. We investigate the effect on this encoder of various vision-and-language tasks proposed in the literature: visual question answering, visual reference resolution, and visually grounded dialogue. To measure the quality of the representations learned by the encoder, we use two kinds of analyses. First, we evaluate the encoder pre-trained on the different vision-and-language tasks on an existing diagnostic task designed to assess multimodal semantic understanding. Second, we carry out a battery of analyses aimed at studying how the encoder merges and exploits the two modalities. Comment: Accepted to IWCS 201

    Crowd-sourcing NLG Data: Pictures Elicit Better Data

    Recent advances in corpus-based Natural Language Generation (NLG) hold the promise of being easily portable across domains, but require costly training data, consisting of meaning representations (MRs) paired with Natural Language (NL) utterances. In this work, we propose a novel framework for crowdsourcing high-quality NLG training data, using automatic quality control measures and evaluating different MRs with which to elicit data. We show that pictorial MRs result in better NL data being collected than logic-based MRs: utterances elicited by pictorial MRs are judged as significantly more natural, more informative, and better phrased, with a significant increase in average quality ratings (around 0.5 points on a 6-point scale), compared to using the logical MRs. As the MR becomes more complex, the benefits of pictorial stimuli increase. The collected data will be released as part of this submission. Comment: The 9th International Natural Language Generation conference INLG, 2016. 10 pages, 2 figures, 3 tables

    IMAGINE Final Report


    Harmonised Principles for Public Participation in Quality Assurance of Integrated Water Resources Modelling

    The main purpose of public participation in integrated water resources modelling is to improve decision-making by ensuring that decisions are soundly based on shared knowledge, experience and scientific evidence. The present paper describes stakeholder involvement in the modelling process. The point of departure is the guidelines for quality assurance for 'scientific' water resources modelling developed under the EU research project HarmoniQuA, which has developed a computer-based Modelling Support Tool (MoST) to provide user-friendly guidance and a quality assurance framework that aim to enhance the credibility of river basin modelling. MoST prescribes interaction, which is a form of participation above consultation but below engagement of stakeholders and the public in the early phases of the modelling cycle and under review tasks throughout the process. MoST is a flexible tool which supports different types of users and facilitates interaction between modeller, manager and stakeholders. The perspective of using MoST for engagement of stakeholders, e.g. higher-level participation throughout the modelling process as part of integrated water resource management, is evaluated.

    Piecewise Latent Variables for Neural Variational Text Processing

    Advances in neural variational inference have facilitated the learning of powerful directed graphical models with continuous latent variables, such as variational autoencoders. The hope is that such models will learn to represent rich, multi-modal latent factors in real-world data, such as natural language text. However, current models often assume simplistic priors on the latent variables - such as the uni-modal Gaussian distribution - which are incapable of representing complex latent factors efficiently. To overcome this restriction, we propose the simple, but highly flexible, piecewise constant distribution. This distribution has the capacity to represent an exponential number of modes of a latent target distribution, while remaining mathematically tractable. Our results demonstrate that incorporating this new latent distribution into different models yields substantial improvements in natural language processing tasks such as document modeling and natural language generation for dialogue. Comment: 19 pages, 2 figures, 8 tables; EMNLP 201

    Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses

    Automatically evaluating the quality of dialogue responses for unstructured domains is a challenging problem. Unfortunately, existing automatic evaluation metrics are biased and correlate very poorly with human judgements of response quality. Yet having an accurate automatic evaluation procedure is crucial for dialogue research, as it allows rapid prototyping and testing of new models with fewer expensive human evaluations. In response to this challenge, we formulate automatic dialogue evaluation as a learning problem. We present an evaluation model (ADEM) that learns to predict human-like scores for input responses, using a new dataset of human response scores. We show that the ADEM model's predictions correlate significantly, and at a level much higher than word-overlap metrics such as BLEU, with human judgements at both the utterance and system levels. We also show that ADEM can generalize to evaluating dialogue models unseen during training, an important step for automatic dialogue evaluation. Comment: ACL 201