Story Cloze Ending Selection Baselines and Data Examination

Author: Frank Anette
Mihaylov Todor
Publication date: 01/01/2017

This paper describes two supervised baseline systems for the Story Cloze Test Shared Task (Mostafazadeh et al., 2016a). We first build a classifier using features based on word embeddings and semantic similarity computation. We further implement a neural LSTM system with different encoding strategies that try to model the relation between the story and the provided endings. Our experiments show that a model using representation features based on average word embedding vectors over the given story words and the candidate ending sentence words, together with similarity features between the story and candidate ending representations, performed better than the neural models. Our best model achieves an accuracy of 72.42, ranking 3rd in the official evaluation. Comment: Submission for the LSDSem 2017 - Linking Models of Lexical, Sentential and Discourse-level Semantics - Shared Task
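
The feature construction described in the abstract (averaged word embeddings for the story and for each candidate ending, plus a similarity feature between the two) can be illustrated roughly as follows. This is a minimal sketch, not the authors' system: the toy embedding table, the function names, and the rule of picking the ending with the highest cosine similarity are all assumptions made for illustration, whereas the paper trains a supervised classifier on such features.

```python
import math

# Hypothetical 2-dimensional toy embeddings (an assumption for illustration);
# a real system would load pretrained word vectors instead.
TOY_EMBEDDINGS = {
    "rain": [1.0, 0.0],
    "umbrella": [0.9, 0.1],
    "wet": [0.8, 0.2],
    "sun": [0.0, 1.0],
    "beach": [0.1, 0.9],
}


def average_vector(tokens, embeddings, dim=2):
    """Average the embeddings of the known tokens; zero vector if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ending_features(story_tokens, ending_tokens, embeddings):
    """Concatenate story average, ending average, and their cosine similarity."""
    story_vec = average_vector(story_tokens, embeddings)
    ending_vec = average_vector(ending_tokens, embeddings)
    return story_vec + ending_vec + [cosine(story_vec, ending_vec)]


def pick_ending(story_tokens, endings, embeddings):
    """Simplified stand-in for the classifier: choose the most similar ending."""
    scores = [ending_features(story_tokens, e, embeddings)[-1] for e in endings]
    return max(range(len(endings)), key=scores.__getitem__)


story = ["rain", "umbrella"]
candidate_endings = [["sun", "beach"], ["wet"]]
chosen = pick_ending(story, candidate_endings, TOY_EMBEDDINGS)
```

In a supervised setting, the vector returned by `ending_features` would be passed to a classifier rather than compared directly.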

arXiv.org e-Print Archive

TUbiblio

Crossref

Semantic-aware blocking for entity resolution

Author: Cui Mingyuan
Liang Huizhi
Wang Qing
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/08/2015

In this paper, we propose a semantic-aware blocking framework for entity resolution (ER). The proposed framework is built using locality-sensitive hashing (LSH) techniques, which efficiently unify both textual and semantic features into an ER blocking process. In order to understand how similarity metrics may affect the effectiveness of ER blocking, we study the robustness of similarity metrics and their properties in terms of LSH families. Then, we present how the semantic similarity of records can be captured, measured, and integrated with LSH techniques over multiple similarity spaces. In doing so, the proposed framework can support efficient similarity searches on records in both textual and semantic similarity spaces, yielding ER blocking with improved quality. We have evaluated the proposed framework over two real-world data sets, and compared it with the state-of-the-art blocking techniques. Our experimental study shows that the combination of semantic similarity and textual similarity can considerably improve the quality of blocking. Furthermore, due to the probabilistic nature of LSH, this semantic-aware blocking framework enables us to build fast and reliable blocking for performing entity resolution tasks in a large-scale data environment.
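
As a rough illustration of LSH-based blocking in general, the sketch below groups records whose MinHash signatures agree on at least one band. It covers only the textual side (character shingles) and is not the semantic-aware framework from the paper; the function names, the shingle size, and the band settings are assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Set of lowercase character k-grams of a record string."""
    text = text.lower()
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def minhash_signature(items, num_hashes=20):
    """MinHash signature: for each seed, the minimum seeded hash over the set."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.md5(f"{seed}:{item}".encode("utf-8")).digest()[:8], "big"
            )
            for item in items
        ))
    return signature


def lsh_blocks(records, num_hashes=20, bands=10):
    """Return candidate blocks: sets of record ids that collide in some band."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for record_id, text in records.items():
        signature = minhash_signature(shingles(text), num_hashes)
        for band in range(bands):
            # Records land in the same bucket only if the whole band matches.
            key = (band, tuple(signature[band * rows:(band + 1) * rows]))
            buckets[key].add(record_id)
    return [ids for ids in buckets.values() if len(ids) > 1]


records = {
    "a": "Qing Wang",
    "b": "Qing Wang",
    "c": "zzzz yyyy",
}
blocks = lsh_blocks(records)
```

Only record pairs that share a block would then be compared in detail. More bands with fewer rows each make collisions more likely, which trades precision for recall.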

Central Archive at the University of Reading

Crossref

The Australian National University

Sigmoid similarity - a new feature-based similarity measure

Author: Cena F.
Likavec S.
Lombardi I.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

Institutional Research Information System University of Turin

Automatic Generation of Grounded Visual Questions

Author: Qu Lizhen
Yang Zhenglu
You Shaodi
Zhang Jiawan
Zhang Shijie
Publication date: 29/05/2017

In this paper, we propose the first model able to generate visually grounded questions of diverse types for a single image. Visual question generation is an emerging topic which aims to ask questions in natural language based on visual input. To the best of our knowledge, there are no automatic methods for generating meaningful questions of various types for the same visual input. To address this problem, we propose a model that automatically generates visually grounded questions of varying types. Our model takes as input both images and the captions generated by a dense caption model, samples the most probable question types, and generates the questions in sequence. The experimental results on two real-world datasets show that our model outperforms the strongest baseline in terms of both correctness and diversity by a wide margin. Comment: VQ

arXiv.org e-Print Archive

Crossref