Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation
Existing approaches to automatic VerbNet-style verb classification are
heavily dependent on feature engineering and therefore limited to languages
with mature NLP pipelines. In this work, we propose a novel cross-lingual
transfer method for inducing VerbNets for multiple languages. To the best of
our knowledge, this is the first study which demonstrates how the architectures
for learning word embeddings can be applied to this challenging
syntactic-semantic task. Our method uses cross-lingual translation pairs to tie
each of the six target languages into a bilingual vector space with English,
jointly specialising the representations to encode the relational information
from English VerbNet. A standard clustering algorithm is then run on top of the
VerbNet-specialised representations, using vector dimensions as features for
learning verb classes. Our results show that the proposed cross-lingual
transfer approach sets new state-of-the-art verb classification performance
across all six target languages explored in this work.
Comment: EMNLP 2017 (long paper)
Symbolic inductive bias for visually grounded learning of spoken language
A widespread approach to processing spoken language is to first automatically
transcribe it into text. An alternative is to use an end-to-end approach:
recent works have proposed to learn semantic embeddings of spoken language from
images with spoken captions, without an intermediate transcription step. We
propose to use multitask learning to exploit existing transcribed speech within
the end-to-end setting. We describe a three-task architecture which combines
the objectives of matching spoken captions with corresponding images, speech
with text, and text with images. We show that the addition of the speech/text
task leads to substantial performance improvements on image retrieval when
compared to training the speech/image task in isolation. We conjecture that
this is due to a strong inductive bias transcribed speech provides to the
model, and offer supporting evidence for this.
Comment: ACL 201
The System Kato: Detecting Cases of Plagiarism for Answer-Set Programs
Plagiarism detection is a growing need among educational institutions, and
solutions for different purposes exist. An important field in this direction is
detecting cases of source-code plagiarism. In this paper, we present the tool
Kato for supporting the detection of this kind of plagiarism in the area of
answer-set programming (ASP). Currently, the tool is implemented for DLV
programs but it is designed to handle other logic-programming dialects as well.
We review the basic features of Kato, introduce its theoretical underpinnings,
and discuss an application of Kato for plagiarism detection in the context of
courses on logic programming at the Vienna University of Technology.
Semantics, Modelling, and the Problem of Representation of Meaning -- a Brief Survey of Recent Literature
Over the past 50 years, many have debated what representation should be used
to capture the meaning of natural language utterances. Recently, research has
raised new requirements for such representations. Here I survey some of the
interesting representations proposed to meet these new needs.
Comment: 15 pages, no figure