From Frequency to Meaning: Vector Space Models of Semantics
Computers understand very little of the meaning of human language. This
profoundly limits our ability to give instructions to computers, the ability of
computers to explain their actions to us, and the ability of computers to
analyse and process text. Vector space models (VSMs) of semantics are beginning
to address these limits. This paper surveys the use of VSMs for semantic
processing of text. We organize the literature on VSMs according to the
structure of the matrix in a VSM. There are currently three broad classes of
VSMs, based on term-document, word-context, and pair-pattern matrices, yielding
three classes of applications. We survey a broad range of applications in these
three categories and we take a detailed look at a specific open source project
in each category. Our goal in this survey is to show the breadth of
applications of VSMs for semantics, to provide a new perspective on VSMs for
those who are already familiar with the area, and to provide pointers into the
literature for those who are less familiar with the field.
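To make the term-document class of VSM concrete, here is a minimal sketch, not taken from the survey: it builds a raw-count term-document matrix for a toy corpus and compares documents by cosine similarity. The corpus, the names `docs`, `doc_vector` and `cosine`, and the use of unweighted counts are all assumptions made for illustration; practical systems typically apply a weighting scheme such as tf-idf before comparing vectors.

```python
import math
from collections import Counter

# Toy corpus (hypothetical documents, chosen only for illustration).
docs = {
    "d1": "the cat sat on the mat",
    "d2": "the cat chased the mouse",
    "d3": "stocks fell on the market",
}

# The vocabulary fixes the row order of the term-document matrix.
vocab = sorted({word for text in docs.values() for word in text.split()})


def doc_vector(text):
    """Return one column of the matrix: raw term counts in vocab order."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]


def cosine(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = {name: doc_vector(text) for name, text in docs.items()}

# Documents that share more vocabulary end up closer together.
sim_12 = cosine(vectors["d1"], vectors["d2"])
sim_13 = cosine(vectors["d1"], vectors["d3"])
print(f"d1 vs d2: {sim_12:.3f}")
print(f"d1 vs d3: {sim_13:.3f}")
```

In this toy setting the two documents about a cat score higher against each other than against the finance sentence, which is the intuition behind using such matrices for document similarity.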
SemEval-2016 Task 13: Taxonomy Extraction Evaluation (TExEval-2)
This paper describes the second edition of the shared task on Taxonomy Extraction Evaluation, organised as part of SemEval 2016. The task aims to extract hypernym-hyponym relations between a given list of domain-specific terms and then to construct a domain taxonomy based on them. TExEval-2 introduced a multilingual setting for this task, covering four languages (English, Dutch, Italian and French) and domains as diverse as environment, food and science. A total of 62 runs submitted by 5 different teams were evaluated using structural measures, by comparison with gold standard taxonomies, and by manual quality assessment of novel relations.
Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 (INSIGHT
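As a rough illustration of the second step of the task (turning extracted hypernym-hyponym pairs into a taxonomy), the sketch below builds a parent-to-children map and runs two simple structural checks: finding root terms and detecting cycles. The pairs and the function names `build_taxonomy`, `find_roots` and `has_cycle` are hypothetical, and whether these particular checks correspond to the structural measures used in the shared task is an assumption.

```python
from collections import defaultdict

# Hypothetical (hypernym, hyponym) pairs for a toy food domain.
relations = [
    ("food", "fruit"),
    ("food", "vegetable"),
    ("fruit", "apple"),
    ("fruit", "pear"),
    ("vegetable", "carrot"),
]


def build_taxonomy(pairs):
    """Map each hypernym to the list of its direct hyponyms."""
    children = defaultdict(list)
    for hypernym, hyponym in pairs:
        children[hypernym].append(hyponym)
    return dict(children)


def find_roots(pairs):
    """Terms that appear as a hypernym but never as a hyponym."""
    hypernyms = {hyper for hyper, _ in pairs}
    hyponyms = {hypo for _, hypo in pairs}
    return sorted(hypernyms - hyponyms)


def has_cycle(children):
    """Depth-first search; a cycle exists if we revisit a node still in progress."""
    unvisited, in_progress, done = 0, 1, 2
    state = defaultdict(int)  # every node starts as unvisited

    def visit(node):
        state[node] = in_progress
        for child in children.get(node, []):
            if state[child] == in_progress:
                return True
            if state[child] == unvisited and visit(child):
                return True
        state[node] = done
        return False

    return any(state[node] == unvisited and visit(node) for node in list(children))


taxonomy = build_taxonomy(relations)
roots = find_roots(relations)
cyclic = has_cycle(taxonomy)
print("roots:", roots)
print("contains cycle:", cyclic)
```

A well-formed taxonomy would normally be acyclic, so a check like `has_cycle` is a cheap sanity test to run before comparing the output against a gold standard.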
Visually Grounded Meaning Representations
In this paper we address the problem of grounding distributional representations of lexical meaning. We introduce a new
model which uses stacked autoencoders to learn higher-level representations from textual and visual input. The visual modality is
encoded via vectors of attributes obtained automatically from images. We create a new large-scale taxonomy of 600 visual attributes
representing more than 500 concepts and 700K images. We use this dataset to train attribute classifiers and integrate their predictions
with text-based distributional models of word meaning. We evaluate our model on its ability to simulate word similarity judgments and
concept categorization. On both tasks, our model yields a better fit to behavioral data compared to baselines and related models which
either rely on a single modality or do not make use of attribute-based input.