Better Word Embeddings by Disentangling Contextual n-Gram Information

Author: Gupta Prakhar
Jaggi Martin
Pagliardini Matteo
Publication date: 01/01/2019

Pre-trained word vectors are ubiquitous in Natural Language Processing applications. In this paper, we show how training word embeddings jointly with bigram and even trigram embeddings results in improved unigram embeddings. We claim that training word embeddings along with higher n-gram embeddings helps remove contextual information from the unigrams, resulting in better stand-alone word embeddings. We empirically show the validity of our hypothesis by outperforming competing word representation models by a significant margin on a wide variety of tasks. We make our models publicly available. Comment: NAACL 201

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Non-distributional Word Vector Representations

Author: Dyer Chris
Faruqui Manaal
Publication date: 01/01/2015

Data-driven representation learning for words is a technique of central importance in NLP. While indisputably useful as a source of features in downstream tasks, such vectors tend to consist of uninterpretable components whose relationship to the categories of traditional lexical semantic theories is tenuous at best. We present a method for constructing interpretable word vectors from hand-crafted linguistic resources such as WordNet and FrameNet. These vectors are binary (i.e., contain only 0 and 1) and are 99.9% sparse. We analyze their performance on state-of-the-art evaluation methods for distributional models of word vectors and find they are competitive with standard distributional approaches. Comment: Proceedings of ACL 201

arXiv.org e-Print Archive

Crossref

Natural Notation for the Domestic Internet of Things

Author: A Blackwell
AF Blackwell
AK Dey
BA Myers
EW Dijkstra
G Lucci
JA Rode
JR Searle
SR Petrick
Publication date: 06/03/2015

This study explores the use of natural language to give instructions that might be interpreted by Internet of Things (IoT) devices in a domestic 'smart home' environment. We start from the proposition that reminders can be considered as a type of end-user programming, in which the executed actions might be performed either by an automated agent or by the author of the reminder. We conducted an experiment in which people wrote sticky notes specifying future actions in their home. In different conditions, these notes were addressed to themselves, to others, or to a computer agent. We analyse the linguistic features and strategies that are used to achieve these tasks, including the use of graphical resources as an informal visual language. The findings provide a basis for design guidance related to end-user development for the Internet of Things. Comment: Proceedings of the 5th International Symposium on End-User Development (IS-EUD), Madrid, Spain, May, 201

arXiv.org e-Print Archive

Crossref

Online Research @ Cardiff

The Australian National University