Learning to Embed Words in Context for Syntactic Tasks
We present models for embedding words in the context of surrounding words.
Such models, which we refer to as token embeddings, represent the
characteristics of a word that are specific to a given context, such as word
sense, syntactic category, and semantic role. We explore simple, efficient
token embedding models based on standard neural network architectures. We learn
token embeddings on a large amount of unannotated text and evaluate them as
features for part-of-speech taggers and dependency parsers trained on much
smaller amounts of annotated data. We find that predictors endowed with token
embeddings consistently outperform baseline predictors across a range of
context window and training set sizes. Comment: Accepted by ACL 2017 Repl4NLP workshop
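The abstract above explores token embeddings derived from a word's surrounding context. As a minimal, illustrative sketch (not the authors' neural models), a token embedding can be computed by pooling type-level vectors over a +/-k context window; the toy vectors below are hypothetical stand-ins for pretrained embeddings.

```python
# Hypothetical type-level (context-independent) word vectors.
TYPE_VECS = {
    "the":   [0.1, 0.2],
    "dog":   [0.9, 0.1],
    "barks": [0.4, 0.8],
}
UNK = [0.0, 0.0]  # fallback vector for out-of-vocabulary words


def token_embedding(tokens, i, k=2):
    """Embed tokens[i] from its +/-k context window by averaging the
    type vectors of the surrounding words (the target word excluded)."""
    window = tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]
    vecs = [TYPE_VECS.get(w, UNK) for w in window] or [UNK]
    dim = len(UNK)
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
```

The same word receives different vectors in different sentences, which is the property the abstract's models exploit for word sense and syntactic category; the paper's models replace this averaging with learned neural encoders.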
Structural Embedding of Syntactic Trees for Machine Comprehension
Deep neural networks for machine comprehension typically utilize only word
or character embeddings, without explicitly taking advantage of structured
linguistic information such as constituency trees and dependency trees. In this
paper, we propose structural embedding of syntactic trees (SEST), an algorithmic
framework that utilizes structured information and encodes it into vector
representations that can boost the performance of machine comprehension
algorithms. We evaluate our approach using a state-of-the-art neural
attention model on the SQuAD dataset. Experimental results demonstrate that our
model can accurately identify the syntactic boundaries of sentences and
extract answers that are more syntactically coherent than those of the baseline methods.
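As a hedged sketch of the general idea behind encoding tree structure into vectors (not the paper's exact SEST algorithm): each token can be paired with the sequence of syntactic labels on its root-to-token path, and that path mapped to a fixed-size vector, here a simple count vector over an assumed small label inventory.

```python
# Assumed toy label inventory; a real system would use a full
# constituency or dependency label set from a treebank.
LABELS = ["S", "NP", "VP", "DT", "NN", "VBZ"]


def path_embedding(path):
    """Map a root-to-token label path, e.g. ["S", "NP", "DT"], to a
    count vector over LABELS, giving a fixed-size structural feature."""
    return [path.count(lab) for lab in LABELS]
```

Such structural features can then be concatenated with word or character embeddings before being fed to a comprehension model, which is the kind of combination the abstract describes.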
The Expressive Power of Word Embeddings
We seek to better understand the differences in quality among several
publicly released embeddings. We propose several tasks that help to distinguish
the characteristics of different embeddings. Our evaluation of sentiment
polarity and synonym/antonym relations shows that embeddings are able to
capture surprisingly nuanced semantics even in the absence of sentence
structure. Moreover, benchmarking the embeddings shows great variance in
quality and characteristics of the semantics captured by the tested embeddings.
Finally, we show the impact of varying the number of dimensions and the
resolution of each dimension on the effective useful features captured by the
embedding space. Our contributions highlight the importance of embeddings for
NLP tasks and the effect of their quality on the final results. Comment: Submitted to ICML 2013 Deep Learning for Audio, Speech and Language
Processing Workshop. 8 pages, 8 figures.
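One probing task the abstract mentions, synonym/antonym relations, is commonly tested by comparing similarities of word-pair vectors. A minimal sketch of the underlying measure, cosine similarity, using made-up toy vectors rather than any of the evaluated embeddings:

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors: the dot
    product divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

An evaluation of the kind described would check whether, for a given embedding, synonym pairs score systematically higher under this measure than antonym pairs.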