Learning to Embed Words in Context for Syntactic Tasks
We present models for embedding words in the context of surrounding words.
Such models, which we refer to as token embeddings, represent the
characteristics of a word that are specific to a given context, such as word
sense, syntactic category, and semantic role. We explore simple, efficient
token embedding models based on standard neural network architectures. We learn
token embeddings on a large amount of unannotated text and evaluate them as
features for part-of-speech taggers and dependency parsers trained on much
smaller amounts of annotated data. We find that predictors endowed with token
embeddings consistently outperform baseline predictors across a range of
context window and training set sizes. Comment: Accepted by ACL 2017 Repl4NLP workshop
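The abstract above explores token embeddings derived from a word's surrounding context. As a minimal, illustrative sketch (not the authors' neural models), a token embedding can be computed by pooling type-level vectors over a +/-k context window; the toy vectors below are hypothetical stand-ins for pretrained embeddings.

```python
# Hypothetical type-level (context-independent) word vectors.
TYPE_VECS = {
    "the":   [0.1, 0.2],
    "dog":   [0.9, 0.1],
    "barks": [0.4, 0.8],
}
UNK = [0.0, 0.0]  # fallback vector for out-of-vocabulary words


def token_embedding(tokens, i, k=2):
    """Embed tokens[i] from its +/-k context window by averaging the
    type vectors of the surrounding words (the target word excluded)."""
    window = tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]
    vecs = [TYPE_VECS.get(w, UNK) for w in window] or [UNK]
    dim = len(UNK)
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
```

The same word receives different vectors in different sentences, which is the property the abstract's models exploit for word sense and syntactic category; the paper's models replace this averaging with learned neural encoders.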
Structural Embedding of Syntactic Trees for Machine Comprehension
Deep neural networks for machine comprehension typically utilize only word
or character embeddings, without explicitly taking advantage of structured
linguistic information such as constituency trees and dependency trees. In this
paper, we propose structural embedding of syntactic trees (SEST), an algorithmic
framework that utilizes structured information and encodes it into vector
representations that can boost the performance of machine comprehension
algorithms. We evaluate our approach using a state-of-the-art neural
attention model on the SQuAD dataset. Experimental results demonstrate that our
model can accurately identify the syntactic boundaries of sentences and
extract answers that are more syntactically coherent than those of the baseline methods.
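As a hedged sketch of the general idea behind encoding tree structure into vectors (not the paper's exact SEST algorithm): each token can be paired with the sequence of syntactic labels on its root-to-token path, and that path mapped to a fixed-size vector, here a simple count vector over an assumed small label inventory.

```python
# Assumed toy label inventory; a real system would use a full
# constituency or dependency label set from a treebank.
LABELS = ["S", "NP", "VP", "DT", "NN", "VBZ"]


def path_embedding(path):
    """Map a root-to-token label path, e.g. ["S", "NP", "DT"], to a
    count vector over LABELS, giving a fixed-size structural feature."""
    return [path.count(lab) for lab in LABELS]
```

Such structural features can then be concatenated with word or character embeddings before being fed to a comprehension model, which is the kind of combination the abstract describes.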
The Expressive Power of Word Embeddings
We seek to better understand the differences in quality among several
publicly released embeddings. We propose several tasks that help to distinguish
the characteristics of different embeddings. Our evaluation of sentiment
polarity and synonym/antonym relations shows that embeddings are able to
capture surprisingly nuanced semantics even in the absence of sentence
structure. Moreover, benchmarking the embeddings shows great variance in
quality and characteristics of the semantics captured by the tested embeddings.
Finally, we show the impact of varying the number of dimensions and the
resolution of each dimension on the effective useful features captured by the
embedding space. Our contributions highlight the importance of embeddings for
NLP tasks and the effect of their quality on the final results. Comment: Submitted to ICML 2013 Deep Learning for Audio, Speech and Language
Processing Workshop. 8 pages, 8 figures.
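One probing task the abstract mentions, synonym/antonym relations, is commonly tested by comparing similarities of word-pair vectors. A minimal sketch of the underlying measure, cosine similarity, using made-up toy vectors rather than any of the evaluated embeddings:

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors: the dot
    product divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

An evaluation of the kind described would check whether, for a given embedding, synonym pairs score systematically higher under this measure than antonym pairs.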