Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation
Existing approaches to automatic VerbNet-style verb classification are
heavily dependent on feature engineering and therefore limited to languages
with mature NLP pipelines. In this work, we propose a novel cross-lingual
transfer method for inducing VerbNets for multiple languages. To the best of
our knowledge, this is the first study which demonstrates how the architectures
for learning word embeddings can be applied to this challenging
syntactic-semantic task. Our method uses cross-lingual translation pairs to tie
each of the six target languages into a bilingual vector space with English,
jointly specialising the representations to encode the relational information
from English VerbNet. A standard clustering algorithm is then run on top of the
VerbNet-specialised representations, using vector dimensions as features for
learning verb classes. Our results show that the proposed cross-lingual
transfer approach sets new state-of-the-art verb classification performance
across all six target languages explored in this work.
Comment: EMNLP 2017 (long paper)
Lexicon Infused Phrase Embeddings for Named Entity Resolution
Most state-of-the-art approaches for named-entity recognition (NER) use
semi-supervised information in the form of word clusters and lexicons. Recently,
neural network-based language models have been explored, as they generate, as a
byproduct, highly informative vector representations for words, known as word
embeddings. In this paper we present two contributions: a new form of learning
word embeddings that can leverage information from relevant lexicons to improve
the representations, and the first system to use neural word embeddings to
achieve state-of-the-art results on named-entity recognition in both CoNLL and
OntoNotes NER. Our system achieves an F1 score of 90.90 on the test set for
CoNLL 2003, significantly better than any previous system trained on public
data, and matching a system employing massive private industrial query-log
data.
Comment: Accepted in CoNLL 201