A Corpus-Based Approach for Building Semantic Lexicons
Semantic knowledge can be a great asset to natural language processing
systems, but it is usually hand-coded for each application. Although some
semantic information is available in general-purpose knowledge bases such as
WordNet and Cyc, many applications require domain-specific lexicons that
represent words and categories for a particular topic. In this paper, we
present a corpus-based method that can be used to build semantic lexicons for
specific categories. The input to the system is a small set of seed words for a
category and a representative text corpus. The output is a ranked list of words
that are associated with the category. A user then reviews the top-ranked words
and decides which ones should be entered in the semantic lexicon. In
experiments with five categories, users typically found about 60 words per
category in 10-15 minutes to build a core semantic lexicon.
Comment: 8 pages - to appear in Proceedings of EMNLP-
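The input/output contract described in this abstract (seed words plus a corpus in, a ranked word list out) can be illustrated with a minimal sketch. The scoring rule below, the fraction of a word's occurrences that fall within a few tokens of a seed word, is an assumption made for illustration and is not claimed to be the paper's exact statistic; `rank_candidates`, `window`, and the toy corpus are hypothetical.

```python
from collections import Counter


def rank_candidates(sentences, seed_words, window=2):
    """Rank non-seed words by the share of their occurrences near a seed word.

    Assumed scoring (illustrative only): near_seed_count / total_count.
    Ties are broken by the raw near-seed count, then alphabetically.
    """
    seeds = {w.lower() for w in seed_words}
    near_seed = Counter()  # occurrences within `window` tokens of a seed
    total = Counter()      # all occurrences in the corpus
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok in seeds:
                continue
            total[tok] += 1
            lo, hi = max(0, i - window), i + window + 1
            if any(t in seeds for t in tokens[lo:hi]):
                near_seed[tok] += 1
    scores = {w: near_seed[w] / total[w] for w in total}
    return sorted(scores, key=lambda w: (-scores[w], -near_seed[w], w))


# Toy corpus and seeds for a hypothetical "weapon" category.
corpus = [
    "police seized a rifle and grenade",
    "the grenade and pistol were found",
    "the report and the police were late",
]
ranked = rank_candidates(corpus, ["rifle", "pistol"], window=2)
print(ranked[:5])
```

On this toy corpus "grenade" ranks first, but function words such as "a" also score highly, which is one reason a human review of the top-ranked words, as the abstract describes, remains part of the loop.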
SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment
Because word semantics can substantially change across communities and
contexts, capturing domain-specific word semantics is an important challenge.
Here, we propose SEMAXIS, a simple yet powerful framework to characterize word
semantics using many semantic axes in word-vector spaces beyond sentiment. We
demonstrate that SEMAXIS can capture nuanced semantic representations in
multiple online communities. We also show that, when the sentiment axis is
examined, SEMAXIS outperforms the state-of-the-art approaches in building
domain-specific sentiment lexicons.
Comment: Accepted in ACL 2018 as a full paper
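To make the idea of a semantic axis in a word-vector space concrete, here is a minimal sketch. It assumes an axis is the difference between the mean vectors of two sets of pole words and that a word is scored by its cosine similarity to that axis; the abstract does not spell out these details, so treat them as assumptions. The 2-D embeddings and all function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_axis(embeddings, positive_pole, negative_pole):
    """Assumed axis definition: mean(positive pole) - mean(negative pole)."""
    pos = mean_vector([embeddings[w] for w in positive_pole])
    neg = mean_vector([embeddings[w] for w in negative_pole])
    return [p - q for p, q in zip(pos, neg)]


def axis_score(embeddings, word, axis):
    """Project a word onto the axis via cosine similarity."""
    return cosine(embeddings[word], axis)


# Hypothetical 2-D embeddings; real word vectors have hundreds of dimensions.
toy = {
    "good": [1.0, 0.1], "great": [0.9, 0.2],
    "bad": [-1.0, 0.1], "awful": [-0.9, 0.2],
    "delicious": [0.8, 0.5], "stale": [-0.7, 0.6],
}
axis = semantic_axis(toy, ["good", "great"], ["bad", "awful"])
for word in ("delicious", "stale"):
    print(word, round(axis_score(toy, word, axis), 3))
```

Swapping the pole words (for example, a hypothetical "soft" versus "hard" pair) yields a different axis from the same embeddings, which is how one framework can cover many axes beyond sentiment.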
Acquiring Word-Meaning Mappings for Natural Language Interfaces
This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted
Examples), that acquires a semantic lexicon from a corpus of sentences paired
with semantic representations. The lexicon learned consists of phrases paired
with meaning representations. WOLFIE is part of an integrated system that
learns to transform sentences into representations such as logical database
queries. Experimental results are presented demonstrating WOLFIE's ability to
learn useful lexicons for a database interface in four different natural
languages. The usefulness of the lexicons learned by WOLFIE is compared to
those acquired by a similar system, with results favorable to WOLFIE. A second
set of experiments demonstrates WOLFIE's ability to scale to larger and more
difficult, albeit artificially generated, corpora. In natural language
acquisition, it is difficult to gather the annotated data needed for supervised
learning; however, unannotated data is fairly plentiful. Active learning
methods attempt to select for annotation and training only the most informative
examples, and therefore are potentially very useful in natural language
applications. However, most results to date for active learning have only
considered standard classification tasks. To reduce annotation effort while
maintaining accuracy, we apply active learning to semantic lexicons. We show
that active learning can significantly reduce the number of annotated examples
required to achieve a given level of performance
- …
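The selection step of active learning mentioned in this abstract, choosing only the most informative examples for annotation, can be sketched as follows. Uncertainty sampling (picking the examples whose predicted probability is closest to 0.5) is used here as one common strategy; it is an assumption for illustration and not necessarily the criterion used in the paper. The function name, the confidence values, and the example sentences are hypothetical.

```python
def select_for_annotation(unlabeled, predict_proba, k=2):
    """Return the k examples the current model is least certain about.

    `predict_proba` is any callable mapping an example to a probability
    in [0, 1]; uncertainty is measured as closeness to 0.5.
    """
    return sorted(unlabeled, key=lambda x: abs(predict_proba(x) - 0.5))[:k]


# Hypothetical model confidences for four unannotated sentences.
confidence = {
    "what is the capital of texas": 0.95,
    "which rivers run through ohio": 0.52,
    "how big is alaska": 0.10,
    "name the states bordering utah": 0.45,
}
batch = select_for_annotation(list(confidence), confidence.get, k=2)
print(batch)
```

In a full loop, the selected batch would be annotated, added to the training set, and the model retrained before the next round of selection.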