39 research outputs found
A Corpus and Model Integrating Multiword Expressions and Supersenses
Abstract This paper introduces a task of identifying and semantically classifying lexical expressions in running text. We investigate the online reviews genre, adding semantic supersense annotations to a 55,000 word English corpus that was previously annotated for multiword expressions. The noun and verb supersenses apply to full lexical expressions, whether single-or multiword. We then present a sequence tagging model that jointly infers lexical expressions and their supersenses. Results show that even with our relatively small training corpus in a noisy domain, the joint task can be performed to attain 70% class labeling F 1
Lexical Semantic Recognition
In lexical semantics, full-sentence segmentation and segment labeling of
various phenomena are generally treated separately, despite their
interdependence. We hypothesize that a unified lexical semantic recognition
task is an effective way to encapsulate previously disparate styles of
annotation, including multiword expression identification / classification and
supersense tagging. Using the STREUSLE corpus, we train a neural CRF sequence
tagger and evaluate its performance along various axes of annotation. As the
label set generalizes that of previous tasks (PARSEME, DiMSUM), we additionally
evaluate how well the model generalizes to those test sets, finding that it
approaches or surpasses existing models despite training only on STREUSLE. Our
work also establishes baseline models and evaluation metrics for integrated and
accurate modeling of lexical semantics, facilitating future work in this area.Comment: 11 pages, 3 figures; to appear at MWE 202
Multi-Task Learning of Keyphrase Boundary Classification
Keyphrase boundary classification (KBC) is the task of detecting keyphrases
in scientific articles and labelling them with respect to predefined types.
Although important in practice, this task is so far underexplored, partly due
to the lack of labelled data. To overcome this, we explore several auxiliary
tasks, including semantic super-sense tagging and identification of multi-word
expressions, and cast the task as a multi-task learning problem with deep
recurrent neural networks. Our multi-task models perform significantly better
than previous state of the art approaches on two scientific KBC datasets,
particularly for long keyphrases.Comment: ACL 201