Generating CCG Categories
Previous CCG supertaggers usually predict categories with multi-class classification. Despite its simplicity, this approach ignores the internal structure of categories. The rich semantics inside these structures may help us better handle relations among categories and bring more robustness to existing supertaggers. In this work, we propose to generate categories rather than classify them: each category is decomposed into a sequence of smaller atomic tags, and the tagger aims to generate the correct sequence. We show that with this finer-grained view of categories, annotations can be shared across different categories and interactions with sentence context can be strengthened. The proposed category generator achieves state-of-the-art tagging (95.5% accuracy) and parsing (89.8% labeled F1) performance on the standard CCGBank. Furthermore, its performance on infrequent (even unseen) categories, out-of-domain texts, and a low-resource language gives promising results for introducing generation models into general CCG analysis.
Comment: Accepted by AAAI 202
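The decomposition step described above can be made concrete. Below is a minimal sketch, assuming a simple regular-expression tokenization of category strings; the function names and the exact atomic-tag inventory are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: decomposing a CCG category string into a sequence of
# atomic tags that a generator could predict one by one. The tokenization
# scheme (atomic categories with optional features, slashes, brackets) is
# illustrative; the paper's actual atomic-tag inventory may differ.
import re

def decompose_category(category: str) -> list[str]:
    """Split a CCG category such as '(S[dcl]\\NP)/NP' into atomic tags."""
    # Atomic categories (with optional features), slashes, and brackets
    # each become one tag in the output sequence.
    token_pattern = r"[A-Za-z]+(?:\[[a-z]+\])?|[/\\()]"
    return re.findall(token_pattern, category)

def recompose_category(tags: list[str]) -> str:
    """Inverse operation: join generated atomic tags back into a category."""
    return "".join(tags)

if __name__ == "__main__":
    cat = r"(S[dcl]\NP)/NP"   # transitive-verb category
    tags = decompose_category(cat)
    print(tags)               # ['(', 'S[dcl]', '\\', 'NP', ')', '/', 'NP']
    assert recompose_category(tags) == cat
```

Under this view, rare or unseen categories still share most of their atomic tags with frequent ones, which is why annotations can be shared across categories.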
Improving a supervised CCG parser
The central topic of this thesis is the task of syntactic parsing with Combinatory Categorial Grammar (CCG). We focus on pipeline approaches that have allowed researchers to develop efficient and accurate parsers trained on articles taken from the Wall Street Journal (WSJ). We present three approaches to improving the state of the art in CCG parsing. First, we test novel supertagger-parser combinations to identify the parsing models and algorithms that benefit the most from recent gains in supertagger accuracy. Second, we attempt to lessen the future burden of assembling a state-of-the-art CCG parsing pipeline by showing that a part-of-speech (POS) tagger is not required to achieve optimal performance. Finally, we discuss the deficiencies of current parsing algorithms and propose a solution that promises improvements in accuracy, particularly for difficult dependencies, while preserving efficiency and optimality guarantees.
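To make the pipeline argument concrete, the sketch below shows a generic supertagger-parser driver in which the POS-tagging stage is optional; all the callables and their signatures are hypothetical placeholders, not an existing parser's API.

```python
# Sketch of a supertagger-parser pipeline in which the POS-tagging stage is
# optional, illustrating the claim that it can be dropped without hurting
# accuracy. Every callable below is a hypothetical placeholder.
from typing import Callable, Optional

Categories = list[list[str]]  # k-best CCG categories per token (multitagging)

def parse_sentence(
    tokens: list[str],
    supertag: Callable[[list[str]], Categories],
    parse: Callable[[list[str], Categories], object],
    pos_tag: Optional[Callable[[list[str]], list[str]]] = None,
) -> object:
    """Run the pipeline: (optional POS tagging) -> supertagging -> parsing."""
    if pos_tag is not None:
        # Classic pipeline: POS tags are appended as extra token features.
        tokens = [f"{w}|{p}" for w, p in zip(tokens, pos_tag(tokens))]
    # Supertagging prunes the category space before full parsing.
    return parse(tokens, supertag(tokens))
```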
Structured Learning with Inexact Search: Advances in Shift-Reduce CCG Parsing
Statistical shift-reduce parsing involves the interplay of representation learning, structured learning, and inexact search. This dissertation considers approaches that tightly integrate these three elements and explores three novel models for shift-reduce CCG parsing. First, I develop a dependency model, in which the selection of shift-reduce action sequences producing a dependency structure is treated as a hidden variable; the key components of the model are a dependency oracle and a learning algorithm that integrates the dependency oracle, the structured perceptron, and beam search. Second, I present expected F-measure training and show how to derive a globally normalized RNN model, in which beam search is naturally incorporated and used in conjunction with the
objective to learn shift-reduce action sequences optimized for the final evaluation metric. Finally, I describe an LSTM model that constructs parser state representations incrementally by following the shift-reduce syntactic derivation process; I show that expected F-measure training, which is agnostic to the underlying neural network, can be applied in this setting to obtain globally normalized greedy and beam-search LSTM shift-reduce parsers.
The Carnegie Trust for the Universities of Scotland; The Cambridge Trust
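As a rough illustration of the inexact search component shared by all three models above, the sketch below implements beam-search shift-reduce parsing over a simplified parser state with a generic SHIFT/REDUCE action inventory; the State fields, the action set, and the score_fn interface are assumptions for the example, not the dissertation's actual parser.

```python
# Rough sketch of beam-search shift-reduce parsing, the inexact search
# procedure the dissertation builds on. The State fields, the generic
# SHIFT/REDUCE action set, and the score_fn interface are simplifications
# for illustration only.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    stack: tuple = ()      # partially built constituents
    buffer: tuple = ()     # remaining input tokens
    score: float = 0.0     # running model score
    actions: tuple = ()    # action history

def legal_actions(state):
    acts = []
    if state.buffer:
        acts.append("SHIFT")
    if len(state.stack) >= 2:
        acts.append("REDUCE")  # stand-in for CCG combinatory rules
    return acts

def apply_action(state, action, score_fn):
    if action == "SHIFT":
        stack, buffer = state.stack + (state.buffer[0],), state.buffer[1:]
    else:  # REDUCE: combine the top two stack items
        stack = state.stack[:-2] + ((state.stack[-2], state.stack[-1]),)
        buffer = state.buffer
    return State(stack, buffer,
                 state.score + score_fn(state, action),
                 state.actions + (action,))

def beam_parse(tokens, score_fn, beam_size=8):
    beam = [State(buffer=tuple(tokens))]
    while any(legal_actions(s) for s in beam):
        candidates = []
        for s in beam:
            acts = legal_actions(s)
            if not acts:
                candidates.append(s)  # finished derivations carry over
            candidates.extend(apply_action(s, a, score_fn) for a in acts)
        # Inexact search: keep only the beam_size highest-scoring states.
        beam = sorted(candidates, key=lambda s: s.score, reverse=True)[:beam_size]
    return max(beam, key=lambda s: s.score)
```

Here score_fn stands in for the trained model (for example a structured perceptron or a recurrent network over the parser state); the globally normalized and expected F-measure objectives described above change how those scores are learned, while the beam search itself is unchanged.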