Keystroke dynamics as signal for shallow syntactic parsing
Keystroke dynamics have been extensively used in psycholinguistic and writing
research to gain insights into cognitive processing. But do keystroke logs
contain actual signal that can be used to learn better natural language
processing models?
We postulate that keystroke dynamics contain information about syntactic
structure that can inform shallow syntactic parsing. To test this hypothesis,
we explore labels derived from keystroke logs as an auxiliary task in a multi-task
bidirectional Long Short-Term Memory (bi-LSTM) model. We obtain promising
results on two shallow syntactic parsing tasks, chunking and CCG supertagging.
Our model is simple, has the advantage that data can come from distinct
sources, and produces models that are significantly better than models trained
on the text annotations alone.
Comment: In COLING 201
AMR Dependency Parsing with a Typed Semantic Algebra
We present a semantic parser for Abstract Meaning Representations which
learns to parse strings into tree representations of the compositional
structure of an AMR graph. This allows us to use standard neural techniques for
supertagging and dependency tree parsing, constrained by a linguistically
principled type system. We present two approximative decoding algorithms, which
achieve state-of-the-art accuracy and outperform strong baselines.
Comment: This paper will be presented at ACL 2018 (see
https://acl2018.org/programme/papers/
A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models
Beam search is a desirable choice of test-time decoding algorithm for neural
sequence models because it potentially avoids search errors made by simpler
greedy methods. However, typical cross entropy training procedures for these
models do not directly consider the behaviour of the final decoding method. As
a result, for cross-entropy trained models, beam decoding can sometimes yield
reduced test performance when compared with greedy decoding. In order to train
models that can more effectively make use of beam search, we propose a new
training procedure that focuses on the final loss metric (e.g. Hamming loss)
evaluated on the output of beam search. While well-defined, this "direct loss"
objective is itself discontinuous and thus difficult to optimize. Hence, in our
approach, we form a sub-differentiable surrogate objective by introducing a
novel continuous approximation of the beam search decoding procedure. In
experiments, we show that optimizing this new training objective yields
substantially better results on two sequence tasks (Named Entity Recognition
and CCG Supertagging) when compared with both cross entropy trained greedy
decoding and cross entropy trained beam decoding baselines.
Comment: Updated for clarity and notational consistenc
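The gap between greedy and beam decoding described in this abstract can be illustrated with a minimal sketch. It shows only ordinary discrete beam search and a Hamming loss, not the paper's continuous relaxation. The toy model `next_token_probs`, its probabilities, and the gold sequence are assumptions made up for illustration.

```python
import math

# Hypothetical toy model: next-token probabilities given the prefix so far.
# The numbers are invented for illustration and come from no real system.
def next_token_probs(prefix):
    if prefix == ():
        return {"A": 0.6, "B": 0.4}
    if prefix == ("A",):
        return {"A": 0.55, "B": 0.45}
    if prefix == ("B",):
        return {"A": 0.9, "B": 0.1}
    return {}

def beam_search(beam_size, length):
    """Return the best (sequence, log-probability) pair found by beam search."""
    beams = [((), 0.0)]
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            for token, prob in next_token_probs(prefix).items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        # Hard top-k selection: this sort-and-truncate step is what makes a
        # loss computed on the beam output discontinuous in the model scores.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0]

def hamming_loss(predicted, gold):
    """Count the positions where the prediction differs from the gold labels."""
    return sum(p != g for p, g in zip(predicted, gold))

gold = ("B", "A")                       # assumed gold sequence
greedy_seq, _ = beam_search(beam_size=1, length=2)
beam_seq, _ = beam_search(beam_size=2, length=2)
print(greedy_seq, hamming_loss(greedy_seq, gold))
print(beam_seq, hamming_loss(beam_seq, gold))
```

With a beam of size 1 the search commits to the locally best first token and misses the sequence with the highest overall probability, which a beam of size 2 recovers. A training objective evaluated on the beam output has to pass through the hard top-k step marked in the comment, which is the part a continuous approximation would replace.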
Generating CCG Categories
Previous CCG supertaggers usually predict categories using multi-class
classification. This approach is simple, but it usually ignores the internal
structure of categories. The rich semantics inside these structures may help us to
better handle relations among categories and bring more robustness into
existing supertaggers. In this work, we propose to generate categories rather
than classify them: each category is decomposed into a sequence of smaller
atomic tags, and the tagger aims to generate the correct sequence. We show that
with this finer view on categories, annotations of different categories could
be shared and interactions with sentence contexts could be enhanced. The
proposed category generator achieves state-of-the-art tagging (95.5%
accuracy) and parsing (89.8% labeled F1) performance on the standard CCGBank.
Furthermore, its performance on infrequent (even unseen) categories,
out-of-domain texts and low-resource language gives promising results for
introducing generation models to general CCG analyses.
Comment: Accepted by AAAI 202
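The idea of decomposing a category into a sequence of smaller tags can be sketched as follows. The tokenization here (atomic categories with optional features, slashes, and parentheses) is an assumption for illustration; the abstract does not specify the tag inventory the authors use, and the helper `decompose` is hypothetical.

```python
import re

# Assumed token inventory: atomic categories such as S, NP, or S[dcl],
# plus the slash operators and parentheses. The paper's inventory may differ.
_TOKEN = re.compile(r"[A-Za-z]+(?:\[[a-z]+\])?|[\\/()]")

def decompose(category):
    """Split a CCG category string into a list of smaller tags."""
    tokens = _TOKEN.findall(category)
    # Reject input containing characters the assumed inventory does not cover.
    if "".join(tokens) != category:
        raise ValueError(f"cannot decompose category: {category!r}")
    return tokens

transitive = decompose(r"(S[dcl]\NP)/NP")
prepositional = decompose(r"(S[dcl]\NP)/PP")
print(transitive)
print(prepositional)

# The two categories differ only in their final tag, so a generator that
# predicts tags one at a time can share what it learns about the common prefix.
shared_prefix = 0
for left, right in zip(transitive, prepositional):
    if left != right:
        break
    shared_prefix += 1
print(shared_prefix)
```

A multi-class classifier would treat these two categories as unrelated labels, whereas the tag sequences make their overlap explicit. The same decomposition also gives a way to produce a category never seen in training, since any well-formed sequence of known tags is a candidate output.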
A General-Purpose Tagger with Convolutional Neural Networks
We present a general-purpose tagger based on convolutional neural networks
(CNN), used for both composing word vectors and encoding context information.
The CNN tagger is robust across different tagging tasks: without task-specific
tuning of hyper-parameters, it achieves state-of-the-art results in
part-of-speech tagging, morphological tagging and supertagging. The CNN tagger
is also robust against the out-of-vocabulary problem: it performs well on
artificially unnormalized texts
- …