A hybrid architecture for robust parsing of German
This paper provides an overview of current research on a hybrid and robust parsing architecture for the morphological, syntactic and semantic annotation of German text corpora. The novel contribution of this research lies not in the individual parsing modules, each of which relies on state-of-the-art algorithms and techniques. Rather, what is new about the present approach is the combination of these modules into a single architecture. This combination provides a means to significantly optimize the performance of each component, resulting in increased annotation accuracy.
Keystroke dynamics as signal for shallow syntactic parsing
Keystroke dynamics have been extensively used in psycholinguistic and writing
research to gain insights into cognitive processing. But do keystroke logs
contain actual signal that can be used to learn better natural language
processing models?
We postulate that keystroke dynamics contain information about syntactic
structure that can inform shallow syntactic parsing. To test this hypothesis,
we explore labels derived from keystroke logs as an auxiliary task in a multi-task
bidirectional Long Short-Term Memory (bi-LSTM) model. Our experiments show promising
results on two shallow syntactic parsing tasks, chunking and CCG supertagging.
Our model is simple, has the advantage that data can come from distinct
sources, and produces models that are significantly better than models trained
on the text annotations alone. Comment: In COLING 201
Semantic Tagging with Deep Residual Networks
We propose a novel semantic tagging task, sem-tagging, tailored for the
purpose of multilingual semantic parsing, and present the first tagger using
deep residual networks (ResNets). Our tagger uses both word and character
representations and includes a novel residual bypass architecture. We evaluate
the tagset both intrinsically on the new task of semantic tagging, as well as
on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an
auxiliary loss function predicting our semantic tags, significantly outperforms
prior results on English Universal Dependencies POS tagging (95.71% accuracy on
UD v1.2 and 95.67% accuracy on UD v1.3). Comment: COLING 2016, camera-ready version
Chunking clinical text containing non-canonical language
Free-text notes typed by primary care physicians during patient consultations typically contain highly non-canonical language. Shallow syntactic analysis of free-text notes can help to reveal valuable information for the study of disease and treatment. We present an exploratory study into chunking such text using off-the-shelf language processing tools and pre-trained statistical models. We evaluate chunking accuracy with respect to part-of-speech tagging quality, choice of chunk representation, and breadth of context features. Our results indicate that narrow context feature windows give the best results, but that chunk representation and minor differences in tagging quality do not have a significant impact on chunking accuracy.
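To make "chunk representation" concrete, here is a minimal sketch of one widely used encoding, IOB2, in which each token is tagged as beginning (B-), inside (I-), or outside (O) a chunk. The abstract does not say which representations were compared, so IOB2 is an assumption chosen for illustration; the function names and the example tokens are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, chunk_label)


def spans_to_iob2(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping chunk spans as one IOB2 tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the chunk
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the chunk
    return tags


def iob2_to_spans(tags: List[str]) -> List[Span]:
    """Decode IOB2 tags back into (start, end_exclusive, label) spans."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing sentinel "O" closes any chunk still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        closes = tag == "O" or tag.startswith("B-") or tag[2:] != label
        if start is not None and closes:
            spans.append((start, i, label))
            start, label = None, None
        if tag != "O" and start is None:
            start, label = i, tag[2:]
    return spans


# Hypothetical note fragment, invented for illustration only.
tokens = ["pt", "denies", "chest", "pain"]
chunks = [(0, 1, "NP"), (1, 2, "VP"), (2, 4, "NP")]
tags = spans_to_iob2(len(tokens), chunks)
print(list(zip(tokens, tags)))
print(iob2_to_spans(tags))
```

Because the encoding round-trips to the same spans, alternative tag schemes differ only in what the tagger has to predict per token, which is the kind of choice the study evaluates.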
Chunk Tagger - Statistical Recognition of Noun Phrases
We describe a stochastic approach to partial parsing, i.e., the recognition
of syntactic structures of limited depth. The technique utilises Markov Models,
but goes beyond usual bracketing approaches, since it is capable of recognising
not only the boundaries, but also the internal structure and syntactic category
of simple as well as complex NPs, PPs, APs and adverbials. We compare
tagging accuracy for different applications and encoding schemes. Comment: 7 pages, LaTeX