QuestionBank: creating a corpus of parse-annotated questions

Author: Cahill Aoife
Judge John
van Genabith Josef
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2006

This paper describes the development of QuestionBank, a corpus of 4000 parse-annotated questions for (i) use in training parsers employed in QA, and (ii) evaluation of question parsing. We present a series of experiments to investigate the effectiveness of QuestionBank as both an exclusive and supplementary training resource for a state-of-the-art parser in parsing both question and non-question test sets. We introduce a new method for recovering empty nodes and their antecedents (capturing long distance dependencies) from parser output in CFG trees using LFG f-structure reentrancies. Our main findings are (i) using QuestionBank training data improves parser performance to 89.75% labelled bracketing f-score, an increase of almost 11% over the baseline; (ii) back-testing experiments on non-question data (Penn-II WSJ Section 23) show that the retrained parser does not suffer a performance drop on non-question material; (iii) ablation experiments show that the size of training material provided by QuestionBank is sufficient to achieve optimal results; (iv) our method for recovering empty nodes captures long distance dependencies in questions from the ATIS corpus with high precision (96.82%) and low recall (39.38%). In summary, QuestionBank provides a useful new resource in parser-based QA research.

CiteSeerX

Irish Universities

DCU Online Research Access Service

On the Complexity and Performance of Parsing with Derivatives

Author: Adams Michael D.
Hollenbeck Celeste
Might Matthew
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/04/2016

Current algorithms for context-free parsing inflict a trade-off between ease of understanding, ease of implementation, theoretical complexity, and practical performance. No algorithm achieves all of these properties simultaneously. Might et al. (2011) introduced parsing with derivatives, which handles arbitrary context-free grammars while being both easy to understand and simple to implement. Despite much initial enthusiasm and a multitude of independent implementations, its worst-case complexity has never been proven to be better than exponential. In fact, high-level arguments claiming it is fundamentally exponential have been advanced and even accepted as part of the folklore. Performance ended up being sluggish in practice, and this sluggishness was taken as informal evidence of exponentiality. In this paper, we reexamine the performance of parsing with derivatives. We have discovered that it is not exponential but, in fact, cubic. Moreover, simple (though perhaps not obvious) modifications to the implementation by Might et al. (2011) lead to an implementation that is not only easy to understand but also highly performant in practice.

Comment: 13 pages; 12 figures; implementation at http://bitbucket.org/ucombinator/parsing-with-derivatives/ ; published in PLDI '16, Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 13 - 17, 2016, Santa Barbara, CA, US
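
To give a feel for the derivative idea behind this record, here is a minimal sketch of a derivative-based recognizer for regular expressions only. It is not the algorithm from the paper: the context-free case needs extra machinery for recursive grammars that is omitted here, and all class and function names (`Empty`, `Eps`, `Char`, `Alt`, `Seq`, `Star`, `nullable`, `derive`, `matches`) are assumptions made for illustration.

```python
from dataclasses import dataclass


# Regular-expression syntax tree. These class names are illustrative only.
@dataclass(frozen=True)
class Empty:
    """Matches no string at all."""
    pass


@dataclass(frozen=True)
class Eps:
    """Matches only the empty string."""
    pass


@dataclass(frozen=True)
class Char:
    c: str


@dataclass(frozen=True)
class Alt:
    left: object
    right: object


@dataclass(frozen=True)
class Seq:
    first: object
    second: object


@dataclass(frozen=True)
class Star:
    inner: object


def nullable(r):
    """Return True if the language of r contains the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Char)):
        return False
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    return nullable(r.first) and nullable(r.second)  # Seq


def derive(r, ch):
    """Return an expression matching the suffixes of r's strings that start with ch."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Char):
        return Eps() if r.c == ch else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, ch), derive(r.right, ch))
    if isinstance(r, Star):
        return Seq(derive(r.inner, ch), r)
    # Seq: ch is consumed by the first part, or by the second if the first can be empty.
    head = Seq(derive(r.first, ch), r.second)
    return Alt(head, derive(r.second, ch)) if nullable(r.first) else head


def matches(r, text):
    """Recognize text by taking one derivative per input character."""
    for ch in text:
        r = derive(r, ch)
    return nullable(r)
```

For example, `matches(Star(Seq(Char("a"), Char("b"))), "abab")` asks whether `abab` belongs to `(ab)*`. The sketch performs no simplification of intermediate expressions, so they grow with the input; it is meant to show the technique, not to be efficient.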

arXiv.org e-Print Archive

Crossref

Interaction Grammars

Author: Bruno Guillaume
Guy Perrier
Thème Sym
Équipe-projet Calligramme
Publication date: 01/01/2008

Interaction Grammar (IG) is a grammatical formalism based on the notion of polarity. Polarities express the resource sensitivity of natural languages by modelling the distinction between saturated and unsaturated syntactic structures. Syntactic composition is represented as a chemical reaction guided by the saturation of polarities. It is expressed in a model-theoretic framework where grammars are constraint systems using the notion of tree description, and parsing appears as a process of building tree description models satisfying criteria of saturation and minimality.

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

HAL Descartes

MUSE CSP: An Extension to the Constraint Satisfaction Problem

Author: Harper M. P.
Helzerman R. A
Publication date: 01/01/1996

This paper describes an extension to the constraint satisfaction problem (CSP) called MUSE CSP (MUltiply SEgmented Constraint Satisfaction Problem). This extension is especially useful for those problems which segment into multiple sets of partially shared variables. Such problems arise naturally in signal processing applications including computer vision, speech processing, and handwriting recognition. For these applications, it is often difficult to segment the data in only one way given the low-level information utilized by the segmentation algorithms. MUSE CSP can be used to compactly represent several similar instances of the constraint satisfaction problem. If multiple instances of a CSP have some common variables which have the same domains and constraints, then they can be combined into a single instance of a MUSE CSP, reducing the work required to apply the constraints. We introduce the concepts of MUSE node consistency, MUSE arc consistency, and MUSE path consistency. We then demonstrate how MUSE CSP can be used to compactly represent lexically ambiguous sentences and the multiple sentence hypotheses that are often generated by speech recognition algorithms so that grammar constraints can be used to provide parses for all syntactically correct sentences. Algorithms for MUSE arc and path consistency are provided. Finally, we discuss how to create a MUSE CSP from a set of CSPs which are labeled to indicate when the same variable is shared by more than a single CSP.

Comment: See http://www.jair.org/ for any accompanying file
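
As background for the consistency notions named in this record, the following is a minimal sketch of arc consistency for an ordinary (single-segmentation) CSP, in the style of the well-known AC-3 procedure. It does not implement the MUSE variants the paper introduces; the function name `ac3` and the dictionary-based encoding of domains and constraints are assumptions chosen for illustration.

```python
from collections import deque


def ac3(domains, constraints):
    """Prune domains in place until every arc is consistent.

    domains:     dict mapping a variable to a set of candidate values
    constraints: dict mapping a directed arc (x, y) to a predicate
                 allowed(value_of_x, value_of_y)
    Returns False if some domain becomes empty, True otherwise.
    """
    queue = deque(constraints)  # start with every directed arc
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        # Values of x with no supporting value left in the domain of y.
        unsupported = {
            vx for vx in domains[x]
            if not any(allowed(vx, vy) for vy in domains[y])
        }
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False
            # The domain of x shrank, so arcs pointing at x must be rechecked.
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True
```

A small usage example under the same assumptions: with `domains = {"A": {1, 2, 3}, "B": {1, 2, 3}}` and the constraint A < B given in both directions as `{("A", "B"): lambda a, b: a < b, ("B", "A"): lambda b, a: a < b}`, calling `ac3(domains, constraints)` removes 3 from the domain of A and 1 from the domain of B.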

arXiv.org e-Print Archive

CiteSeerX

From chunks to function-argument structure: a similarity-based approach

Author: Hinrichs Erhard
Kübler Sandra
Publication date: 01/01/2001

Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. Such larger structures are not only desirable for a deeper syntactic analysis. They also constitute a necessary prerequisite for assigning function-argument structure. The present paper offers a similarity-based algorithm for assigning functional labels such as subject, object, head, complement, etc. to complete syntactic structures on the basis of prechunked input. The evaluation of the algorithm has concentrated on measuring the quality of functional labels. It was performed on a German and an English treebank using two different annotation schemes at the level of function-argument structure. The results of 89.73% correct functional labels for German and 90.40% for English validate the general approach.

CiteSeerX

Crossref

Publikationsserver der Universität Tübingen

Hochschulschriftenserver - Universität Frankfurt am Main

DyLan: Parser for Dynamic Syntax

Author: Eshghi Arash
Hough Julian
Purver Matthew
Publication date: 30/12/2013

Queen Mary Research Online