4,251 research outputs found
Automatic Extraction of Subcategorization from Corpora
We describe a novel technique and implemented system for constructing a
subcategorization dictionary from textual corpora. Each dictionary entry
encodes the relative frequency of occurrence of a comprehensive set of
subcategorization classes for English. An initial experiment, on a sample of 14
verbs which exhibit multiple complementation patterns, demonstrates that the
technique achieves accuracy comparable to previous approaches, which are all
limited to a highly restricted set of subcategorization classes. We also
demonstrate that a subcategorization dictionary built with the system improves
the accuracy of a parser by an appreciable amount.Comment: 8 pages; requires aclap.sty. To appear in ANLP-9
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation
Question Generation (QG) is fundamentally a simple syntactic transformation;
however, many aspects of semantics influence what questions are good to form.
We implement this observation by developing Syn-QG, a set of transparent
syntactic rules leveraging universal dependencies, shallow semantic parsing,
lexical resources, and custom rules which transform declarative sentences into
question-answer pairs. We utilize PropBank argument descriptions and VerbNet
state predicates to incorporate shallow semantic content, which helps generate
questions of a descriptive nature and produce inferential and semantically
richer questions than existing systems. In order to improve syntactic fluency
and eliminate grammatically incorrect questions, we employ back-translation
over the output of these syntactic rules. A set of crowd-sourced evaluations
shows that our system can generate a larger number of highly grammatical and
relevant questions than previous QG systems and that back-translation
drastically improves grammaticality at a slight cost of generating irrelevant
questions.Comment: Some of the results in the paper were incorrec
An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process
Although Minimum Distance Parsing (MDP) offers a theoretically attractive
solution to the problem of extragrammaticality, it is often computationally
infeasible in large scale practical applications. In this paper we present an
alternative approach where the labor is distributed between a more restrictive
partial parser and a repair module. Though two stage approaches have grown in
popularity in recent years because of their efficiency, they have done so at
the cost of requiring hand coded repair heuristics. In contrast, our two stage
approach does not require any hand coded knowledge sources dedicated to repair,
thus making it possible to achieve a similar run time advantage over MDP
without losing the quality of domain independence.Comment: 9 pages, 1 Postscript figure, uses aclap.sty and psfig.tex, In
Proceedings of EMNLP 199
Parallel Distributed Grammar Engineering for Practical Applications
Based on a detailed case study of parallel grammar development distributed across two sites, we review some of the requirements for regression testing in grammar engineering, summarize our approach to systematic competence and performance profiling, and discuss our experience with grammar development for a commercial application. If possible, the workshop presentation will be organized around a software demonstration
- ā¦