73 research outputs found

Automatic Extraction of Subcategorization from Corpora

Authors: Briscoe, Ted; Carroll, John
Publication date: 01/01/1997

We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verbs which exhibit multiple complementation patterns, demonstrates that the technique achieves accuracy comparable to previous approaches, which are all limited to a highly restricted set of subcategorization classes. We also demonstrate that a subcategorization dictionary built with the system improves the accuracy of a parser by an appreciable amount.

Comment: 8 pages; requires aclap.sty. To appear in ANLP-9

arXiv.org e-Print Archive

CiteSeerX

Crossref

Sussex Research Online

Can Subcategorisation Probabilities Help a Statistical Parser?

Authors: Briscoe, Ted; Carroll, John; Minnen, Guido
Publication date: 01/01/1998

Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text which shows that this information can significantly improve parse accuracy.

Comment: 9 pages, uses colacl.st

arXiv.org e-Print Archive

CiteSeerX

Sussex Research Online

Frequency Estimation of Verb Subcategorization Frames Based on Syntactic and Multidimensional Statistical Analysis

Authors: Evans, David A.; Gibson, Ted; Ushioda, Akira; Waibel, Alex
Publication venue: Association for Computational Linguistics
Publication date: 03/01/2024

KITopen

The Automatic Acquisition of Frequencies of Verb Subcategorization Frames from Tagged Corpora

Authors: Evans, David A.; Gibson, Ted; Ushioda, Akira; Waibel, Alex
Publication venue: Association for Computational Linguistics
Publication date: 03/01/2024

KITopen

Re-estimation of Lexical Parameters for Treebank PCFGs

Author: Deoskar, Tejaswini
Publication date: 01/01/2008

We present procedures which pool lexical information estimated from unlabeled data via the Inside-Outside algorithm with lexical information from a treebank PCFG. The procedures produce substantial improvements (up to 31.6% error reduction) on the task of determining subcategorization frames of novel verbs, relative to a smoothed Penn Treebank-trained PCFG. Even with relatively small quantities of unlabeled training data, the re-estimated models show promising improvements in labeled bracketing f-scores on Wall Street Journal parsing, and substantial benefit in acquiring the subcategorization preferences of low-frequency verbs.

CiteSeerX

Crossref

Disambiguating Nouns, Verbs, and Adjectives Using Automatically Acquired Selectional Preferences

Authors: Briscoe, Ted; McCarthy, Diana; Carroll, John; Li, Hang
Publication venue: MIT Press - Journals
Publication date: 01/01/2003

Selectional preferences have been used by word sense disambiguation (WSD) systems as one source of disambiguating information. We evaluate WSD using selectional preferences acquired for English adjective-noun, subject, and direct object grammatical relationships with respect to a standard test corpus. The selectional preferences are specific to verb or adjective classes, rather than individual word forms, so they can be used to disambiguate the co-occurring adjectives and verbs, rather than just the nominal argument heads. We also investigate the use of the one-sense-per-discourse heuristic to propagate a sense tag for a word to other occurrences of the same word within the current document in order to increase coverage. Although the preferences perform well in comparison with other unsupervised WSD systems on the same corpus, the results show that for many applications, further knowledge sources would be required to achieve an adequate level of accuracy and coverage. In addition to quantifying performance, we analyze the results to investigate the situations in which the selectional preferences achieve the best precision and in which the one-sense-per-discourse heuristic increases performance.

CiteSeerX

Crossref

Sussex Research Online

Enforcing Subcategorization Constraints in a Parser Using Sub-parses Recombining

Authors: Mirroshandel, Seyed Abolghasem; Nasr, Alexis; Sagot, Benoît
Publication venue: HAL CCSD
Publication date: 09/06/2013

Treebanks are not large enough to adequately model subcategorization frames of predicative lexemes, which are an important source of lexico-syntactic constraints for parsing. As a consequence, parsers trained on such treebanks usually make mistakes when selecting the arguments of predicative lexemes. In this paper, we propose an original way to correct subcategorization errors by combining sub-parses of a sentence S that appear in the list of the n-best parses of S. The subcategorization information comes from three different resources: the first is extracted from a treebank, the second is computed on a large corpus, and the third is an existing syntactic lexicon. Experiments on the French Treebank showed a 15.24% reduction in erroneous subcategorization frame (SF) selections for verbs, as well as a 4% relative decrease in the error rate, measured by Labeled Accuracy Score, for the state-of-the-art parser on this treebank.

HAL AMU

INRIA a CCSD electronic archive server

Hal-Diderot

Argumentness and Probabilistic Case Structures

Authors: Lee, Ik-Hwan; Yang, Dan-Hee
Publication venue: The Korean Society for Language and Information
Publication date: 01/01/2002

Waseda University Repository