1,473 research outputs found

Automatic Extraction of Subcategorization from Corpora

Author: Ted Briscoe
John Carroll
Publication date: 01/01/1997

We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verbs which exhibit multiple complementation patterns, demonstrates that the technique achieves accuracy comparable to previous approaches, which are all limited to a highly restricted set of subcategorization classes. We also demonstrate that a subcategorization dictionary built with the system improves the accuracy of a parser by an appreciable amount.

Comment: 8 pages; requires aclap.sty. To appear in ANLP-9
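
As a rough illustration of the kind of dictionary entry this abstract describes, the sketch below turns observed (verb, subcategorization class) pairs into per-verb relative frequencies. The function name `build_subcat_dictionary`, the class labels, and the toy observations are assumptions made for illustration; none of them come from the paper, and the step that extracts patterns from a corpus is not reproduced here.

```python
from collections import Counter, defaultdict


def build_subcat_dictionary(observations):
    """Map each verb to {subcat class: relative frequency}.

    `observations` is an iterable of (verb, subcat_class) pairs. Both the
    input shape and the class labels are hypothetical.
    """
    counts = defaultdict(Counter)
    for verb, subcat_class in observations:
        counts[verb][subcat_class] += 1

    dictionary = {}
    for verb, class_counts in counts.items():
        total = sum(class_counts.values())
        # Relative frequency: share of this verb's observations per class.
        dictionary[verb] = {
            cls: n / total for cls, n in class_counts.items()
        }
    return dictionary


# Toy data, invented for this example.
observations = [
    ("give", "NP_NP"),
    ("give", "NP_PP"),
    ("give", "NP_NP"),
    ("give", "NP"),
    ("sleep", "INTRANS"),
]
entries = build_subcat_dictionary(observations)
```

Each verb's frequencies sum to one, so an entry can be read directly as a distribution over that verb's complementation patterns.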

arXiv.org e-Print Archive

CiteSeerX

Crossref

Sussex Research Online

Constraint Logic Programming for Natural Language Processing

Author: Philippe Blache
Nabil Hathout
Publication date: 01/01/1995

This paper proposes an evaluation of the adequacy of the constraint logic programming paradigm for natural language processing. Theoretical aspects of this question have been discussed in several works. We adopt here a pragmatic point of view and our argumentation relies on concrete solutions. Using actual constraints (in the CLP sense) is neither easy nor direct. However, CLP can improve parsing techniques in several respects, such as concision, control, efficiency or direct representation of linguistic formalism. This discussion is illustrated by several examples and the presentation of an HPSG parser.

Comment: 15 pages, uuencoded and compressed postscript to appear in Proceedings of the 5th Int. Workshop on Natural Language Understanding and Logic Programming. Lisbon, Portugal. 199

arXiv.org e-Print Archive

CiteSeerX

Optimality Theory as a Framework for Lexical Acquisition

Author: A. Arun
A. Prince
B. Aarts
C. Fabre
J. McCarthy
M. Butt
M.R. Brent
N. Chomsky
R. Kager
Publication date: 01/01/2014

This paper re-investigates a lexical acquisition system initially developed for French. We show that, interestingly, the architecture of the system reproduces and implements the main components of Optimality Theory. However, we formulate the hypothesis that some of its limitations are mainly due to a poor representation of the constraints used. Finally, we show how a better representation of the constraints used would yield better results.

arXiv.org e-Print Archive

Crossref

Re-estimation of Lexical Parameters for Treebank PCFGs

Author: Tejaswini Deoskar
Publication date: 01/01/2008

We present procedures which pool lexical information estimated from unlabeled data via the Inside-Outside algorithm with lexical information from a treebank PCFG. The procedures produce substantial improvements (up to 31.6% error reduction) on the task of determining subcategorization frames of novel verbs, relative to a smoothed Penn Treebank-trained PCFG. Even with relatively small quantities of unlabeled training data, the re-estimated models show promising improvements in labeled bracketing f-scores on Wall Street Journal parsing, and substantial benefit in acquiring the subcategorization preferences of low-frequency verbs.
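
To make the idea of pooling two lexical estimates concrete, here is a minimal sketch that mixes two frame distributions by linear interpolation and computes a relative error reduction. Linear interpolation is a stand-in chosen for illustration, not the pooling procedure of the paper, which the abstract does not specify. The function names, frame labels, mixing weight, and all numbers are likewise assumptions.

```python
def interpolate(treebank_probs, reestimated_probs, weight):
    """Linearly mix two distributions over subcategorization frames.

    `weight` is the share given to the treebank estimate (0.0 to 1.0).
    This is an illustrative stand-in, not the paper's procedure.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    frames = set(treebank_probs) | set(reestimated_probs)
    return {
        frame: weight * treebank_probs.get(frame, 0.0)
        + (1.0 - weight) * reestimated_probs.get(frame, 0.0)
        for frame in frames
    }


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error removed by the new model."""
    return (baseline_error - new_error) / baseline_error


# Invented numbers, for illustration only.
treebank = {"NP": 0.5, "NP_PP": 0.5}
reestimated = {"NP": 0.25, "NP_PP": 0.25, "S": 0.5}
pooled = interpolate(treebank, reestimated, weight=0.5)
```

A frame absent from the treebank estimate (here the hypothetical `S` frame) still receives probability mass after mixing, which is the intuition behind using unlabeled data for low-frequency or novel verbs.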

CiteSeerX

Crossref

Can Subcategorisation Probabilities Help a Statistical Parser?

Author: Ted Briscoe
John Carroll
Guido Minnen
Publication date: 01/01/1998

Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text which shows that this information can significantly improve parse accuracy.

Comment: 9 pages, uses colacl.st

arXiv.org e-Print Archive

CiteSeerX

Sussex Research Online

Acquiring and processing verb argument structure: distributional learning in a miniature language

Author: Elizabeth Wonnacott
Elissa L. Newport
Michael K. Tanenhaus
Publication venue: Elsevier BV
Publication date: 01/05/2008

Adult knowledge of a language involves correctly balancing lexically-based and more language-general patterns. For example, verb argument structures may sometimes readily generalize to new verbs, yet with particular verbs may resist generalization. From the perspective of acquisition, this creates significant learnability problems, with some researchers claiming a crucial role for verb semantics in the determination of when generalization may and may not occur. Similarly, there has been debate regarding how verb-specific and more generalized constraints interact in sentence processing and on the role of semantics in this process. The current work explores these issues using artificial language learning. In three experiments using languages without semantic cues to verb distribution, we demonstrate that learners can acquire both verb-specific and verb-general patterns, based on distributional information in the linguistic input regarding each of the verbs as well as across the language as a whole. As with natural languages, these factors are shown to affect production, judgments and real-time processing. We demonstrate that learners apply a rational procedure in determining their usage of these different input statistics and conclude by suggesting that a Bayesian perspective on statistical learning may be an appropriate framework for capturing our findings.

Crossref

PubMed Central

Warwick Research Archives Portal Repository