Message-Passing Protocols for Real-World Parsing -- An Object-Oriented Model and its Preliminary Evaluation

Authors: Norbert Broeker, Udo Hahn, Peter Neuhaus
Publication date: 01/01/1997

We argue for a performance-based design of natural language grammars and their associated parsers in order to meet the constraints imposed by real-world NLP. Our approach incorporates declarative and procedural knowledge about language and language use within an object-oriented specification framework. We discuss several message-passing protocols for parsing and provide reasons for sacrificing completeness of the parse in favor of efficiency based on a preliminary empirical evaluation. Comment: 12 pages, uses epsfig.st

arXiv.org e-Print Archive

CiteSeerX

CHR as grammar formalism. A first report

Author: Henning Christiansen
Publication date: 01/01/2001

Grammars written as Constraint Handling Rules (CHR) can be executed as efficient and robust bottom-up parsers that provide a straightforward, non-backtracking treatment of ambiguity. Abduction with integrity constraints as well as other dynamic hypothesis generation techniques fit naturally into such grammars and are exemplified for anaphora resolution, coordination and text interpretation. Comment: 12 pages. Presented at ERCIM Workshop on Constraints, Prague, Czech Republic, June 18-20, 200

arXiv.org e-Print Archive

Roskilde Universitet

Automatic acquisition of Spanish LFG resources from the Cast3LB treebank

Authors: Aoife Cahill, Ruth O'Donovan, Josef van Genabith, Andy Way
Publication venue: CSLI Publications
Publication date: 01/01/2005

In this paper, we describe the automatic annotation of the Cast3LB Treebank with LFG f-structures for the subsequent extraction of Spanish probabilistic grammar and lexical resources. We adapt the approach and methodology of Cahill et al. (2004), O’Donovan et al. (2004) and elsewhere for English to Spanish and the Cast3LB treebank encoding. We report on the quality and coverage of the automatic f-structure annotation. Following the pipeline and integrated models of Cahill et al. (2004), we extract wide-coverage probabilistic LFG approximations and parse unseen Spanish text into f-structures. We also extend Bikel’s (2002) Multilingual Parse Engine to include a Spanish language module. Using the retrained Bikel parser in the pipeline model gives the best results against a manually constructed gold standard (73.20% preds-only f-score). We also extract Spanish lexical resources: 4090 semantic form types with 98 frame types. Subcategorised prepositions and particles are included in the frames.

Irish Universities

DCU Online Research Access Service

Automatic acquisition of LFG resources for German - as good as it gets

Authors: Ines Rehbein, Josef van Genabith
Publication venue: CSLI Publications
Publication date: 01/01/2009

We present data-driven methods for the acquisition of LFG resources from two German treebanks. We discuss problems specific to semi-free word order languages as well as problems arising from the data structures determined by the design of the different treebanks. We compare two ways of encoding semi-free word order, as done in the two German treebanks, and argue that the design of the TiGer treebank is more adequate for the acquisition of LFG resources. Furthermore, we describe an architecture for LFG grammar acquisition for German, based on the two German treebanks, and compare our results with a hand-crafted German LFG grammar.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Concurrent Lexicalized Dependency Parsing: The ParseTalk Model

Authors: Norbert Broeker, Udo Hahn, Susanne Schacht
Publication date: 01/01/1994

A grammar model for concurrent, object-oriented natural language parsing is introduced. Complete lexical distribution of grammatical knowledge is achieved building upon the head-oriented notions of valency and dependency, while inheritance mechanisms are used to capture lexical generalizations. The underlying concurrent computation model relies upon the actor paradigm. We consider message-passing protocols for establishing dependency relations and ambiguity handling. Comment: 90kB, 7 pages Postscrip

arXiv.org e-Print Archive

CiteSeerX

Interaction Grammars

Authors: Bruno Guillaume, Guy Perrier
Affiliation: Calligramme project-team (Thème Sym)
Publication date: 01/01/2008

Interaction Grammar (IG) is a grammatical formalism based on the notion of polarity. Polarities express the resource sensitivity of natural languages by modelling the distinction between saturated and unsaturated syntactic structures. Syntactic composition is represented as a chemical reaction guided by the saturation of polarities. It is expressed in a model-theoretic framework where grammars are constraint systems using the notion of tree description, and parsing appears as a process of building tree description models satisfying criteria of saturation and minimality.

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

HAL Descartes

Three New Probabilistic Models for Dependency Parsing: An Exploration

Author: Jason Eisner
Publication date: 01/01/1997

After presenting a novel O(n^3) parsing algorithm for dependency grammar, we develop three contrasting ways to stochasticize it. We propose (a) a lexical affinity model where words struggle to modify each other, (b) a sense tagging model where words fluctuate randomly in their selectional preferences, and (c) a generative model where the speaker fleshes out each word's syntactic and conceptual structure without regard to the implications for the hearer. We also give preliminary empirical results from evaluating the three models' parsing performance on annotated Wall Street Journal training text (derived from the Penn Treebank). In these results, the generative (i.e., top-down) model performs significantly better than the others, and does about equally well at assigning part-of-speech tags. Comment: 6 pages, LaTeX 2.09 packaged with 4 .eps files, also uses colap.sty and acl.bs

arXiv.org e-Print Archive

CiteSeerX