Developing a TT-MCTAG for German with an RCG-based parser

Authors: Dellert Johannes; Kallmeyer Laura; Lichte Timm; Maier Wolfgang; Parmentier Yannick
Publication date: 01/01/2008

Developing linguistic resources, in particular grammars, is known to be a complex task in itself because of, among other things, redundancy and consistency issues. Furthermore, some languages are hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework for describing tree-based grammars, and (ii) an actual fragment of a core multicomponent tree-adjoining grammar with tree tuples (TT-MCTAG) for German developed using this framework. The framework combines a metagrammar compiler and a parser based on range concatenation grammar (RCG) to check, respectively, the consistency and the correctness of the grammar. The German grammar being developed within this framework already deals with a wide range of scrambling and extraction phenomena.

CiteSeerX

Hochschulschriftenserver - Universität Frankfurt am Main

Description of the CUDF Format

Authors: Treinen Ralf; Zacchiroli Stefano
Publication date: 01/11/2008

This document contains several related specifications that together describe the document formats related to the solver competition which will be organized by Mancoosi. In particular, this document describes:
- DUDF (Distribution Upgradeability Description Format), the document format to be used to submit upgrade problem instances from user machines to a (distribution-specific) database of upgrade problems;
- CUDF (Common Upgradeability Description Format), the document format used to encode upgrade problems, abstracting over distribution-specific details.
Solvers taking part in the competition will be fed input in CUDF format.

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

The comprehension revolution : a twenty-year history of process and practice related to reading comprehension

Authors: Bruce Bertram C.; Pearson P. David
Publication venue: Cambridge, Mass. : Bolt Beranek and Newman, Inc.
Publication date: 01/02/1985

Includes bibliography

Illinois Digital Environment for Access to Learning and Scholarship Repository

An integrated architecture for shallow and deep processing

Authors: Becker Markus; Crysmann Berthold; Frank Anette; Kiefer Bernd; Krieger Hans-Ulrich; Müller Stefan; Neumann Günter; Piskorski Jakub; Schäfer Ulrich; Siegel Melanie; Uszkoreit Hans; Xu Feiyu
Publication date: 21/12/2011

We present an architecture for the integration of shallow and deep NLP components which is aimed at flexible combination of different language technologies for a range of practical current and future applications. In particular, we describe the integration of a high-level HPSG parsing system with different high-performance shallow components, ranging from named entity recognition to chunk parsing and shallow clause recognition. The NLP components enrich a representation of natural language text with layers of new XML meta-information using a single shared data structure, called the text chart. We describe details of the integration methods, and show how information extraction and language checking applications for real-world German text benefit from a deep grammatical analysis.

Hochschulschriftenserver - Universität Frankfurt am Main

An Abstract Machine for Unification Grammars

Author: Wintner Shuly
Publication date: 01/01/1997

This work describes the design and implementation of an abstract machine, Amalia, for the linguistic formalism ALE, which is based on typed feature structures. This formalism is one of the most widely accepted in computational linguistics and has been used for designing grammars in various linguistic theories, most notably HPSG. Amalia is composed of data structures and a set of instructions, augmented by a compiler from the grammatical formalism to the abstract instructions, and a (portable) interpreter of the abstract instructions. The effect of each instruction is defined using a low-level language that can be executed on ordinary hardware. The advantages of the abstract machine approach are twofold. From a theoretical point of view, the abstract machine gives a well-defined operational semantics to the grammatical formalism. This ensures that grammars specified using our system are endowed with well-defined meaning. It makes it possible, for example, to formally verify the correctness of a compiler for HPSG, given an independent definition. From a practical point of view, Amalia is the first system that employs a direct compilation scheme for unification grammars that are based on typed feature structures. The use of Amalia results in much improved performance over existing systems. In order to test the machine on a realistic application, we have developed a small-scale, HPSG-based grammar for a fragment of the Hebrew language, using Amalia as the development platform. This is the first application of HPSG to a Semitic language. Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks, pst-node, psfig, fullname and a macros fil

arXiv.org e-Print Archive

CiteSeerX

Automatic Extraction of Subcategorization from Corpora

Authors: Briscoe Ted; Carroll John
Publication date: 01/01/1997

We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verbs which exhibit multiple complementation patterns, demonstrates that the technique achieves accuracy comparable to previous approaches, which are all limited to a highly restricted set of subcategorization classes. We also demonstrate that a subcategorization dictionary built with the system improves the accuracy of a parser by an appreciable amount. Comment: 8 pages; requires aclap.sty. To appear in ANLP-9

arXiv.org e-Print Archive

CiteSeerX

Crossref

Sussex Research Online

Can Subcategorisation Probabilities Help a Statistical Parser?

Authors: Briscoe Ted; Carroll John; Minnen Guido
Publication date: 01/01/1998

Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text, which shows that this information can significantly improve parse accuracy. Comment: 9 pages, uses colacl.st

arXiv.org e-Print Archive

CiteSeerX

Sussex Research Online

Instructional Basis of Libra

Authors: Farris Michael; Fischer Robert
Publication venue: The University of Kansas
Publication date: 15/02/1995

The University of Kansas: Journals@KU

Biodiversity Informatics