Using definite clause grammars to build a global system for analyzing collections of documents
Collections of documents are sets of heterogeneous documents, such as a specific series of ancient books, whose structural and semantic properties link them together. A particular collection contains document images with specific physical layouts, such as text pages or full-page illustrations, appearing in a specific order. Its contents, such as journal articles, may span several pages that are not necessarily consecutive, which produces strong dependencies between the interpretations of individual pages. In order to build an analysis system that can bring contextual information from the collection to the appropriate recognition modules for each page, we propose to express the structural and semantic properties of a collection with a definite clause grammar. This is made possible by representing collections as streams of document descriptors, and by using extensions to the formalism that we present here. We are then able to automatically generate a parser dedicated to a collection. Besides allowing structural variations and complex information flows, we also show that this approach enables the design of analysis stages, on a document or a set of documents. The value of using context is illustrated with several examples and their formalization in this framework.
An Efficient Implementation of the Head-Corner Parser
This paper describes an efficient and robust implementation of a
bi-directional, head-driven parser for constraint-based grammars. This parser
is developed for the OVIS system: a Dutch spoken dialogue system in which
information about public transport can be obtained by telephone.
After a review of the motivation for head-driven parsing strategies, and
head-corner parsing in particular, a non-deterministic version of the
head-corner parser is presented. A memoization technique is applied to obtain a
fast parser. A goal-weakening technique is introduced which greatly improves
average case efficiency, both in terms of speed and space requirements.
I argue in favor of such a memoization strategy with goal-weakening in
comparison with ordinary chart-parsers because such a strategy can be applied
selectively and therefore enormously reduces the space requirements of the
parser, while no practical loss in time-efficiency is observed. On the
contrary, experiments are described in which head-corner and left-corner
parsers implemented with selective memoization and goal weakening outperform
'standard' chart parsers. The experiments include the grammar of the OVIS
system and the Alvey NL Tools grammar.
Head-corner parsing is a mix of bottom-up and top-down processing. Certain
approaches towards robust parsing require purely bottom-up processing.
Therefore, it seems that head-corner parsing is unsuitable for such robust
parsing techniques. However, it is shown how underspecification (which arises
very naturally in a logic programming environment) can be used in the
head-corner parser to allow such robust parsing techniques. A particular robust
parsing model is described which is implemented in OVIS.
Comment: 31 pages, uses cl.st
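The idea of applying memoization selectively, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the head-corner parser from the paper and it does not implement goal weakening: it is a plain top-down recognizer in Python, and the grammar, the lexicon, and the choice of which categories to memoize (`MEMOIZED`) are all assumptions made up for the illustration.

```python
# Hypothetical toy grammar and lexicon (not taken from the paper).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "N", "PP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
}
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "park": "N",
           "saw": "V", "in": "P"}

# Assumption: only these categories are worth caching. Every other
# goal is simply recomputed, so the table stays small.
MEMOIZED = {"NP", "PP"}


def make_recognizer(words):
    """Return (ends, memo); ends(cat, i) is the set of positions at
    which a phrase of category cat starting at position i can end."""
    memo = {}

    def ends(cat, i):
        if cat in MEMOIZED and (cat, i) in memo:
            return memo[(cat, i)]
        if cat not in GRAMMAR:
            # Lexical category: consume exactly one matching word.
            ok = i < len(words) and LEXICON.get(words[i]) == cat
            result = frozenset({i + 1}) if ok else frozenset()
        else:
            found = set()
            for rhs in GRAMMAR[cat]:
                positions = {i}
                for symbol in rhs:
                    positions = {e for p in positions for e in ends(symbol, p)}
                found |= positions
            result = frozenset(found)
        if cat in MEMOIZED:
            memo[(cat, i)] = result
        return result

    return ends, memo


def recognize(sentence):
    words = sentence.split()
    ends, _ = make_recognizer(words)
    return len(words) in ends("S", 0)
```

Calling `recognize("the dog saw the cat in the park")` runs the recognizer over the whole sentence; the memo table only ever holds entries for `NP` and `PP` goals, which is the space saving that selective memoization is meant to buy.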
Principles and Implementation of Deductive Parsing
We present a system for generating parsers based directly on the metaphor of
parsing as deduction. Parsing algorithms can be represented directly as
deduction systems, and a single deduction engine can interpret such deduction
systems so as to implement the corresponding parser. The method generalizes
easily to parsers for augmented phrase structure formalisms, such as
definite-clause grammars and other logic grammar formalisms, and has been used
for rapid prototyping of parsing algorithms for a variety of formalisms
including variants of tree-adjoining grammars, categorial grammars, and
lexicalized context-free grammars.
Comment: 69 pages, includes full Prolog cod
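The parsing-as-deduction view described in the abstract above can be sketched with a small agenda-driven engine. The sketch below is written in Python rather than the Prolog used in the paper, it hard-wires a single inference rule (the CKY rule for grammars in Chomsky normal form) instead of interpreting arbitrary deduction systems, and its grammar and lexicon are hypothetical.

```python
from collections import deque

# Hypothetical toy grammar in Chomsky normal form (not from the paper).
BINARY = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}


def deduce(words):
    """Compute all derivable items (A, i, j), meaning that category A
    spans words[i:j], using an agenda and a chart."""
    chart = set()
    agenda = deque()
    # Axioms: one item for each lexical category of each word.
    for i, word in enumerate(words):
        for cat in LEXICON.get(word, ()):
            agenda.append((cat, i, i + 1))
    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue  # already proved; do not process twice
        chart.add(item)
        a, i, j = item
        # Inference rule: from (B, i, k) and (C, k, j) with A -> B C,
        # conclude (A, i, j). The new item may be either premise.
        for b, k, l in list(chart):
            if k == j:  # new item on the left, chart item on the right
                for parent in BINARY.get((a, b), ()):
                    agenda.append((parent, i, l))
            if l == i:  # chart item on the left, new item on the right
                for parent in BINARY.get((b, a), ()):
                    agenda.append((parent, k, j))
    return chart


def recognize(words, goal="S"):
    # The goal item spans the whole input.
    return (goal, 0, len(words)) in deduce(words)
```

For example, `recognize("the dog saw the cat".split())` checks whether the goal item for `S` over the full input is derivable. Swapping in a different parsing algorithm would mean changing the axioms and the inference rule while keeping the agenda loop, which is the separation the abstract describes.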
Comparison of Surface Language Generators: A Case Study in Choice of Connectives
Language generation systems have used a variety of grammatical formalisms for producing syntactic structure, and yet there has been little research evaluating the formalisms for the specifics of the generation task. In our work at Columbia we have primarily used a unification-based formalism, a Functional Unification Grammar (FUG) [Kay 79], and have found it well suited for many of the generation tasks we have addressed. Over the course of the past 5 years we have also explored the use of various off-the-shelf parsing formalisms, including an Augmented Transition Network (ATN) [Woods 70], a Bottom-Up Chart Parser (BUP) [Finin 84], and a Definite Clause Grammar (DCG) [Pereira & Warren 80]. In this paper, we identify the characteristics of FUG that we find useful for generation and contrast these with characteristics of the parsing formalisms and with other formalisms that are typically used for generation.
A Contrastive Study of Functional Unification Grammar for Surface Language Generation: A Case Study in Choice of Connectives
Language generation systems have used a variety of grammatical formalisms for producing syntactic structure, and yet there has been little research evaluating the formalisms for the specifics of the generation task. In our work at Columbia we have primarily used a unification-based formalism, a Functional Unification Grammar (FUG) [Kay 79], and have found it well suited for many of the generation tasks we have addressed. Over the course of the past 5 years we have also explored the use of various off-the-shelf parsing formalisms, including an Augmented Transition Network (ATN) [Woods 70], a Bottom-Up Chart Parser (BUP) [Finin 84], and a Definite Clause Grammar (DCG) [Pereira and Warren 80]. In contrast, we have found that parsing formalisms do not have the same benefits for the generation task.
Robust Grammatical Analysis for Spoken Dialogue Systems
We argue that grammatical analysis is a viable alternative to concept
spotting for processing spoken input in a practical spoken dialogue system. We
discuss the structure of the grammar, and a model for robust parsing which
combines linguistic sources of information and statistical sources of
information. We discuss test results suggesting that grammatical processing
allows fast and accurate processing of spoken input.
Comment: Accepted for JNL
SLR inference: An inference system for fixed-mode logic programs, based on SLR parsing
Definite-clause grammars (DCGs) generalize context-free grammars in such a way that Prolog can be used as a parser in the presence of context-sensitive information. Prolog's proof procedure, however, is based on backtracking, which may be a source of inefficiency. Parsers for context-free grammars that use backtracking, for instance, were soon replaced by more efficient methods, such as LR parsers. This suggests incorporating the principles underlying LR parsing into a parser for grammars with context-sensitive information. We present a technique that applies a transformation to the program/grammar by adding leaves to the proof/parse trees and placing the contextual information in such leaves. An inference system is then easily obtained from an LR parser, since only the parts dealing with terminals (which appear at the leaves) must be modified. Although our method is restricted to programs with fixed modes, it may be preferable to DCGs under Prolog for some programs.
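To make concrete what "context-sensitive information" means for a DCG executed by backtracking, here is a minimal sketch in Python. It imitates the classic DCG for the language a^n b^n c^n, where an extra argument carries the count n between nonterminals. Python generators stand in for Prolog's backtracking, and an equality check stands in for unification; the function names are hypothetical, and this is not the SLR-based technique proposed in the abstract.

```python
def run(token, s, i):
    """Mimics the DCG rules
           run(0)   --> [].
           run(N+1) --> [token], run(N).
    Yields every (count, end_position) alternative, in the order a
    backtracking prover would try them."""
    yield 0, i
    if i < len(s) and s[i] == token:
        for n, j in run(token, s, i + 1):
            yield n + 1, j


def anbncn(s):
    """Mimics  s --> run_a(N), run_b(N), run_c(N).
    The shared argument N is the contextual information that a plain
    context-free grammar cannot express."""
    for n, i in run("a", s, 0):
        for m, j in run("b", s, i):
            if m != n:
                continue  # counts disagree: backtrack
            for k, end in run("c", s, j):
                if k == n and end == len(s):
                    return True
    return False
```

A call such as `anbncn("aabbcc")` explores the alternatives for each run in turn and abandons a branch as soon as the counts disagree. That repeated trial and retreat is the backtracking cost the abstract refers to when it motivates an LR-style alternative.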