
    A General, Sound and Efficient Natural Language Parsing Algorithm based on Syntactic Constraints Propagation

    This paper presents a new context-free parsing algorithm based on a bidirectional, strictly horizontal strategy that incorporates strong top-down predictions (derivations and adjacencies). From a functional point of view, the parser propagates syntactic constraints, reducing parsing ambiguity. From a computational perspective, the algorithm includes several techniques aimed at improving the manipulation and representation of the structures used.
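    The general idea of letting top-down predictions prune what a parser builds bottom-up can be pictured with a much simpler device than the paper's bidirectional algorithm: a precomputed left-corner (reachability) relation that decides which categories may start a constituent of the goal symbol. The toy grammar and function names below are invented for illustration; this is a sketch of top-down filtering in general, not of the constraint-propagation machinery described above.

```python
# Minimal sketch: only categories that can begin a derivation of the start
# symbol are allowed to start a chart edge at the left edge of the input.
# Toy grammar; not the paper's bidirectional parser.

GRAMMAR = {                      # nonterminal -> list of right-hand sides
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"], ["N"]],
    "VP": [["V", "NP"]],
}

def left_corner_closure(grammar, start="S"):
    """Nonterminals that can appear leftmost in a derivation from `start`."""
    reachable = {start}
    changed = True
    while changed:
        changed = False
        for nt in list(reachable):
            for rhs in grammar.get(nt, []):
                first = rhs[0]
                if first in grammar and first not in reachable:
                    reachable.add(first)
                    changed = True
    return reachable

ALLOWED_AT_START = left_corner_closure(GRAMMAR)

def admit_edge(category, start_pos):
    """Reject edges whose category top-down prediction rules out at position 0."""
    if start_pos == 0 and category in GRAMMAR and category not in ALLOWED_AT_START:
        return False                     # e.g. a VP proposed at position 0 is pruned
    return True

print(ALLOWED_AT_START)                  # the set {'S', 'NP'} (order may vary)
print(admit_edge("VP", 0))               # False: a sentence cannot start with a VP
print(admit_edge("NP", 0))               # True
```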

    An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities

    We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities, given a stochastic context-free grammar and an input string: (a) the probabilities of successive prefixes being generated by the grammar; (b) the probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; (c) the most likely (Viterbi) parse of the string; and (d) the posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. Quantities (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley's top-down control structure. It can process any context-free rule format without conversion to a normal form, and it combines the computations for (a) through (d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracketed inputs and for finding partial parses and their likelihoods on ungrammatical inputs.
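    For intuition, quantity (b), the probability that a nonterminal generates a given substring, is the classical inside probability; the sketch below computes it bottom-up for a toy PCFG in Chomsky normal form. The paper's contribution is to obtain this, together with the prefix probabilities of (a), incrementally in a left-to-right Earley pass without any normal-form conversion; that machinery is not reproduced here, and the grammar and probabilities below are invented.

```python
# Inside probabilities for a toy PCFG in Chomsky normal form:
# chart[(i, j)][A] = P(A =>* words[i:j]).  Illustrates quantity (b) only.

from collections import defaultdict

# Binary rules: (A, B, C) -> prob, meaning A -> B C with that probability
BINARY = {("S", "NP", "VP"): 1.0,
          ("NP", "Det", "N"): 1.0,
          ("VP", "V", "NP"): 1.0}
# Lexical rules: (A, word) -> prob
LEXICAL = {("Det", "the"): 1.0,
           ("N", "dog"): 0.5, ("N", "cat"): 0.5,
           ("V", "saw"): 1.0}

def inside(words):
    n = len(words)
    chart = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):                      # lexical level
        for (A, word), p in LEXICAL.items():
            if word == w:
                chart[(i, i + 1)][A] += p
    for span in range(2, n + 1):                       # longer spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in BINARY.items():
                    chart[(i, j)][A] += p * chart[(i, k)][B] * chart[(k, j)][C]
    return chart

words = "the dog saw the cat".split()
chart = inside(words)
print(chart[(0, len(words))]["S"])   # P(S =>* "the dog saw the cat") = 0.25
```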

    Formal Languages and Compilation

    This textbook describes the essential principles and methods used for defining the syntax of artificial languages, and for designing efficient parsing algorithms and syntax-directed translators with semantic attributes. A comprehensive selection of topics is presented within a rigorous, unified framework, illustrated by numerous practical examples. Features and topics: presents a novel conceptual approach to parsing algorithms that applies to extended BNF grammars, together with a parallel parsing algorithm; supplies supplementary teaching tools, including course slides and exercises with solutions, at an associated website; unifies the concepts and notations used in different approaches, enabling an extended coverage of methods with a reduced number of definitions; systematically discusses ambiguous forms, allowing readers to avoid pitfalls when designing grammars; describes all algorithms in pseudocode, so that detailed knowledge of a specific programming language is not necessary; makes extensive use of theoretical models of automata, transducers and formal grammars; includes concise coverage of algorithms for processing regular expressions and finite automata; and introduces static program analysis based on flow equations. This clearly written, classroom-tested textbook is an ideal guide to the fundamentals of this field for advanced undergraduate and graduate students in computer science and computer engineering. Some background in programming is required, and readers should also be familiar with basic set theory, algebra and logic.

    An Efficient Implementation of the Head-Corner Parser

    This paper describes an efficient and robust implementation of a bidirectional, head-driven parser for constraint-based grammars. This parser is developed for the OVIS system: a Dutch spoken dialogue system in which information about public transport can be obtained by telephone. After a review of the motivation for head-driven parsing strategies, and head-corner parsing in particular, a non-deterministic version of the head-corner parser is presented. A memoization technique is applied to obtain a fast parser. A goal-weakening technique is introduced which greatly improves average-case efficiency, both in terms of speed and space requirements. I argue in favor of such a memoization strategy with goal-weakening over ordinary chart parsers, because it can be applied selectively and therefore enormously reduces the space requirements of the parser, while no practical loss in time efficiency is observed. In fact, experiments are described in which head-corner and left-corner parsers implemented with selective memoization and goal weakening outperform 'standard' chart parsers. The experiments include the grammar of the OVIS system and the Alvey NL Tools grammar. Head-corner parsing is a mix of bottom-up and top-down processing. Certain approaches towards robust parsing require purely bottom-up processing, so head-corner parsing may seem unsuitable for such robust parsing techniques. However, it is shown how underspecification (which arises very naturally in a logic programming environment) can be used in the head-corner parser to allow such robust parsing techniques. A particular robust parsing model is described which is implemented in OVIS.
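    The combination of memoization and goal weakening can be pictured with a tiny stand-in: before a parse goal is stored in the table it is weakened (some of its constraints are dropped), so many specific goals share one entry, and the dropped constraints are re-checked on the returned results. The goal representation, solver, and feature names below are invented; the sketch only illustrates the bookkeeping, not the OVIS head-corner parser itself.

```python
# Selective memoization with goal weakening: memoize on a weakened goal key,
# then filter the tabled results against the caller's full constraints.

MEMO = {}

def weaken(goal):
    """Drop fine-grained constraints, keeping only category and position
    (an assumed stand-in for abstracting over feature structures)."""
    category, position, constraints = goal
    return (category, position)

def satisfies(result, constraints):
    """Re-check the constraints that weakening threw away."""
    return all(result.get(k) == v for k, v in constraints.items())

def solve(goal, solver):
    """Solve the weakened goal once; each caller filters the shared results."""
    category, position, constraints = goal
    key = weaken(goal)
    if key not in MEMO:
        MEMO[key] = solver(category, position)      # the expensive parse step
    return [r for r in MEMO[key] if satisfies(r, constraints)]

# Hypothetical usage: two goals differing only in constraints reuse one entry.
def toy_solver(category, position):
    return [{"cat": category, "from": position, "num": "sg"},
            {"cat": category, "from": position, "num": "pl"}]

print(solve(("np", 0, {"num": "sg"}), toy_solver))   # triggers one solver call
print(solve(("np", 0, {"num": "pl"}), toy_solver))   # served from the memo table
```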

    On Sentence Parsing Techniques for Speech Translation (音声翻訳における文解析技法について)

    Doctoral thesis (Doctor of Engineering, thesis doctorate), Kyoto University, Otsu No. 8652 (Ron-Ko-Haku No. 2893); conferred under Article 4, Paragraph 2 of the Degree Regulations. Examining committee: Prof. Makoto Nagao (chief examiner), Prof. Shuji Doshita, Prof. Katsuo Ikeda. The full-text data is a PDF conversion of image files produced in the National Diet Library's FY2010 digitization of doctoral dissertations.

    One Parser to Rule Them All

    Despite the long history of research in parsing, constructing parsers for real programming languages remains a difficult and painful task. In recent decades, various parser generators have emerged that allow the construction of parsers from a BNF-like specification. Even today, however, many parsers are handwritten or only partly generated, and they include various hacks to deal with peculiarities of programming languages. The main problem is that current declarative syntax definition techniques are based on pure context-free grammars, while many constructs found in programming languages require context information. In this paper we propose a parsing framework that embraces context information in its core. Our framework is based on data-dependent grammars, which extend context-free grammars with arbitrary computation, variable binding and constraints. We present an implementation of our framework on top of the Generalized LL (GLL) parsing algorithm, and show how common idioms in the syntax of programming languages, such as (1) lexical disambiguation filters, (2) operator precedence, (3) indentation-sensitive rules, and (4) conditional preprocessor directives, can be mapped to data-dependent grammars. We report initial experience with our framework by parsing more than 20,000 Java, C#, Haskell, and OCaml source files.
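    The core idea of a data-dependent grammar, nonterminals that bind variables and evaluate constraints during parsing, can be shown with a toy indentation-sensitive rule in the spirit of idiom (3). The rule names, input format and parser below are invented and use plain recursive descent, not the GLL-based framework of the paper.

```python
# A toy data-dependent rule: block(pIndent) ::= { line(i) [i > pIndent] }
# The bracketed test is the data-dependent constraint; parent_indent is the
# bound variable threaded through the rule.

def indent_of(line):
    return len(line) - len(line.lstrip(" "))

def parse_block(lines, pos, parent_indent):
    """Collect lines indented strictly deeper than the parent construct."""
    children, i = [], pos
    while i < len(lines):
        if indent_of(lines[i]) <= parent_indent:   # constraint fails: block ends
            break
        children.append(lines[i].strip())
        i += 1
    return children, i

source = [
    "if x:",
    "    a = 1",
    "    b = 2",
    "done()",
]
body, next_pos = parse_block(source, 1, indent_of(source[0]))
print(body)               # ['a = 1', 'b = 2']
print(source[next_pos])   # 'done()' stays unconsumed: its indent violates the constraint
```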

    GALENA: tabular DCG parsing for natural languages

    We present a definite-clause-based parsing environment for natural languages, whose operational model is the dynamic interpretation of logical push-down automata. We briefly explain our design decisions in terms of a set of properties that practical natural language processing systems should incorporate. The aim is to show both the advantages and the drawbacks of our approach.
    Funding: Gobierno de España (HF96-36); Xunta de Galicia (XUGA10505B96); Xunta de Galicia (XUGA20402B9)
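    The tabular flavour of DCG parsing, reading grammar rules as relations over string positions and tabling the results for each goal, can be sketched as follows. The toy grammar, word list, and memoization via functools.lru_cache are invented stand-ins; GALENA's actual interpretation of logical push-down automata for full definite-clause grammars is not reproduced here.

```python
# Grammar rules read as relations over string positions, with per-goal tabling:
# derive(symbol, start) returns every end position j such that symbol spans
# WORDS[start:j].  The lru_cache plays the role of the parse table.

from functools import lru_cache

WORDS = ("the", "cat", "sleeps")

RULES = {                       # nonterminal -> alternative rule bodies
    "s":  [("np", "vp")],
    "np": [("det", "n")],
    "vp": [("v",)],
}
LEX = {"det": {"the"}, "n": {"cat"}, "v": {"sleeps"}}

@lru_cache(maxsize=None)
def derive(symbol, start):
    ends = set()
    if symbol in LEX:                    # preterminal: match one word
        if start < len(WORDS) and WORDS[start] in LEX[symbol]:
            ends.add(start + 1)
    for body in RULES.get(symbol, []):   # nonterminal: thread positions
        frontier = {start}
        for child in body:
            frontier = {j for i in frontier for j in derive(child, i)}
        ends |= frontier
    return frozenset(ends)

print(len(WORDS) in derive("s", 0))      # True: "the cat sleeps" is an s
```

    Note that this naive top-down tabulation would still loop on left-recursive clauses; handling those is exactly where a proper tabular interpretation of the underlying push-down automaton, as pursued in the paper, pays off.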