Search CORE

220 research outputs found

Substring parsing for arbitrary context-free grammars

Author: Koorn J.W.C.
Rekers J. G. (Jan)
Publication venue: CWI
Publication date: 01/01/1990
Field of study

Syntax Error Handling in Scannerless Generalized LR Parsers

Author: Valkering R.
Publication venue: University of Amsterdam
Publication date: 01/08/2007
Field of study

This thesis is about a master's project as part of the one year master study 'Software-engineering'. This project is about methods for improving the quality of reporting and handling of syntax errors that are produced by a scannerless generalized left-to-right rightmost (SGLR) parser, and is done at Centrum voor Wiskunde en Informatica (CWI) in Amsterdam. SGLR is a parsing algorithm developed as part of Generic Language Technol- ogy Project at SEN1, one of the themes at CWI. SGLR is based on the GLR algorithm developed by Tomita. SGLR parsers are able to recognize arbitrary context-free grammars, which enables grammar modularization. Because SGLR does not use a separate scan- ner, also layout and comments are incorporated into the parse tree. This makes SGLR a powerful tool for code analysis and code transformations. A drawback is the way SGLR handles syntax errors. When a syntax error is detected, the current implementation of SGLR halts the parsing process and reports back to the user the point of error detection only. The text at the point of error detection is not necessarily the text that has to be changed to repair the error. This thesis describes three kinds of information that could be reported to the user, and how they could be derived from the parse process when an error is detected. These are: - The structure of the already parsed part of the input in the form of a partial parse tree. - A listing of expected symbols; those tokens or token sequences that are accept- able instead of the erroneous text. - The current parser state which could be translated into language dependent informative messages. Also two ways of recovering from an error condition are described. These are non-correcting recovery methods that enable SGLR to always return a parse tree that can be unparsed into the original input sentence. - A method that halts parsing but incorporates the remainder of the input into the parse tree. - A method that resumes parsing by means of substring parsing. During the course of the project the described approaches have been imple- mented and incorporated in the implementation of SGLR as used by the Meta- Environment, some fully, some more or less prototyped

CWI's Institutional Repository

Parallel parsing made practical

Author: Barenghi Alessandro
CRESPI REGHIZZI Stefano
Mandrioli Dino
Panella Federica
Pradella Matteo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

The property of local parsability allows to parse inputs through inspecting only a bounded-length string around the current token. This in turn enables the construction of a scalable, data-parallel parsing algorithm, which is presented in this work. Such an algorithm is easily amenable to be automatically generated via a parser generator tool, which was realized, and is also presented in the following. Furthermore, to complete the framework of a parallel input analysis, a parallel scanner can also combined with the parser. To prove the practicality of a parallel lexing and parsing approach, we report the results of the adaptation of JSON and Lua to a form fit for parallel parsing (i.e. an operator-precedence grammar) through simple grammar changes and scanning transformations. The approach is validated with performance figures from both high performance and embedded multicore platforms, obtained analyzing real-world inputs as a test-bench. The results show that our approach matches or dominates the performances of production-grade LR parsers in sequential execution, and achieves significant speedups and good scaling on multi-core machines. The work is concluded by a broad and critical survey of the past work on parallel parsing and future directions on the integration with semantic analysis and incremental parsing

Archivio istituzionale della ricerca - Politecnico di Milano

CHR Grammars

Author: Christiansen Henning
Publication venue
Publication date: 01/01/2004
Field of study

A grammar formalism based upon CHR is proposed analogously to the way Definite Clause Grammars are defined and implemented on top of Prolog. These grammars execute as robust bottom-up parsers with an inherent treatment of ambiguity and a high flexibility to model various linguistic phenomena. The formalism extends previous logic programming based grammars with a form of context-sensitive rules and the possibility to include extra-grammatical hypotheses in both head and body of grammar rules. Among the applications are straightforward implementations of Assumption Grammars and abduction under integrity constraints for language analysis. CHR grammars appear as a powerful tool for specification and implementation of language processors and may be proposed as a new standard for bottom-up grammars in logic programming. To appear in Theory and Practice of Logic Programming (TPLP), 2005Comment: 36 pp. To appear in TPLP, 200

arXiv.org e-Print Archive

Roskilde Universitet

Providing rapid feedback in generated modular language environments

Author: Eelco Visser
Emma Nilsson-Nyman
Gamma E.
Kats L. C. L.
Kiczales G.
Krahn H.
Lavie A.
Lennart C.L. Kats
Maartje de Jonge
Tomita M.
Valkering R.
van Deursen A.
Visser E.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date
Field of study

Crossref

Parser generation for interactive environments

Author: Rekers J. G. (Jan)
Publication venue
Publication date: 31/01/1992
Field of study

CWI's Institutional Repository

Automatic error recovery for LR parsers in theory and practice

Author: Dain Julia Anne
Publication venue
Publication date
Field of study

This thesis argues the need for good syntax error handling schemes in language translation systems such as compilers, and for the automatic incorporation of such schemes into parser-generators. Syntax errors are studied in a theoretical framework and practical methods for handling syntax errors are presented. The theoretical framework consists of a model for syntax errors based on the concept of a minimum prefix-defined error correction,a sentence obtainable from an erroneous string by performing edit operations at prefix-defined (parser defined) errors. It is shown that for an arbitrary context-free language, it is undecidable whether a better than arbitrary choice of edit operations can be made at a prefix-defined error. For common programming languages,it is shown that minimum-distance errors and prefix-defined errors do not necessarily coincide, and that there exists an infinite number of programs that differ in a single symbol only; sets of equivalent insertions are exhibited. Two methods for syntax error recovery are, presented. The methods are language independent and suitable for automatic generation. The first method consists of two stages, local repair followed if necessary by phrase-level repair. The second method consists of a single stage in which a locally minimum-distance repair is computed. Both methods are developed for use in the practical LR parser-generator yacc, requiring no additional specifications from the user. A scheme for the automatic generation of diagnostic messages in terms of the source input is presented. Performance of the methods in practice is evaluated using a formal method based on minimum-distance and prefix-defined error correction. The methods compare favourably with existing methods for error recovery

Warwick Research Archives Portal Repository

音声翻訳における文解析技法について

Author: Tomita Masaru
Publication venue: 京都大学
Publication date: 23/07/1994
Field of study

本文データは平成22年度国立国会図書館の学位論文(博士)のデジタル化実施により作成された画像ファイルを基にpdf変換したものである京都大学0048新制・論文博士博士(工学)乙第8652号論工博第2893号新制||工||968(附属図書館)UT51-94-R411(主査)教授長尾真, 教授堂下修司, 教授池田克夫学位規則第4条第2項該当Doctor of EngineeringKyoto UniversityDFA

Kyoto University Research Information Repository

Incremental parsing algorithms for speech-editing mathematics and computer code

Author: Isaac Marina Jennifer
Publication venue: Kingston University
Publication date
Field of study

The provision of speech control for editing plain language text has existed for a long time, but does not extend to structured content such as mathematics. The requirements of a user interface for a spoken mathematics editor are explored through the lens of an intuitive natural user interface (NUI) for speech control, the desired properties of which are based on a combination of existing literature on NUIs and intuitive user interfaces. An important aspect of an intuitive NUI is timely update of display of the content in response to editing actions. This is not feasible using batch parsing alone, and this issue will be more serious for larger documents such as computer program code. The solution is an incremental parser designed to work with operator precedence (OP) grammars. The contribution to knowledge provided by this thesis is to improve the efficiency in terms of processing time, of the OP incremental parsing algorithm developed by Heeman, and extend it to handle the distfix (mixfix) operators described by Attanayake to model brackets and mathematical functions. This is implemented successfully for the TalkMaths system and shows a greatly reduced response time compared with using batch scanning and parsing alone. The author is not aware of any other incremental OP parser that handles such operators. Furthermore, a proposal is made for modifications to the data structures produced by Attanayake's parser, along with appropriate adjustments to the incremental parser, that will in the future, facilitate application of OP grammar to program code or other structured content by changing the definition of its content language

Kingston University Research Repository