Syntactic analysis of LR(k) languages
PhD Thesis
A method of syntactic analysis, termed LA(m)LR(k), is discussed
theoretically. Knuth's LR(k) algorithm is included as the special
case m = k. A simpler variant, SLA(m)LR(k), is also described, which
in the case SLA(k)LR(0) is equivalent to the SLR(k) algorithm as
defined by DeRemer. Both variants have the LR(k) property of
immediate detection of syntactic errors.
The case m = 1, k = 0 is examined in detail, when the methods
provide a practical parsing technique of greater generality than
precedence methods in current use. A formal comparison is made with
the weak precedence algorithm.
The implementation of an SLA(1)LR(0) parser (SLR) is described,
involving numerous space and time optimisations. Of importance is a
technique for bypassing unnecessary steps in a syntactic derivation.
Direct comparisons are made, primarily with the simple precedence
parser of the highly efficient Stanford AlgolW compiler, and confirm
the practical feasibility of the SLR parser.
The Science Research Council
Efficient Semiring-Weighted Earley Parsing
This paper provides a reference description, in the form of a deduction
system, of Earley's (1970) context-free parsing algorithm with various
speed-ups. Our presentation includes a known worst-case runtime improvement
from Earley's $O(N^3|G||R|)$, which is unworkable for the large grammars that
arise in natural language processing, to $O(N^3|G|)$, which matches the
runtime of CKY on a binarized version of the grammar $G$. Here $N$ is the
length of the sentence, $|R|$ is the number of productions in $G$, and $|G|$ is
the total length of those productions. We also provide a version that achieves
runtime of $O(N^3|M|)$ with $|M| \leq |G|$ when the grammar is represented
compactly as a single finite-state automaton $M$ (this is partly novel). We
carefully treat the generalization to semiring-weighted deduction,
preprocessing the grammar like Stolcke (1995) to eliminate deduction cycles,
and further generalize Stolcke's method to compute the weights of sentence
prefixes. We also provide implementation details for efficient execution,
ensuring that on a preprocessed grammar, the semiring-weighted versions of our
methods have the same asymptotic runtime and space requirements as the
unweighted methods, including sub-cubic runtime on some grammars.
Comment: Main conference long paper at ACL 202
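As a rough illustration of the predict, scan and complete steps that a deduction-system presentation of Earley's algorithm formalises, here is a minimal unweighted recognizer. It is only a sketch and is not the paper's method: it includes none of the speed-ups or semiring weights described above, it assumes the grammar has no empty right-hand sides, and the names `Item` and `earley_recognize` and the dictionary grammar encoding are hypothetical choices made for this example.

```python
from collections import namedtuple

# An Earley item: production lhs -> rhs, a dot position inside rhs,
# and the input position where the item started.
Item = namedtuple("Item", "lhs rhs dot start")


def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Assumptions of this sketch: no empty right-hand sides, and any symbol
    that is not a key of `grammar` is a terminal.
    """
    n = len(tokens)
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add(Item(start, rhs, 0, 0))

    for k in range(n + 1):
        agenda = list(chart[k])
        while agenda:
            item = agenda.pop()
            derived = []
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # PREDICT: expand the nonterminal after the dot.
                    for rhs in grammar[sym]:
                        derived.append((k, Item(sym, rhs, 0, k)))
                elif k < n and tokens[k] == sym:
                    # SCAN: the terminal after the dot matches the next token.
                    derived.append((k + 1, item._replace(dot=item.dot + 1)))
            else:
                # COMPLETE: advance every item that was waiting for item.lhs.
                for parent in list(chart[item.start]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        derived.append((k, parent._replace(dot=parent.dot + 1)))
            for pos, new in derived:
                if new not in chart[pos]:
                    chart[pos].add(new)
                    if pos == k:
                        agenda.append(new)

    return any(it.lhs == start and it.dot == len(it.rhs) and it.start == 0
               for it in chart[n])


# A small left-recursive toy grammar: S -> S + T | T, T -> a
toy_grammar = {"S": [("S", "+", "T"), ("T",)], "T": [("a",)]}
accepted = earley_recognize(toy_grammar, "S", ["a", "+", "a"])
rejected = earley_recognize(toy_grammar, "S", ["a", "+"])
```

The toy grammar is left-recursive on purpose, since handling such grammars without transformation is one of the usual reasons for choosing an Earley-style parser.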
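On Sentence Analysis Techniques in Speech Translation (音声翻訳における文解析技法について)
The full-text data is a PDF converted from image files produced by the National Diet Library's fiscal year 2010 (Heisei 22) digitization of doctoral dissertations.
Kyoto University, 0048, new system, doctorate by dissertation, Doctor of Engineering, Otsu No. 8652, Ron-Ko-Haku No. 2893, 新制||工||968 (University Library), UT51-94-R411
Examiners: Professor Makoto Nagao (chief examiner), Professor Shuji Doshita, Professor Katsuo Ikeda
Conferred under Article 4, Paragraph 2 of the Degree Regulations
Doctor of Engineering, Kyoto University
DFA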
Parsing Schemata
Parsing schemata provide a general framework for specification, analysis and comparison of (sequential and/or parallel) parsing algorithms. A grammar specifies implicitly what the valid parses of a sentence are; a parsing algorithm specifies explicitly how to compute these. Parsing schemata form a well-defined level of abstraction in between grammars and parsing algorithms. A parsing schema specifies the types of intermediate results that can be computed by a parser, and the rules that allow a given set of such results to be expanded with new results. A parsing schema does not specify the data structures, control structures, and (in case of parallel processing) communication structures that are to be used by a parser.
Part I, Exposition, gives a general introduction to the ideas that are worked out in the following parts.
Part II, Foundation, unfolds a mathematical theory of parsing schemata. Different kinds of relations between parsing schemata are formally introduced and illustrated with examples drawn from the parsing literature.
Part III, Application, discusses a series of applications of parsing schemata.
- Feature percolation in unification grammar parsing can be described in an elegant, legible notation.
- Because of the absence of algorithmic detail, parsing schemata can be used to get a formal grip on highly complicated algorithms. We give substance to this claim by means of a thorough analysis of Left-Corner and Head-Corner chart parsing.
- As an example of structural similarity of parsers, despite differences in form and appearance, we show that the underlying parsing schemata of Earley's algorithm and Tomita's algorithm are virtually identical. Using this structural correspondence we can obtain a novel parallel parser by cross-fertilizing a parallel Earley parser with Tomita's graph-structured stack.
- Parsing schemata can be implemented straightforwardly by boolean circuits. This means that, in principle, parsing schemata can be coded directly into hardware.
Part IV, Perspective, discusses the prospects for natural language parsing applications and draws some conclusions. An important observation is that the theoretical and practical parts of the book reinforce each other. The proposed framework is abstract enough to allow a thorough mathematical treatment and practical enough to allow rewriting a variety of real parsing algorithms (i.e. seriously proposed in the literature, not toy examples) in a clear and coherent way.
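To make the idea of "intermediate results plus rules for deriving new ones" concrete, the sketch below treats CKY recognition as a set of items (A, i, j) closed under a single combination rule. This is one possible reading under stated assumptions, not the book's notation: the grammar is assumed to be in Chomsky normal form, and the function name `cky_schema_closure` and the dictionary encodings are hypothetical. The agenda loop is just one arbitrary control structure for computing the closure, which a schema itself would leave unspecified.

```python
def cky_schema_closure(binary_rules, lexical_rules, tokens):
    """Return every derivable item (A, i, j): A derives tokens[i:j].

    binary_rules maps a pair (B, C) to the nonterminals A with A -> B C.
    lexical_rules maps a word to the nonterminals A with A -> word.
    Both encodings are assumptions made for this sketch.
    """
    # Initial items: one per word and per lexical rule that covers it.
    items = {(a, i, i + 1)
             for i, word in enumerate(tokens)
             for a in lexical_rules.get(word, ())}
    agenda = list(items)
    while agenda:
        x = agenda.pop()
        for y in list(items):
            # Try x to the left of y, then y to the left of x.
            for (b, i, k), (c, k2, j) in ((x, y), (y, x)):
                if k != k2:
                    continue  # the two spans are not adjacent
                # Rule: from (B, i, k) and (C, k, j) with A -> B C infer (A, i, j).
                for a in binary_rules.get((b, c), ()):
                    new = (a, i, j)
                    if new not in items:
                        items.add(new)
                        agenda.append(new)
    return items


binary = {("NP", "VP"): ["S"], ("V", "NP"): ["VP"]}
lexical = {"she": ["NP"], "eats": ["V"], "fish": ["NP"]}
derived_items = cky_schema_closure(binary, lexical, ["she", "eats", "fish"])
```

A sentence is recognised when the item for the start symbol spanning the whole input, here `("S", 0, 3)`, is in the returned set.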
LR(k) sparse-parsers and their optimisation
PhD Thesis
A method of syntactic analysis is developed which
is believed to surpass all known competitors in all major
respects.
The method is based upon that associated with the
LR(k) grammars but is faster because it bypasses all
reduction steps concerned with 'chain' productions. These
are freely selected productions which are considered
semantically irrelevant and whose right parts consist of
just a single symbol. The parses produced by the method
are 'sparse' in that they contain no references to chain
productions - they are termed 'chain-free' parses.
The CFLR(k) grammars are introduced as the largest
class which can be Chain-Free parsed from Left to Right while looking k symbols ahead of the current point of the
parse. The properties of these grammars are examined in
detail and their relationship to the conventional LR(k)
grammars is explored. Techniques are presented for testing
grammars for the CFLR(k) property and for constructing
chain-free parsers for those grammars possessing the
property. Methods are also presented for converting
ordinary LR(k) parsers into chain-free parsers.
CFLR(k) parsers are more widely applicable than
their LR(k) counterparts, are faster and provide the same
excellent detection of syntactic errors. Unfortunately they
also tend to be rather larger. A simple optimization is
presented which completely overcomes this single disadvantage
without sacrificing any of the advantages of the
method.
These theoretical techniques are adapted to provide
truly practical chain-free parsers based on the conventional
SLR and LALR parsing methods. Detailed consideration
is given to use of 'default reductions' and related
techniques for achieving compact representations of these
parsers. The resulting chain-free parsers are not only
faster than their ordinary counterparts, but probably
smaller too. We believe their advantages are such that they
should substantially replace other parsing methods currently
used in programming language compilers.
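To show what a 'chain-free' parse looks like, the sketch below removes nodes for selected chain productions from an ordinary parse tree. This is only an illustration of the output and is not the thesis's technique, which bypasses the chain reductions during parsing itself. The tree encoding, the function name `chain_free` and the choice of chain productions are all hypothetical.

```python
def chain_free(tree, chain_productions):
    """Return `tree` with every node built by a chain production removed.

    A tree is either a terminal (a string) or a pair (label, children).
    chain_productions is a set of (lhs, rhs_symbol) pairs: the single-symbol
    productions chosen as semantically irrelevant.
    """
    if isinstance(tree, str):
        return tree
    label, children = tree
    if len(children) == 1:
        only = children[0]
        only_label = only if isinstance(only, str) else only[0]
        # Test against the ORIGINAL child label, so that a run of chain
        # productions such as E -> T -> F collapses completely.
        if (label, only_label) in chain_productions:
            return chain_free(only, chain_productions)
    return (label, [chain_free(child, chain_productions) for child in children])


# Parse of "id + id" under E -> E + T | T, T -> F, F -> id,
# with E -> T and T -> F selected as chain productions.
full_parse = ("E", [("E", [("T", [("F", ["id"])])]),
                    "+",
                    ("T", [("F", ["id"])])])
chains = {("E", "T"), ("T", "F")}
sparse_parse = chain_free(full_parse, chains)
```

In this example the production F -> id also has a single symbol on its right, but it survives because it was not selected as a chain production, matching the abstract's description of chain productions as freely selected.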