Graph Interpolation Grammars: a Rule-based Approach to the Incremental Parsing of Natural Languages
Graph Interpolation Grammars are a declarative formalism with an operational
semantics. Their goal is to emulate salient features of the human parser, and
notably incrementality. The parsing process defined by GIGs incrementally
builds a syntactic representation of a sentence as each successive lexeme is
read. A GIG rule specifies a set of parse configurations that trigger its
application and an operation to perform on a matching configuration. Rules are
partly context-sensitive; furthermore, they are reversible, meaning that their
operations can be undone, which allows the parsing process to be
nondeterministic. These two factors confer enough expressive power to the
formalism for parsing natural languages.
Comment: 41 pages, Postscript onl
On the Relation between Context-Free Grammars and Parsing Expression Grammars
Context-Free Grammars (CFGs) and Parsing Expression Grammars (PEGs) have
several similarities and a few differences in both their syntax and semantics,
but they are usually presented through formalisms that hinder a proper
comparison. In this paper we present a new formalism for CFGs that highlights
the similarities and differences between them. The new formalism borrows from
PEGs the use of parsing expressions and the recognition-based semantics. We
show how one way of removing non-determinism from this formalism yields a
formalism with the semantics of PEGs. We also prove, based on these new
formalisms, that LL(1) grammars define the same language whether interpreted as
CFGs or as PEGs, and show that strong-LL(k), right-linear, and LL-regular
grammars have simple language-preserving translations from CFGs to PEGs.
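
The contrast the abstract draws between CFG and PEG semantics can be illustrated with a minimal sketch. It is not the formalism from the paper: the tuple encoding of parsing expressions, the function names, and the convention that a string is accepted only when the whole input is consumed are all assumptions made for illustration. Choice is nondeterministic in the CFG-like reading and ordered (first success wins) in the PEG-like reading.

```python
# Parsing expressions as nested tuples (hypothetical encoding, not from the paper):
#   ("lit", s)        match the literal string s
#   ("seq", e1, e2)   match e1, then e2
#   ("alt", e1, e2)   choice between e1 and e2

def cfg_ends(expr, text, pos):
    """CFG-like semantics: the set of all positions where a match can end."""
    kind = expr[0]
    if kind == "lit":
        s = expr[1]
        return {pos + len(s)} if text.startswith(s, pos) else set()
    if kind == "seq":
        return {end
                for mid in cfg_ends(expr[1], text, pos)
                for end in cfg_ends(expr[2], text, mid)}
    if kind == "alt":
        # Unordered choice: keep the results of both alternatives.
        return cfg_ends(expr[1], text, pos) | cfg_ends(expr[2], text, pos)
    raise ValueError(f"unknown expression kind: {kind!r}")

def peg_end(expr, text, pos):
    """PEG-like semantics: a single end position, or None on failure."""
    kind = expr[0]
    if kind == "lit":
        s = expr[1]
        return pos + len(s) if text.startswith(s, pos) else None
    if kind == "seq":
        mid = peg_end(expr[1], text, pos)
        return None if mid is None else peg_end(expr[2], text, mid)
    if kind == "alt":
        # Ordered choice: the second alternative is tried only if the first fails.
        first = peg_end(expr[1], text, pos)
        return first if first is not None else peg_end(expr[2], text, pos)
    raise ValueError(f"unknown expression kind: {kind!r}")

def accepts_cfg(expr, text):
    return len(text) in cfg_ends(expr, text, 0)

def accepts_peg(expr, text):
    return peg_end(expr, text, 0) == len(text)

# Alternatives that start with different symbols (LL(1)-style choice).
ll1_choice = ("alt", ("lit", "a"), ("lit", "b"))
# Alternatives sharing the prefix "a": not LL(1).
shared_prefix = ("alt", ("lit", "a"), ("lit", "ab"))
```

With `ll1_choice` the two readings agree on every input. With `shared_prefix` they differ on `"ab"`: the CFG-like reading accepts it through the second alternative, while the PEG-like reading commits to the first alternative, stops after `"a"`, and so fails to consume the whole input.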
On Left and Right Dislocation: A Dynamic Perspective
The paper argues that by modelling the incremental and left-right process of interpretation as a process of growth of logical form (representing logical forms as trees), an integrated typology of left-dislocation and right-dislocation phenomena becomes available, bringing out not merely the similarities between these types of phenomena, but also their asymmetry. The data covered include hanging topic left dislocation, clitic left dislocation, left dislocation, pronoun doubling, expletives, extraposition, and right node raising, with each set of data analysed in terms of general principles of tree growth. In the light of the success in providing a characterisation of the asymmetry between left and right periphery phenomena, a result not achieved in more well-known formalisms, the paper concludes that grammar formalisms should model the dynamics of language processing in time.
Precedence Automata and Languages
Operator precedence grammars define a classical Boolean and deterministic
context-free family (called Floyd languages or FLs). FLs have been shown to
strictly include the well-known visibly pushdown languages, and enjoy the same
nice closure properties. We introduce here Floyd automata, an equivalent
operational formalism for defining FLs. This also makes it possible to extend
the class to infinite strings, for instance to perform model checking.
Comment: Extended version of the paper which appeared in Proceedings of CSR
2011, Lecture Notes in Computer Science, vol. 6651, pp. 291-304, 2011.
Theorem 1 has been corrected and a complete proof is given in Appendi
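
For readers unfamiliar with operator precedence parsing, the idea behind the grammars mentioned above can be sketched as a small shift-reduce evaluator. This is a textbook-style illustration, not the Floyd automaton defined in the paper. The precedence table `PREC`, the end marker `#`, and the function name `evaluate` are assumptions chosen for the example, which handles only non-negative integers, `+`, `*`, and parentheses. It does not check that operands and operators alternate, and characters outside that alphabet are silently skipped by the tokenizer.

```python
import re

# Precedence relations between the operator on top of the stack (outer key)
# and the incoming token (inner key). Hypothetical table for + and *:
#   "<"  shift the incoming token
#   ">"  reduce using the operator on top of the stack
#   "="  matching pair: "(" with ")", or "#" with "#" (accept)
PREC = {
    "+": {"+": ">", "*": "<", "(": "<", ")": ">", "#": ">"},
    "*": {"+": ">", "*": ">", "(": "<", ")": ">", "#": ">"},
    "(": {"+": "<", "*": "<", "(": "<", ")": "="},
    "#": {"+": "<", "*": "<", "(": "<", "#": "="},
}

def evaluate(text):
    """Evaluate an arithmetic expression by operator-precedence parsing."""
    tokens = re.findall(r"\d+|[+*()]", text) + ["#"]
    ops, vals = ["#"], []  # operator stack and operand stack
    i = 0
    while True:
        tok = tokens[i]
        if tok.isdigit():
            vals.append(int(tok))
            i += 1
            continue
        rel = PREC[ops[-1]].get(tok)
        if rel == "<":  # shift
            ops.append(tok)
            i += 1
        elif rel == "=":
            if tok == "#":  # end marker meets end marker: accept
                if len(vals) != 1:
                    raise SyntaxError("malformed expression")
                return vals[0]
            ops.pop()  # discard the matched "("
            i += 1
        elif rel == ">":  # reduce the operator on top of the stack
            op = ops.pop()
            if len(vals) < 2:
                raise SyntaxError("missing operand")
            right = vals.pop()
            left = vals.pop()
            vals.append(left + right if op == "+" else left * right)
        else:
            raise SyntaxError(
                f"no precedence relation between {ops[-1]!r} and {tok!r}")
```

Every decision is made by comparing just two terminals, the one on top of the stack and the incoming one. That locality is what makes the approach deterministic.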
CHR Grammars
A grammar formalism based upon CHR is proposed analogously to the way
Definite Clause Grammars are defined and implemented on top of Prolog. These
grammars execute as robust bottom-up parsers with an inherent treatment of
ambiguity and a high flexibility to model various linguistic phenomena. The
formalism extends previous logic programming based grammars with a form of
context-sensitive rules and the possibility to include extra-grammatical
hypotheses in both head and body of grammar rules. Among the applications are
straightforward implementations of Assumption Grammars and abduction under
integrity constraints for language analysis. CHR grammars appear as a powerful
tool for specification and implementation of language processors and may be
proposed as a new standard for bottom-up grammars in logic programming.
To appear in Theory and Practice of Logic Programming (TPLP), 2005.
Comment: 36 pp.
Generalizing input-driven languages: theoretical and practical benefits
Regular languages (RL) are the simplest family in Chomsky's hierarchy. Thanks
to their simplicity they enjoy various nice algebraic and logic properties that
have been successfully exploited in many application fields. Practically all of
their related problems are decidable, so that they support automatic
verification algorithms. Also, they can be recognized in real-time.
Context-free languages (CFL) are another major family well-suited to
formalize programming, natural, and many other classes of languages; their
increased generative power w.r.t. RL, however, causes the loss of several
closure properties and of the decidability of important problems; furthermore
they need complex parsing algorithms. Thus, various subclasses thereof have
been defined with different goals, spanning from efficient, deterministic
parsing to closure properties, logic characterization and automatic
verification techniques.
Among CFL subclasses, so-called structured ones, i.e., those where the
typical tree-structure is visible in the sentences, exhibit many of the
algebraic and logic properties of RL, whereas deterministic CFL have been
thoroughly exploited in compiler construction and other application fields.
After surveying and comparing the main properties of those various language
families, we go back to operator precedence languages (OPL), an old family
through which R. Floyd pioneered deterministic parsing, and we show that they
offer unexpected properties in two fields so far investigated in totally
independent ways: they enable parsing parallelization in a more effective way
than traditional sequential parsers, and exhibit the same algebraic and logic
properties so far obtained only for less expressive language families.