Generating Semantic Graph Corpora with Graph Expansion Grammar
We introduce Lovelace, a tool for creating corpora of semantic graphs. The
system uses graph expansion grammar as a representational language, thus
allowing users to craft a grammar that describes a corpus with desired
properties. Given such a grammar as input, the system generates a set of
output graphs that are well-formed according to the grammar, i.e., a graph
bank. The generation process can be controlled via a number of configurable
parameters that allow the user to, for example, specify a range of desired
output graph sizes. Central use cases are creating synthetic data to augment
existing corpora and serving as a pedagogical tool for teaching formal
language theory.
Comment: In Proceedings NCMA 2023, arXiv:2309.0733
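To make the idea of grammar-driven generation with a size range concrete, here is a minimal sketch. It is not Lovelace's interface and it does not implement graph expansion grammar: it uses an ordinary toy grammar over labelled trees (a special case of graphs) as a stand-in, and every name in it (`GRAMMAR`, `expand`, `size`, `generate`, the node labels) is an assumption made for illustration.

```python
# Hypothetical toy grammar (not graph expansion grammar): each nonterminal
# maps to alternatives of the form (node_label, [child nonterminals]).
GRAMMAR = {
    "S": [("want", ["E", "S"]), ("sleep", ["E"])],
    "E": [("boy", []), ("girl", [])],
}


def size(tree):
    """Number of nodes in a tree given as (label, children)."""
    return 1 + sum(size(child) for child in tree[1])


def expand(symbol, budget):
    """Yield every tree rooted in `symbol` that uses at most `budget` nodes."""
    if budget < 1:
        return
    for label, children in GRAMMAR[symbol]:
        # Expand children left to right, threading the remaining node budget.
        partial = [([], budget - 1)]
        for child in children:
            partial = [
                (done + [sub], left - size(sub))
                for done, left in partial
                for sub in expand(child, left)
            ]
        for done, _ in partial:
            yield (label, tuple(done))


def generate(start, min_size, max_size):
    """Collect all well-formed outputs whose size lies in the requested range."""
    return [t for t in expand(start, max_size) if min_size <= size(t) <= max_size]


bank = generate("S", 2, 4)
print(len(bank))
```

The node budget plays the role of the configurable size parameter mentioned in the abstract; a real graph grammar would also need to handle node sharing and reentrancies, which trees cannot express.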
Comparing and evaluating extended Lambek calculi
Lambek's Syntactic Calculus, commonly referred to as the Lambek calculus, was
innovative in many ways, notably as a precursor of linear logic. But it also
showed that we could treat our grammatical framework as a logic (as opposed to
a logical theory). However, though it was successful in giving at least a basic
treatment of many linguistic phenomena, it was also clear that a slightly more
expressive logical calculus was needed for many other cases. Therefore, many
extensions and variants of the Lambek calculus have been proposed, since the
eighties and up until the present day. As a result, there is now a large class
of calculi, each with its own empirical successes and theoretical results, but
also each with its own logical primitives. This raises the question: how do we
compare and evaluate these different logical formalisms? To answer this
question, I present two unifying frameworks for these extended Lambek calculi.
Both are proof net calculi with graph contraction criteria. The first calculus
is a very general system: you specify the structure of your sequents and it
gives you the connectives and contractions which correspond to it. The calculus
can be extended with structural rules, which translate directly into graph
rewrite rules. The second calculus is first-order (multiplicative
intuitionistic) linear logic, which turns out to have several other,
independently proposed extensions of the Lambek calculus as fragments. I will
illustrate the use of each calculus in building bridges between analyses
proposed in different frameworks, in highlighting differences and in helping to
identify problems.
Comment: Empirical advances in categorial grammars, Aug 2015, Barcelona,
Spain. 201
Multiple Context-Free Tree Grammars: Lexicalization and Characterization
Multiple (simple) context-free tree grammars are investigated, where "simple"
means "linear and nondeleting". Every multiple context-free tree grammar that
is finitely ambiguous can be lexicalized; i.e., it can be transformed into an
equivalent one (generating the same tree language) in which each rule of the
grammar contains a lexical symbol. Due to this transformation, the rank of the
nonterminals increases at most by 1, and the multiplicity (or fan-out) of the
grammar increases at most by the maximal rank of the lexical symbols; in
particular, the multiplicity does not increase when all lexical symbols have
rank 0. Multiple context-free tree grammars have the same tree generating power
as multi-component tree adjoining grammars (provided the latter can use a
root-marker). Moreover, every multi-component tree adjoining grammar that is
finitely ambiguous can be lexicalized. Multiple context-free tree grammars have
the same string generating power as multiple context-free (string) grammars and
polynomial time parsing algorithms. A tree language can be generated by a
multiple context-free tree grammar if and only if it is the image of a regular
tree language under a deterministic finite-copying macro tree transducer.
Multiple context-free tree grammars can be used as a synchronous translation
device.
Comment: 78 pages, 13 figure
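The notion of multiplicity (fan-out) can be illustrated in the string setting that the abstract compares against. The sketch below is a standard textbook-style example of a multiple context-free (string) grammar with fan-out 2, not a construction taken from the paper: the nonterminal A derives a pair of strings, and the start rule concatenates the two components. The function names `derive` and `sentence` are assumptions made for illustration.

```python
def derive(n):
    """Pair of strings derived by nonterminal A after n recursive steps."""
    left, right = "", ""  # base rule: A -> (empty, empty)
    for _ in range(n):
        # recursive rule: A(a x b, c y d) <- A(x, y)
        left, right = "a" + left + "b", "c" + right + "d"
    return left, right


def sentence(n):
    """Start rule: S(x y) <- A(x, y), concatenating both components."""
    x, y = derive(n)
    return x + y


print(sentence(2))  # prints "aabbccdd"
```

Because both components grow in lockstep, the grammar generates the language a^n b^n c^n d^n, which no context-free grammar (fan-out 1) can generate. In the tree setting of the paper, the components of such tuples are trees rather than strings.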
On the Complexity of Free Word Orders
We propose some extensions of mildly context-sensitive formalisms whose aim is to model free word orders in natural languages. We give a detailed analysis of the complexity of the formalisms we propose.
Minimalist Grammars in the Light of Logic
In this paper, we aim at understanding the derivations of minimalist grammars without the shortest move constraint. This leads us to study the relationship of those derivations with logic. In particular, we show that the membership problem of minimalist grammars without the shortest move constraint is as difficult as provability in Multiplicative Exponential Linear Logic. As a byproduct, this result gives us a new representation of those derivations with linear λ-terms. We show how to interpret those terms in a homomorphic way so as to recover the sentence they analyse. As the homomorphisms we describe are rather involved, we turn to a proof-net representation and explain how Monadic Second Order Logic and related techniques allow us both to define those proof-nets and to retrieve the sentence they analyse.
Connectionist learning of regular graph grammars
This paper presents a new connectionist approach to grammatical inference. Using only positive examples, the algorithm learns regular graph grammars, representing two-dimensional iterative structures drawn on a discrete Cartesian grid. This work is intended as a case study in connectionist symbol processing and geometric concept formation. A grammar is represented by a self-configuring connectionist network that is analogous to a transition diagram except that it can deal with graph grammars as easily as string grammars. Learning starts with a trivial grammar, expressing no grammatical knowledge, which is then refined, by a process of successive node splitting and merging, into a grammar adequate to describe the population of input patterns. In conclusion, I argue that the connectionist style of computation is, in some ways, better suited than sequential computation to the task of representing and manipulating recursive structures.