Generating Semantic Graph Corpora with Graph Expansion Grammar
We introduce Lovelace, a tool for creating corpora of semantic graphs. The
system uses graph expansion grammar as a representational language, thus
allowing users to craft a grammar that describes a corpus with desired
properties. When given such grammar as input, the system generates a set of
output graphs that are well-formed according to the grammar, i.e., a graph
bank. The generation process can be controlled via a number of configurable
parameters that allow the user to, for example, specify a range of desired
output graph sizes. Central use cases are the creation of synthetic data to
augment existing corpora, and as a pedagogical tool for teaching formal
language theory.

Comment: In Proceedings NCMA 2023, arXiv:2309.0733
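Lovelace's internal rule format is not shown in the abstract, but the core idea of generating a size-controlled graph bank from expansion rules can be sketched with a toy, hypothetical rule set (the rule encoding and names below are illustrative, not Lovelace's actual interface):

```python
import random

# Toy expansion rules (hypothetical, not Lovelace's actual format):
# a nonterminal label maps to choices of (terminal_label, child_nonterminals, edge_labels).
RULES = {
    "S": [("want", ["E"], ["ARG0"]), ("sleep", [], [])],
    "E": [("dog", [], []), ("chase", ["E"], ["ARG1"])],
}

def generate(max_nodes=10, rng=random):
    """Expand nonterminals until none remain; return (node_labels, edges), or
    None if the graph exceeds the requested size bound (rejection)."""
    nodes = {0: "S"}            # node id -> label; uppercase labels are nonterminals
    edges = []                  # (src, edge_label, dst)
    agenda = [0]
    while agenda:
        if len(nodes) > max_nodes:
            return None         # too large for the configured size range: reject
        n = agenda.pop()
        term, children, elabels = rng.choice(RULES[nodes[n]])
        nodes[n] = term         # rewrite the nonterminal node to a terminal label
        for child_nt, el in zip(children, elabels):
            c = len(nodes)
            nodes[c] = child_nt
            edges.append((n, el, c))
            agenda.append(c)
    return nodes, edges

def graph_bank(k, min_nodes=2, max_nodes=6, seed=0):
    """Collect k well-formed graphs whose sizes fall in [min_nodes, max_nodes]."""
    rng = random.Random(seed)
    bank = []
    while len(bank) < k:
        g = generate(max_nodes, rng)
        if g is not None and len(g[0]) >= min_nodes:
            bank.append(g)
    return bank
```

The size bounds play the role of the configurable generation parameters mentioned above: out-of-range derivations are simply rejected and regenerated.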
Methods for taking semantic graphs apart and putting them back together again
The thesis develops a competitive compositional semantic parser for Abstract Meaning Representation (AMR). The approach combines a neural model with mechanisms that echo ideas from classical compositional semantic construction in a new, simple dependency structure. The thesis first tackles the task of generating the structured latent training data necessary for a compositional approach by developing the linguistically motivated AM algebra. Encoding terms over the AM algebra as dependency trees yields a simple semantic parsing model in which neural tagging and dependency models predict interpretable, meaningful operations that construct the AMR. With this design, the model achieves strong evaluation results and clear improvements over a less structured baseline.
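The AM algebra itself is not spelled out in the abstract; as a rough, simplified sketch, its "apply" operation plugs an argument graph's root into an open argument slot ("source") of a head graph. The fragment encoding and function names below are illustrative assumptions, and the real algebra additionally handles modification, source renaming, and type constraints:

```python
# A toy sketch of an AM-algebra-style apply operation (heavily simplified).
# A fragment: node labels, labeled edges, a root, and open argument slots
# ("sources") mapping a source name to a node id.

def make_fragment(nodes, edges, root, sources):
    return {"nodes": dict(nodes), "edges": list(edges),
            "root": root, "sources": dict(sources)}

def apply_op(head, source, arg):
    """APP_source(head, arg): plug arg's root into head's open `source` slot."""
    offset = max(head["nodes"]) + 1
    slot = head["sources"][source]
    # Relocate arg's nodes into the head's id space; arg's root merges with the slot.
    relabel = {n: (slot if n == arg["root"] else n + offset) for n in arg["nodes"]}
    nodes = dict(head["nodes"])
    for n, lab in arg["nodes"].items():
        nodes[relabel[n]] = lab          # arg's root label overwrites the placeholder
    edges = head["edges"] + [(relabel[a], l, relabel[b]) for a, l, b in arg["edges"]]
    sources = {s: v for s, v in head["sources"].items() if s != source}
    return {"nodes": nodes, "edges": edges, "root": head["root"], "sources": sources}

# "sleep" with an open subject source s; "cat" supplied as its argument
sleep = make_fragment({0: "sleep", 1: "<s>"}, [(0, "ARG0", 1)], 0, {"s": 1})
cat = make_fragment({0: "cat"}, [], 0, {})
amr = apply_op(sleep, "s", cat)
```

A term built from such operations, read off a dependency tree over the sentence, is exactly what the neural tagging and dependency models are asked to predict.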
Learning words and syntactic cues in highly ambiguous contexts
The cross-situational word learning paradigm argues that word meanings can be approximated
by word-object associations, computed from co-occurrence statistics between
words and entities in the world. Lexicon acquisition involves simultaneously
guessing (1) which objects are being talked about (the "meaning") and (2) which words
relate to those objects. However, most modeling work focuses on acquiring meanings
for isolated words, largely neglecting relationships between words or physical entities,
which can play an important role in learning.
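The co-occurrence idea behind cross-situational learning can be illustrated with a minimal sketch (this is an illustrative baseline, not the thesis model; the situations and labels are made up):

```python
from collections import defaultdict

# Each situation pairs an utterance (a list of words) with the set of candidate
# referent objects visible in the scene.
situations = [
    (["the", "dog", "runs"], {"DOG"}),
    (["the", "cat", "sleeps"], {"CAT"}),
    (["a", "dog", "barks"], {"DOG"}),
    (["the", "cat", "runs"], {"CAT"}),
]

cooc = defaultdict(lambda: defaultdict(float))
word_count = defaultdict(float)
for words, objects in situations:
    for w in words:
        word_count[w] += 1
        for o in objects:
            cooc[w][o] += 1 / len(objects)   # spread credit over candidate referents

def best_referent(word):
    """Referent maximizing P(object | word) under the co-occurrence estimate."""
    scores = {o: c / word_count[word] for o, c in cooc[word].items()}
    return max(scores, key=scores.get)
```

Note that such a learner treats every word in isolation: it will happily assign a referent to "the" as well, which is precisely the kind of failure that motivates modeling relations between words.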
Semantic parsing, on the other hand, aims to learn a mapping between entire utterances
and compositional meaning representations where such relations are central.
The focus is the mapping between meaning and words, while utterance meanings are
treated as observed quantities.
Here, we extend the joint inference problem of word learning to account for compositional
meanings by incorporating a semantic parsing model for relating utterances
to non-linguistic context. Integrating semantic parsing and word learning permits us to
explore the impact of word-word and concept-concept relations.
The result is a joint-inference problem inherited from the word learning setting
where we must simultaneously learn utterance-level and individual word meanings,
only now we also contend with the many possible relationships between concepts in
the meaning and words in the sentence. To simplify design, we factorize the model into
separate modules, one for each of the world, the meaning, and the words, and merge
them into a single synchronous grammar for joint inference.
There are three main contributions. First, we introduce a novel word learning
model and accompanying semantic parser. Second, we produce a corpus which allows
us to demonstrate the importance of structure in word learning. Finally, we also
present a number of technical innovations required for implementing such a model.
Exact Recursive Probabilistic Programming
Recursive calls over recursive data are widely useful for generating
probability distributions, and probabilistic programming allows computations
over these distributions to be expressed in a modular and intuitive way. Exact
inference is also useful, but unfortunately, existing probabilistic programming
languages do not perform exact inference on recursive calls over recursive
data, forcing programmers to code many applications manually. We introduce a
probabilistic language in which a wide variety of recursion can be expressed
naturally, and inference carried out exactly. For instance, probabilistic
pushdown automata and their generalizations are easy to express, and
polynomial-time parsing algorithms for them are derived automatically. We
eliminate recursive data types using program transformations related to
defunctionalization and refunctionalization. These transformations are assured
correct by a linear type system, and a successful choice of transformations, if
there is one, is guaranteed to be found by a greedy algorithm.
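The paper's language is not reproduced here, but the flavor of exact inference over unbounded recursion can be illustrated with a classic example: the termination probability of a recursive process that halts with probability p and otherwise calls itself twice is the least fixed point of q = p + (1 − p)·q², which Kleene iteration from 0 computes (the process name and numbers below are illustrative):

```python
# Termination probability of:  X -> halt (prob p) | run X twice (prob 1 - p).
# It satisfies q = p + (1 - p) * q**2; iterating from q = 0 converges to the
# least fixed point, which is min(1, p / (1 - p)).

def termination_prob(p, iters=10_000):
    q = 0.0
    for _ in range(iters):
        q = p + (1 - p) * q * q
    return q

# For p = 0.6 the least fixed point is 1.0 (the process almost surely halts);
# for p = 0.3 it is p / (1 - p) = 3/7: the process may run forever.
```

Systems of such polynomial fixed-point equations are exactly what probabilistic pushdown automata give rise to, which is why languages in this class admit exact, automatically derived inference.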