211 research outputs found
Toward an isomorphic diagram of the Backus-Naur form.
Computer scientists studying formal languages have made use of a variety of representations both to reason about their ideas and to communicate them to others. Symbolic representations have proved useful for rigorously defining the theoretical objects of these subjects; however, research shows that diagrammatic representations are just as fundamental. Previous research in this domain has typically been interested in studying the semantics that a particular representation is intended to capture. By contrast, this treatise considers the importance of the format of the representations themselves, and how format influences a person's ability to uncover characteristics relevant to the problem domain. More specifically, this thesis investigates the established formalisms that have been devised to describe formal languages, and introduces a novel concept, an augmented syntax graph. This graph, an isomorphism of the Backus-Naur form, is shown to have application in visualizing properties that are pertinent to some parsing algorithms.
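As a rough illustration of how a graph view of a BNF grammar can expose a property relevant to parsing, the sketch below builds a plain symbol graph and checks for left recursion, which matters for top-down parsers. This is not the thesis's augmented syntax graph; the toy grammar, the function names, and the assumption that there are no empty productions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
# and each alternative is a list of symbols (BNF: expr ::= expr "+" term | term).
GRAMMAR = {
    "expr": [["expr", "+", "term"], ["term"]],
    "term": [["(", "expr", ")"], ["num"]],
}

def symbol_graph(grammar):
    """Directed edges from each nonterminal to every symbol on its right-hand sides."""
    edges = defaultdict(set)
    for head, alternatives in grammar.items():
        for alternative in alternatives:
            edges[head].update(alternative)
    return dict(edges)

def left_recursive(grammar):
    """Nonterminals that can reach themselves through leftmost symbols only.

    Assumes no empty productions, so only the first symbol of each
    alternative needs to be followed.
    """
    first_edges = {
        head: {alt[0] for alt in alts if alt and alt[0] in grammar}
        for head, alts in grammar.items()
    }
    result = set()
    for start in grammar:
        stack, seen = list(first_edges[start]), set()
        while stack:
            node = stack.pop()
            if node == start:
                result.add(start)
                break
            if node not in seen:
                seen.add(node)
                stack.extend(first_edges[node])
    return result
```

Here left_recursive(GRAMMAR) reports only expr, since expr appears as the leftmost symbol of one of its own alternatives.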
Differentiable Tree Operations Promote Compositional Generalization
In the context of structure-to-structure transformation tasks, learning
sequences of discrete symbolic operations poses significant challenges due to
their non-differentiability. To facilitate the learning of these symbolic
sequences, we introduce a differentiable tree interpreter that compiles
high-level symbolic tree operations into subsymbolic matrix operations on
tensors. We present a novel Differentiable Tree Machine (DTM) architecture that
integrates our interpreter with an external memory and an agent that learns to
sequentially select tree operations to execute the target transformation in an
end-to-end manner. With respect to out-of-distribution compositional
generalization on synthetic semantic parsing and language generation tasks, DTM
achieves 100% while existing baselines such as Transformer, Tree Transformer,
LSTM, and Tree2Tree LSTM achieve less than 30%. DTM remains highly
interpretable in addition to its perfect performance. Comment: ICML 2023. Code available at https://github.com/psoulos/dt
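The idea of turning symbolic tree operations into matrix operations can be sketched in a few lines. The example below is not the DTM implementation: it assumes binary trees of bounded depth stored as one row per heap position (root at 1, children of i at 2i and 2i+1), and all names (MAX_DEPTH, selector, blend) are hypothetical. Because extracting the left or right subtree is then a fixed linear map, a soft weighting over the two operations is itself well defined, which is what makes gradient-based selection of operations conceivable.

```python
MAX_DEPTH = 3            # assumption: trees with at most 3 levels
N = 2 ** MAX_DEPTH       # heap positions 1..N-1; index 0 is unused

def selector(right):
    """Matrix M with M[new][old] = 1 that moves one child subtree up to the root."""
    m = [[0.0] * N for _ in range(N)]
    for depth in range(MAX_DEPTH - 1):
        for offset in range(2 ** depth):
            new = 2 ** depth + offset
            old = 2 ** (depth + 1) + offset + (2 ** depth if right else 0)
            m[new][old] = 1.0
    return m

LEFT, RIGHT = selector(right=False), selector(right=True)

def matmul(a, b):
    """Plain-Python matrix product, to keep the sketch dependency-free."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def blend(weights, tree):
    """Soft mixture of the two operations; weights are (w_left, w_right)."""
    w_left, w_right = weights
    left_tree, right_tree = matmul(LEFT, tree), matmul(RIGHT, tree)
    return [
        [w_left * a + w_right * b for a, b in zip(row_l, row_r)]
        for row_l, row_r in zip(left_tree, right_tree)
    ]

# Toy tree with one-hot symbol vectors over (A, B, C): root A, left B, right C.
tree = [[0.0, 0.0, 0.0] for _ in range(N)]
tree[1], tree[2], tree[3] = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
```

With hard weights (1, 0) the result has B at the root; with (0.5, 0.5) the root row is an even mixture of B and C.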
Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion
This paper investigates the ability of transformer-based models to learn
structural recursion from examples. Recursion is a universal concept in both
natural and formal languages. Structural recursion is central to the
programming language and formal mathematics tasks where symbolic tools
currently excel beyond neural models, such as inferring semantic relations
between datatypes and emulating program behavior. We introduce a general
framework that nicely connects the abstract concepts of structural recursion in
the programming language domain to concrete sequence modeling problems and
learned models' behavior. The framework includes a representation that captures
the general syntax of structural recursion, coupled with two different
frameworks for understanding their semantics: one that is more
natural from a programming languages perspective and one that helps bridge that
perspective with a mechanistic understanding of the underlying transformer
architecture.
With our framework as a powerful conceptual tool, we identify different
issues under various set-ups. The models trained to emulate recursive
computations do not fully capture the recursion but instead fit short-cut
algorithms, and thus cannot solve certain edge cases that are under-represented
in the training distribution. In addition, it is difficult for state-of-the-art
large language models (LLMs) to mine recursive rules from in-context
demonstrations. Meanwhile, these LLMs fail in interesting ways when emulating
reduction (step-wise computation) of the recursive function. Comment: arXiv admin note: text overlap with arXiv:2305.1469
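To make the notion concrete, the sketch below defines a function by structural recursion on a binary tree, alongside a shortcut that agrees with it only on a restricted family of inputs. It is an illustration under assumed names (Node, size, size_shortcut), not the paper's framework or one of its tasks; the shortcut is a hypothetical example of the kind of approximation that fails on under-represented cases.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    """A binary tree node; None stands for the empty tree."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def size(tree: Optional[Node]) -> int:
    """Structural recursion: one case per constructor, recursing only on subtrees."""
    if tree is None:
        return 0
    return 1 + size(tree.left) + size(tree.right)

def size_shortcut(tree: Optional[Node]) -> int:
    """Hypothetical shortcut: counts nodes along the left spine only."""
    count = 0
    while tree is not None:
        count += 1
        tree = tree.left
    return count

left_chain = Node(left=Node(left=Node()))      # every node has only a left child
with_right = Node(left=Node(), right=Node())   # contains a right child
```

On left_chain both functions return 3, so data made up mostly of such trees would not tell them apart; on with_right the shortcut returns 2 while the recursive definition returns 3.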
Generating Programming Environments with Integrated Text and Graphics for VLSI Design Systems
The constant improvements in device integration, the development of new technologies
and the emergence of new design techniques call for flexible, maintainable
and robust software tools. The generic nature of compiler-compiler systems,
with their semi-formal specifications, can help in the construction of those tools.
This thesis describes the Wright editor generator which is used in the synthesis
of language-based graphical editors (LBGEs). An LBGE is a programming
environment where the programs being manipulated denote pictures. Editing
actions can be specified through both textual and graphical interfaces. Editors
generated by the Wright system are specified using the formalism of attribute
grammars.
The major example editor in this thesis, Stick-Wright, is a design entry system
for the construction of VLSI circuits. Stick-Wright is a hierarchical symbolic
layout editor which exploits a combination of text and graphics in an interactive
environment to provide the circuit designer with a tool for experimenting with
circuit topologies. A simpler system, Pict-Wright (a picture drawing system), is
also used to illustrate the attribute grammar specification process.
This thesis aims to demonstrate the efficacy of formal specification in the
generation of software tools. The generated system Stick-Wright shows that a
text/graphic programming environment can form the basis of a powerful VLSI
design tool, especially with regard to providing the designer with immediate
graphical feedback. Further applications of the LBGE generator approach to
system design are given for a range of VLSI design activities.
Bringing machine learning and compositional semantics together
Computational semantics has long been seen as a field divided between logical and statistical approaches, but this divide is rapidly eroding, with the development of statistical models that learn compositional semantic theories from corpora and databases. This paper presents a simple discriminative learning framework for defining such models and relating them to logical theories. Within this framework, we discuss the task of learning to map utterances to logical forms (semantic parsing) and the task of learning from denotations with logical forms as latent variables. We also consider models that use distributed (e.g., vector) representations rather than logical ones, showing that these can be seen as part of the same overall framework for understanding meaning and structural complexity.
Unsupervised structure induction and multimodal grounding
Structured representations build upon symbolic abstraction (e.g., words in natural language and visual concepts in natural images), offer a principled way of encoding our perceptions about the physical world, and enable the human-like generalization of machine learning systems. The predominant paradigm for learning structured representations of the observed data has been supervised learning, but it is limited in several respects. First, supervised learning is challenging given the scarcity of labeled data. Second, conventional approaches to structured prediction have been relying on a single modality (e.g., either images or text), ignoring the learning cues that may have been specified in and can be readily obtained from other modalities of data. In this thesis, we investigate unsupervised approaches to structure induction in a multimodal setting.
Unsupervised learning is inherently difficult in general, let alone inducing complex and discrete structures from data without direct supervision. By considering the multimodal setting, we leverage the alignments between different data modalities (e.g., text, audio, and images) to facilitate the learning of structure-induction models, e.g., knowing that the individual words in "a white pigeon" always appear with the same visual object, a language parser is likely to treat them as a whole (i.e., phrase). The multimodal learning setting is practically viable because multimodal alignments are generally abundant. For example, they can be found in online posts such as news and tweets that usually contain images and associated text, and in (YouTube) videos, where audio, scripts, and scenes are synchronized and grounded in each other.
We develop structure-induction models, which are capable of exploiting bimodal image-text alignments, for two modalities: (1) for natural language, we consider unsupervised syntactic parsing with phrase-structure grammars and regularize the parser by using visual image groundings; and (2) for visual images, we induce scene graph representations by mapping arguments and predicates in the text to their visual counterparts (i.e., visual objects and relations among them) in an unsupervised manner. While useful, crossmodal alignments are not always abundantly available on the web, e.g., the alignments between non-speech audio and text. We tackle the challenge by sharing the visual modality between image-text alignment and image-audio alignment; images function as a pivot and connect audio and text. The contributions of this thesis span from model development to data collection. We demonstrated the feasibility of applying multimodal learning techniques to unsupervised structure induction and multimodal alignment collection. Our work opens up new avenues for multimodal and unsupervised structured representation learning.
Knowledge-enhanced neural grammar Induction
Natural language is usually presented as a word sequence, but the inherent structure
of language is not necessarily sequential. Automatic grammar induction for natural
language is a long-standing research topic in the field of computational linguistics and
still remains an open problem today. From the perspective of cognitive science, the
goal of a grammar induction system is to mimic children: learning a grammar that can
generalize to infinitely many utterances by only consuming finite data. With regard to
computational linguistics, an automatic grammar induction system could be beneficial
for a wide variety of natural language processing (NLP) applications: providing syntactic analysis explicitly for a pipeline or a joint learning system; injecting structural
bias implicitly into an end-to-end model.
Typically, approaches to grammar induction only have access to raw text. Due to
the huge search space of trees as well as data sparsity and ambiguity issues, grammar
induction is a difficult problem. Thanks to the rapid development of neural networks
and their capacity of over-parameterization and continuous representation learning,
neural models have been recently introduced to grammar induction. Given its large
capacity, introducing external knowledge into a neural system is an effective approach
in practice, especially for an unsupervised problem. This thesis explores how to incorporate external knowledge into neural grammar induction models. We develop several approaches to combine different types of knowledge with neural grammar induction models on two grammar formalisms: constituency and dependency grammar.
We first investigate how to inject symbolic knowledge, universal linguistic rules,
into unsupervised dependency parsing. In contrast to previous state-of-the-art models that utilize time-consuming global inference, we propose a neural transition-based
parser using variational inference. Our parser is able to employ rich features and supports inference in linear time for both training and testing. The core component in our parser is posterior regularization, where the posterior distribution of the dependency trees is constrained by the universal linguistic rules. The resulting parser outperforms previous unsupervised transition-based dependency parsers and achieves performance comparable to global inference-based models. Our parser also substantially increases parsing speed over global inference-based models.
Recently, tree structures have been considered as latent variables that are learned
through downstream NLP tasks, such as language modeling and natural language inference. More specifically, auxiliary syntax-aware components are embedded into the
neural networks and are trained end-to-end on the downstream tasks. However, such latent tree models either struggle to produce linguistically plausible tree structures, or require an external biased parser to obtain good parsing performance. In the second part of this thesis, we focus on constituency structure and propose to use imitation learning to couple two heterogeneous latent tree models: we transfer the knowledge learned from a continuous latent tree model trained using language modeling to a discrete one, and further fine-tune the discrete model using a natural language inference objective.
Through this two-stage training scheme, the discrete latent tree model achieves state-of-the-art unsupervised parsing performance.
The transformer is a newly proposed neural model for NLP. Transformer-based
pre-trained language models (PLMs) like BERT have achieved remarkable success on
various NLP tasks by training on an enormous corpus using word prediction tasks. Recent studies show that PLMs can learn considerable syntactical knowledge in a syntax-agnostic manner. In the third part of this thesis, we leverage PLMs as a source of
external knowledge. We propose a parameter-free approach to select syntax-sensitive
self-attention heads from PLMs and perform chart-based unsupervised constituency
parsing. In contrast to previous approaches, our head-selection approach only relies
on raw text without any annotated development data. Experimental results on both
English and eight other languages show that our approach achieves competitive performance.
New resources and ideas for semantic parser induction
In this thesis, we investigate the general topic of computational natural language understanding (NLU), which has as its goal the development of algorithms and other computational methods that support reasoning about natural language by the computer. Under the classical approach, NLU models work similarly to computer compilers (Aho et al., 1986), and include as a central component a semantic parser that translates natural language input (i.e., the compiler's high-level language) to lower-level formal languages that facilitate program execution and exact reasoning. Given the difficulty of building natural language compilers by hand, recent work has centered around semantic parser induction, or on using machine learning to learn semantic parsers and semantic representations from parallel data consisting of example text-meaning pairs (Mooney, 2007a).
One inherent difficulty in this data-driven approach is finding the parallel data needed to train the target semantic parsing models, given that such data does not occur naturally "in the wild" (Halevy et al., 2009). Even when data is available, the amount of domain- and language-specific data and the nature of the available annotations might be insufficient for robust machine learning and capturing the full range of NLU phenomena. Given these underlying resource issues, the semantic parsing field is in constant need of new resources and datasets, as well as novel learning techniques and task evaluations that make models more robust and adaptable to the many applications that require reliable semantic parsing.
To address the main resource problem involving finding parallel data, we investigate the idea of using source code libraries, or collections of code and text documentation, as a parallel corpus for semantic parser development and introduce 45 new datasets in this domain and a new and challenging text-to-code translation task. As a way of addressing the lack of domain- and language-specific parallel data, we then use these and other benchmark datasets to investigate training semantic parsers on multiple datasets, which helps semantic parsers to generalize across different domains and languages and solve new tasks such as polyglot decoding and zero-shot translation (i.e., translating over and between multiple natural and formal languages and unobserved language pairs). Finally, to address the issue of insufficient annotations, we introduce a new learning framework called learning from entailment that uses entailment information (i.e., high-level inferences about whether the meaning of one sentence follows from another) as a weak learning signal to train semantic parsers to reason about the holes in their analysis and learn improved semantic representations.
Taken together, this thesis contributes a wide range of new techniques and technical solutions to help build semantic parsing models with minimal amounts of training supervision and manual engineering effort, hence avoiding the resource issues described at the outset. We also introduce a diverse set of new NLU tasks for evaluating semantic parsing models, which we believe help to extend the scope and real-world applicability of semantic parsing and computational NLU.