An Imitation Learning Approach to Unsupervised Parsing
Recently, there has been an increasing interest in unsupervised parsers that
optimize semantically oriented objectives, typically using reinforcement
learning. Unfortunately, the learned trees often do not match actual syntax
trees well. Shen et al. (2018) propose a structured attention mechanism for
language modeling (PRPN), which induces better syntactic structures but relies
on ad hoc heuristics. Also, their model lacks interpretability as it is not
grounded in parsing actions. In our work, we propose an imitation learning
approach to unsupervised parsing, where we transfer the syntactic knowledge
induced by the PRPN to a Tree-LSTM model with discrete parsing actions. Its
policy is then refined by Gumbel-Softmax training towards a semantically
oriented objective. We evaluate our approach on the All Natural Language
Inference dataset and show that it achieves a new state of the art in terms of
parsing F-score, outperforming our base models, including the PRPN.
Comment: ACL 2019
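A minimal sketch of the Gumbel-Softmax relaxation used to refine discrete parsing actions (an illustration in PyTorch, not the authors' code; the tensor shapes and the "merge one adjacent pair per step" framing are assumptions):

import torch
import torch.nn.functional as F

def gumbel_softmax_action(logits, tau=1.0):
    # Sample a relaxed, near-one-hot choice among candidate merge actions;
    # torch.nn.functional.gumbel_softmax(..., hard=True) implements the same idea.
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-9) + 1e-9)
    y_soft = F.softmax((logits + gumbel) / tau, dim=-1)
    # Straight-through estimator: discrete forward pass, soft gradients backward.
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
    return (y_hard - y_soft).detach() + y_soft

# Hypothetical step: scores over 7 adjacent word pairs; the sampled action
# stays differentiable, so a downstream (e.g., NLI) loss can refine the policy.
merge_logits = torch.randn(1, 7, requires_grad=True)
action = gumbel_softmax_action(merge_logits, tau=0.5)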
Weakly Supervised Reasoning by Neuro-Symbolic Approaches
Deep learning has greatly improved the performance of various natural
language processing (NLP) tasks. However, most deep learning models are
black-box machinery and lack explicit interpretation. In this chapter, we
introduce our recent progress on neuro-symbolic approaches to NLP, which
combine different schools of AI, namely symbolism and connectionism.
Generally, we will design a neural system with symbolic latent structures for
an NLP task, and apply reinforcement learning or its relaxation to perform
weakly supervised reasoning in the downstream task. Our framework has been
successfully applied to various tasks, including table query reasoning,
syntactic structure reasoning, information extraction reasoning, and rule
reasoning. For each application, we will introduce the background, our
approach, and experimental results.
Comment: Compendium of Neurosymbolic Artificial Intelligence, 665--692, 2023,
IOS Press
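As a concrete illustration of the framework, here is a minimal REINFORCE-style sketch for table query reasoning (a toy example, not the chapter's code; the operator set, the executor, and the 4-dimensional question encoding are all assumptions):

import torch

OPS = ["count", "max", "min"]                      # toy symbolic operators

def execute(op, column):
    # Symbolic executor: runs outside the network, hence non-differentiable.
    if op == "count":
        return float(len(column))
    return float(max(column)) if op == "max" else float(min(column))

policy = torch.nn.Linear(4, len(OPS))              # toy question encoding -> operator scores
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

def reinforce_step(question_vec, column, gold_answer):
    logits = policy(question_vec)
    dist = torch.distributions.Categorical(logits=logits)
    op_idx = dist.sample()                         # discrete latent decision
    prediction = execute(OPS[op_idx.item()], column)
    reward = 1.0 if abs(prediction - gold_answer) < 1e-6 else 0.0
    loss = -reward * dist.log_prob(op_idx)         # only the final answer supervises the policy
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return reward

# Hypothetical usage: "how many rows?" over a 3-row column, gold answer 3.
reinforce_step(torch.randn(4), [7, 2, 5], 3.0)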
Tree Transformer: Integrating Tree Structures into Self-Attention
Pre-training Transformer from large-scale raw texts and fine-tuning on the
desired task have achieved state-of-the-art results on diverse NLP tasks.
However, it is unclear what the learned attention captures. The attention
computed by attention heads often does not match human intuitions about
hierarchical structures. This paper proposes Tree Transformer, which adds an
extra constraint to attention heads of the bidirectional Transformer encoder in
order to encourage the attention heads to follow tree structures. The tree
structures can be automatically induced from raw texts by our proposed
"Constituent Attention" module, which is simply implemented by self-attention
between two adjacent words. With a training procedure identical to BERT's,
the experiments demonstrate the effectiveness of Tree Transformer in terms of
inducing tree structures, better language modeling, and further learning more
explainable attention scores.
Comment: accepted by EMNLP 2019
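A minimal sketch of how a constituent prior could gate self-attention (one reading of the idea in PyTorch, not the released implementation; the single-head shapes and the product-of-link-probabilities prior are assumptions):

import torch
import torch.nn.functional as F

def constituent_prior(adj_link_prob):
    # adj_link_prob: (n-1,) probability that word i and word i+1 belong to the
    # same constituent. Prior C[i, j] multiplies the link probabilities between
    # i and j, so it decays sharply across constituent breaks.
    n = adj_link_prob.numel() + 1
    C = torch.ones(n, n)
    for i in range(n):
        for j in range(i + 1, n):
            C[i, j] = C[j, i] = adj_link_prob[i:j].prod()
    return C

def constrained_attention(scores, adj_link_prob):
    # scores: (n, n) raw attention scores from a single head.
    attn = F.softmax(scores, dim=-1) * constituent_prior(adj_link_prob)
    return attn / attn.sum(dim=-1, keepdim=True)   # renormalize rows

# Hypothetical usage: 5 tokens with a likely constituent break after token 2.
scores = torch.randn(5, 5)
links = torch.tensor([0.9, 0.8, 0.1, 0.9])
print(constrained_attention(scores, links))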