Unsupervised Chunking with Hierarchical RNN
In Natural Language Processing (NLP), predicting linguistic structures, such
as parsing and chunking, has mostly relied on manual annotations of syntactic
structures. This paper introduces an unsupervised approach to chunking, a
syntactic task that involves grouping words in a non-hierarchical manner. We
present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to
model word-to-chunk and chunk-to-sentence compositions. Our approach involves a
two-stage training process: pretraining with an unsupervised parser and
finetuning on downstream NLP tasks. Experiments on the CoNLL-2000 dataset
reveal a notable improvement over existing unsupervised methods, enhancing
phrase F1 score by up to 6 percentage points. Further, finetuning with
downstream tasks results in an additional performance improvement.
Interestingly, we observe that the emergence of the chunking structure is
transient during the neural model's downstream-task training. This study
contributes to the advancement of unsupervised syntactic structure discovery
and opens avenues for further research in linguistic theory.
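To illustrate the architecture described above, here is a minimal sketch (not the authors' implementation) of a two-layer hierarchical RNN: a lower GRU composes the words of each chunk into a chunk vector, and an upper GRU composes chunk vectors into a sentence representation. Chunk boundaries are assumed to be given, e.g. induced by the unsupervised parser used for pretraining; all dimensions and names are placeholders.

```python
# Minimal sketch of word-to-chunk and chunk-to-sentence composition.
import torch
import torch.nn as nn

class HierarchicalRNN(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 64, hid_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)   # word -> chunk
        self.chunk_rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)  # chunk -> sentence

    def forward(self, token_ids: torch.Tensor, boundaries: list[tuple[int, int]]):
        # token_ids: (seq_len,) word ids for one sentence
        # boundaries: half-open (start, end) chunk spans over the sentence
        emb = self.embed(token_ids)                         # (seq_len, emb_dim)
        chunk_vecs = []
        for start, end in boundaries:
            span = emb[start:end].unsqueeze(0)              # (1, span_len, emb_dim)
            _, h = self.word_rnn(span)                      # final state = chunk vector
            chunk_vecs.append(h.squeeze(0))                 # (1, hid_dim)
        chunks = torch.cat(chunk_vecs, dim=0).unsqueeze(0)  # (1, n_chunks, hid_dim)
        _, sent_h = self.chunk_rnn(chunks)                  # compose chunks into sentence
        return sent_h.squeeze(0).squeeze(0)                 # (hid_dim,)

model = HierarchicalRNN(vocab_size=1000)
tokens = torch.tensor([5, 12, 7, 42, 3])
sentence_vec = model(tokens, boundaries=[(0, 2), (2, 5)])
print(sentence_vec.shape)  # torch.Size([128])
```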
Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing
The recent prevalence of pretrained language models (PLMs) has dramatically
shifted the paradigm of semantic parsing, where the mapping from natural
language utterances to structured logical forms is now formulated as a Seq2Seq
task. Despite the promising performance, previous PLM-based approaches often
suffer from hallucination problems due to their negligence of the structural
information contained in the sentence, which essentially constitutes the key
semantics of the logical forms. Furthermore, most works treat the PLM as a black
box in which the generation of the target logical form is hidden beneath the
decoder modules, greatly hindering the model's intrinsic interpretability. To
address these two issues, we propose to couple current PLMs with a hierarchical
decoder network. By taking the first-principle
structures as the semantic anchors, we propose two novel intermediate
supervision tasks, namely Semantic Anchor Extraction and Semantic Anchor
Alignment, for training the hierarchical decoders and probing the model
intermediate representations in a self-adaptive manner alongside the
fine-tuning process. We conduct extensive experiments on several semantic
parsing benchmarks and demonstrate that our approach consistently outperforms
the baselines. More importantly, by analyzing the intermediate representations
of the hierarchical decoders, our approach takes a substantial step toward the
intrinsic interpretability of PLMs in the domain of semantic parsing.
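To make the intermediate-supervision idea concrete, the sketch below (an illustration under invented module shapes, not the paper's code) trains a two-layer decoder whose lower layer is supervised to predict anchor tokens while the upper layer predicts the full logical form; `alpha` is a hypothetical loss weight.

```python
# Sketch: auxiliary anchor supervision on an intermediate decoder layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalDecoder(nn.Module):
    def __init__(self, vocab: int, dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.lower = nn.GRU(dim, dim, batch_first=True)   # probed for semantic anchors
        self.upper = nn.GRU(dim, dim, batch_first=True)   # emits the logical form
        self.anchor_head = nn.Linear(dim, vocab)
        self.form_head = nn.Linear(dim, vocab)

    def forward(self, prev_tokens):
        h_low, _ = self.lower(self.embed(prev_tokens))
        h_up, _ = self.upper(h_low)
        return self.anchor_head(h_low), self.form_head(h_up)

def training_step(model, prev_tokens, anchor_targets, form_targets, alpha=0.5):
    anchor_logits, form_logits = model(prev_tokens)
    # Cross-entropy expects (batch, classes, time) for sequence targets.
    loss_anchor = F.cross_entropy(anchor_logits.transpose(1, 2), anchor_targets)
    loss_form = F.cross_entropy(form_logits.transpose(1, 2), form_targets)
    return loss_form + alpha * loss_anchor    # combine main and intermediate losses

model = HierarchicalDecoder(vocab=500)
prev = torch.randint(0, 500, (2, 10))      # (batch, time) shifted decoder inputs
anchors = torch.randint(0, 500, (2, 10))   # anchor-token targets
forms = torch.randint(0, 500, (2, 10))     # logical-form targets
loss = training_step(model, prev, anchors, forms)
```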
Semantic Parsing in Limited Resource Conditions
This thesis explores challenges in semantic parsing, specifically focusing on
scenarios with limited data and computational resources. It offers solutions
using techniques like automatic data curation, knowledge transfer, active
learning, and continual learning.
For tasks with no parallel training data, the thesis proposes generating
synthetic training examples from structured database schemas. When there is
abundant data in a source domain but limited parallel data in a target domain,
knowledge from the source is leveraged to improve parsing in the target domain.
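A toy illustration of the schema-driven idea in the preceding paragraph (the templates and logical-form syntax here are invented for the example; the thesis's actual generation procedure is more sophisticated):

```python
# Sketch: synthesize (utterance, logical form) pairs from a database schema.
schema = {"concerts": ["venue", "date", "artist"]}

def synthesize(schema):
    pairs = []
    for table, columns in schema.items():
        for col in columns:
            utterance = f"show the {col} of all {table}"
            logical_form = f"SELECT {col} FROM {table}"
            pairs.append((utterance, logical_form))
    return pairs

print(synthesize(schema)[:2])
```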
For multilingual situations with limited data in the target languages, the
thesis introduces a method to adapt parsers using a limited human translation
budget. Active learning is applied to select source-language samples for manual
translation, maximizing parser performance in the target language. An
alternative method is also proposed that uses machine translation services,
supplemented by human-translated data, to train a more effective parser.
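The selection step of such active learning might look like the following sketch, where the source sentences on which the current parser is least confident are chosen for the human translation budget; `parser_log_prob` is a hypothetical scorer returning the log-probability of the 1-best parse.

```python
# Sketch: uncertainty-based selection under a fixed translation budget.
def select_for_translation(sentences, parser_log_prob, budget: int):
    # Lower 1-best log-probability means higher parser uncertainty.
    ranked = sorted(sentences, key=parser_log_prob)   # most uncertain first
    return ranked[:budget]

picked = select_for_translation(
    ["book a flight", "remind me at noon", "play jazz"],
    parser_log_prob=lambda s: -len(s),   # toy stand-in scorer
    budget=2,
)
```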
When computational resources are limited, a continual learning approach is
introduced to minimize training time and memory consumption. This preserves
the parser's performance on previously learned tasks while adapting it to new
ones, mitigating catastrophic forgetting.
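One standard recipe for the forgetting problem described above is experience replay; the sketch below (an illustration, not necessarily the thesis's exact method) mixes a few stored examples from earlier tasks into each new task's batches:

```python
# Sketch: continual learning with a small replay memory of past examples.
import random

def train_task(model, batches, memory, step_fn, replay_k=8, mem_cap=1000):
    """Train on one task while replaying examples from earlier tasks."""
    for batch in batches:
        replay = random.sample(memory, min(replay_k, len(memory)))
        step_fn(model, batch + replay)        # one update on the mixed batch
        for ex in batch:                      # retain a few examples for later
            if len(memory) < mem_cap:
                memory.append(ex)
```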
Overall, the thesis provides a comprehensive set of methods to improve
semantic parsing in resource-constrained conditions.
Comment: PhD thesis, year of award 2023, 172 pages
Privacy-Preserving Domain Adaptation of Semantic Parsers
Task-oriented dialogue systems often assist users with personal or
confidential matters. For this reason, the developers of such a system are
generally prohibited from observing actual usage. So how can they know where
the system is failing and needs more training data or new functionality? In
this work, we study ways in which realistic user utterances can be generated
synthetically, to help increase the linguistic and functional coverage of the
system, without compromising the privacy of actual users. To this end, we
propose a two-stage Differentially Private (DP) generation method which first
generates latent semantic parses, and then generates utterances based on the
parses. Our proposed approach improves MAUVE by 2.5× and parse-tree
function-type overlap by 1.3× relative to current approaches for private
synthetic data generation, improving on both fluency and semantic coverage. We
further validate our approach on a realistic domain adaptation task of adding
new functionality from private user data to a semantic parser, and show overall
gains of 8.5 percentage points in accuracy with the new feature.
Comment: ACL 202
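In outline, the two-stage pipeline reads like the sketch below (hypothetical interfaces; both generators are assumed to have been trained on the private data under differential privacy, e.g. with DP-SGD):

```python
# Sketch: two-stage private synthesis of (parse, utterance) pairs.
def generate_synthetic_pairs(dp_parse_model, dp_utterance_model, n_samples: int):
    """Stage 1 samples a latent semantic parse; stage 2 decodes an utterance
    conditioned on it, yielding training pairs without exposing real users."""
    pairs = []
    for _ in range(n_samples):
        parse = dp_parse_model.sample()                  # latent semantic parse
        utterance = dp_utterance_model.generate(parse)   # parse-conditioned text
        pairs.append({"parse": parse, "utterance": utterance})
    return pairs
```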
Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation
This paper explores deep latent variable models for semi-supervised paraphrase
generation, where the missing target pair for unlabelled data is modelled as a
latent paraphrase sequence. We present a novel unsupervised model named
variational sequence auto-encoding reconstruction (VSAR), which performs latent
sequence inference given an observed text. To leverage information from text
pairs, we additionally introduce a novel supervised model we call dual
directional learning (DDL), which is designed to integrate with our proposed
VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct
semi-supervised learning. However, the combined model suffers from a cold-start
problem. To combat this issue, we propose an improved weight initialisation
solution, leading to a novel two-stage training scheme we call
knowledge-reinforced learning (KRL). Our empirical evaluations suggest that the
combined model yields competitive performance against state-of-the-art
supervised baselines on complete data. Furthermore, in scenarios where only a
fraction of the labelled pairs are available, our combined model consistently
outperforms the strong supervised baseline (DDL) by a significant margin
(p < .05; Wilcoxon test). Our code is publicly available at
https://github.com/jialin-yu/latent-sequence-paraphrase
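Schematically, the semi-supervised objective pairs a supervised term on labelled data (the DDL part, shown here in one direction) with a reconstruction term on unlabelled text routed through a sampled latent paraphrase (the VSAR part); `model.nll` and `model.sample_paraphrase` are hypothetical interfaces, not the authors' API.

```python
# Sketch: combined loss over labelled pairs and unlabelled text.
def semi_supervised_loss(model, labelled_pairs, unlabelled_texts):
    # Supervised: negative log-likelihood of each target given its source.
    sup = sum(model.nll(src, tgt) for src, tgt in labelled_pairs)
    # Unsupervised: reconstruct the text from a sampled latent paraphrase.
    unsup = 0.0
    for text in unlabelled_texts:
        z = model.sample_paraphrase(text)   # latent sequence inference
        unsup += model.nll(z, text)
    return sup + unsup
```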