Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations
Syntactically controlled paraphrase generation has emerged as an active
research direction in recent years. Most existing approaches require annotated
paraphrase pairs for training and are thus costly to extend to new domains.
Unsupervised approaches, on the other hand, do not need paraphrase pairs but
suffer from relatively poor performance in terms of syntactic control and
quality of generated paraphrases. In this paper, we demonstrate that leveraging
Abstract Meaning Representations (AMR) can greatly improve the performance of
unsupervised syntactically controlled paraphrase generation. Our proposed
model, AMR-enhanced Paraphrase Generator (AMRPG), separately encodes the AMR
graph and the constituency parse of the input sentence into two disentangled
semantic and syntactic embeddings. A decoder is then learned to reconstruct the
input sentence from the semantic and syntactic embeddings. Our experiments show
that AMRPG generates more accurate syntactically controlled paraphrases, both
quantitatively and qualitatively, compared to the existing unsupervised
approaches. We also demonstrate that the paraphrases generated by AMRPG can be
used for data augmentation to improve the robustness of NLP models.
Comment: Paper accepted by EMNLP 2022 Findings. The first two authors
contribute equally.
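The separate encoding of meaning and form described above can be sketched as a toy dual encoder. This is an illustration only, not the authors' code: the hash-based `embed` function stands in for learned neural encoders, and all names (`embed`, `paraphrase_code`) are hypothetical.

```python
# Toy sketch of a two-encoder, disentangled-embedding design: the AMR graph
# and the constituency parse are encoded separately, and a decoder would
# condition on the concatenated pair. Hash-based "encoders" replace the
# learned networks of the real model.
import hashlib

def embed(units, dim=8):
    """Toy encoder: hash each unit (an AMR triple or a parse bracket) to
    bytes and average the first `dim` byte values."""
    out = [0.0] * dim
    for u in units:
        digest = hashlib.sha256(u.encode()).digest()
        for i in range(dim):
            out[i] += digest[i]
    n = max(1, len(units))
    return [v / n for v in out]

def paraphrase_code(amr_triples, parse_brackets):
    """Disentangled code: semantic half from the AMR graph, syntactic half
    from the target constituency parse."""
    return embed(amr_triples) + embed(parse_brackets)

# Same meaning with two different target parses: the semantic halves of the
# two codes are identical, only the syntactic halves differ.
amr = ["(sleep-01 :ARG0 cat)"]
z1 = paraphrase_code(amr, ["(S (NP) (VP))"])
z2 = paraphrase_code(amr, ["(S (VP) (NP))"])
```

Because the two halves are computed independently, swapping in a new parse changes only the syntactic half, which is the property that makes syntactic control possible at decoding time.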
Hierarchical Sketch Induction for Paraphrase Generation
We propose a generative model of paraphrase generation, that encourages
syntactic diversity by conditioning on an explicit syntactic sketch. We
introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE),
a method for learning decompositions of dense encodings as a sequence of
discrete latent variables that make iterative refinements of increasing
granularity. This hierarchy of codes is learned through end-to-end training,
and represents fine-to-coarse grained information about the input. We use
HRQ-VAE to encode the syntactic form of an input sentence as a path through the
hierarchy, allowing us to more easily predict syntactic sketches at test time.
Extensive experiments, including a human evaluation, confirm that HRQ-VAE
learns a hierarchical representation of the input space, and generates
paraphrases of higher quality than previous systems.
Comment: Accepted at ACL 202
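The iterative coarse-to-fine refinement described above can be illustrated with a generic residual vector-quantization sketch. This is not the authors' HRQ-VAE: the codebooks here are random and untrained, whereas HRQ-VAE learns them end-to-end; the function names are hypothetical.

```python
# Generic residual vector quantization: each level quantizes the residual
# left by the previous one, so the discrete code path refines the dense
# encoding step by step. HRQ-VAE builds its syntactic sketches from a
# learned version of such a hierarchy.
import random

random.seed(0)
DIM, CODES, LEVELS = 4, 8, 3
codebooks = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(CODES)]
             for _ in range(LEVELS)]

def nearest(codebook, v):
    """Index of the codebook entry closest to v (squared L2 distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], v)))

def hrq_encode(z):
    """Quantize z into a path of discrete codes, one per level, returning
    the path and the final unexplained residual."""
    path, residual = [], list(z)
    for codebook in codebooks:
        i = nearest(codebook, residual)
        path.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return path, residual

def hrq_decode(path):
    """Reconstruct by summing the chosen code vector at each level."""
    out = [0.0] * DIM
    for codebook, i in zip(codebooks, path):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

path, residual = hrq_encode([0.5, -1.0, 2.0, 0.0])
```

Each successive level accounts for a finer slice of the encoding, so truncating the path early yields a coarser but still meaningful sketch.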
Evaluating Large Language Models on Controlled Generation Tasks
While recent studies have examined the abilities of large language models
on various benchmark tasks, including question generation, reading
comprehension, and multilingual understanding, few studies have looked into
the controllability of large language models on generation tasks. We present an
extensive analysis of various benchmarks, including a sentence planning
benchmark with different granularities. After comparing large language models
against state-of-the-art finetuned smaller models, we present a spectrum
showing where large language models fall behind, match, or exceed the
abilities of smaller models. We conclude that **large language models struggle
to meet fine-grained hard constraints**.
Comment: EMNLP 202
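A concrete example of the kind of fine-grained hard constraint such benchmarks verify is a lexical-plus-length requirement that can be checked programmatically. The checker below is a hypothetical illustration, not taken from the paper.

```python
# Hypothetical hard-constraint checker: the generated text must contain
# every required keyword and stay within a word budget. Benchmarks of this
# kind score model outputs by exact satisfaction, not by similarity.
def satisfies(text, required=(), max_words=None):
    words = text.lower().split()
    if max_words is not None and len(words) > max_words:
        return False
    return all(w.lower() in words for w in required)

print(satisfies("The quick brown fox", required=["fox"], max_words=5))  # True
print(satisfies("The quick brown fox", required=["dog"]))               # False
```

Because satisfaction is binary, a model that paraphrases fluently but drops one required keyword scores zero on that instance, which is exactly where fine-grained control becomes hard.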
Is it Possible to Modify Text to a Target Readability Level? An Initial Investigation Using Zero-Shot Large Language Models
Text simplification is a common task where the text is adapted to make it
easier to understand. Similarly, text elaboration can make a passage more
sophisticated, offering a method to control the complexity of reading
comprehension tests. However, text simplification and elaboration can only
alter the readability of a text relative to its source. It is useful to
directly modify the readability of any text to an absolute target readability
level to cater to a diverse audience. Ideally, the readability of
readability-controlled generated text should be independent of the source text.
Therefore, we propose a novel readability-controlled text modification task.
The task requires the generation of 8 versions at various target readability
levels for each input text. We introduce novel readability-controlled text
modification metrics. The baselines for this task use ChatGPT and Llama-2, with
an extension approach introducing a two-step process (generating paraphrases by
passing through the language model twice). The zero-shot approaches are able to
push the readability of the paraphrases in the desired direction but the final
readability remains correlated with the original text's readability. We also
find greater drops in semantic and lexical similarity between the source and
target texts with greater shifts in readability.
Comment: 11 pages, 4 figures, 5 tables