ParaMetric: An Automatic Evaluation Metric for Paraphrasing
We present ParaMetric, an automatic evaluation metric for data-driven approaches to paraphrasing. ParaMetric provides an objective measure of quality using a collection of multiple translations whose paraphrases have been manually annotated. ParaMetric calculates precision and recall scores by comparing the paraphrases discovered by automatic paraphrasing techniques against gold standard alignments of words and phrases within equivalent sentences. We report scores for several established paraphrasing techniques.
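At its core, this kind of scoring reduces to precision and recall over sets of paraphrase pairs. A minimal sketch of that computation (the function and data below are illustrative, not ParaMetric's actual alignment-based implementation):

```python
def paraphrase_precision_recall(system_pairs, gold_pairs):
    """Score system-discovered paraphrase pairs against gold pairs.

    Pairs are treated as unordered, so ("big", "large") matches
    ("large", "big").
    """
    system = {frozenset(p) for p in system_pairs}
    gold = {frozenset(p) for p in gold_pairs}
    hits = len(system & gold)
    precision = hits / len(system) if system else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall

# Two gold pairs; the system finds one of them plus one spurious pair.
gold = [("big", "large"), ("quick", "fast")]
found = [("large", "big"), ("quick", "rapid")]
p, r = paraphrase_precision_recall(found, gold)  # p == 0.5, r == 0.5
```

The real metric aligns words and phrases within sentences rather than scoring bare pairs, but the precision/recall bookkeeping is the same.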
Better Summarization Evaluation with Word Embeddings for ROUGE
ROUGE is a widely adopted, automatic evaluation measure for text
summarization. While it has been shown to correlate well with human judgements,
it is biased towards surface lexical similarities. This makes it unsuitable for
the evaluation of abstractive summarization, or summaries with substantial
paraphrasing. We study the effectiveness of word embeddings to overcome this
disadvantage of ROUGE. Specifically, instead of measuring lexical overlaps,
we use word embeddings to compute the semantic similarity of the words used
in summaries. Our experimental results show that our proposal is able
to achieve better correlations with human judgements when measured with the
Spearman and Kendall rank coefficients.
Comment: Pre-print - To appear in proceedings of the Conference on Empirical
Methods in Natural Language Processing (EMNLP
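The idea can be sketched as replacing ROUGE's exact lexical match with a best-match over embedding cosine similarities. The tiny hand-made vectors below are stand-ins for real word2vec/GloVe embeddings, and the function is an illustrative analogue, not the paper's exact formulation:

```python
import numpy as np

# Toy 2-d vectors standing in for pretrained word embeddings.
EMB = {
    "car":  np.array([1.0, 0.1]),
    "auto": np.array([0.9, 0.2]),
    "red":  np.array([0.0, 1.0]),
    "cat":  np.array([-1.0, 0.3]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def soft_recall(reference, candidate):
    """Embedding-based analogue of ROUGE-1 recall: each reference word
    is credited with its best semantic match in the candidate, instead
    of requiring an exact string match."""
    best = [max(cosine(EMB[r], EMB[c]) for c in candidate) for r in reference]
    return sum(best) / len(best)

# "auto" earns near-full credit for "car" despite zero lexical overlap.
close = soft_recall(["red", "car"], ["red", "auto"])
far = soft_recall(["red", "car"], ["red", "cat"])
```

Plain ROUGE-1 recall would score both candidates identically (one matched unigram each); the embedding-based variant separates them.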
Unsupervised Paraphrasing via Deep Reinforcement Learning
Paraphrasing is expressing the meaning of an input sentence in different
wording while maintaining fluency (i.e., grammatical and syntactic
correctness). Most existing work on paraphrasing uses supervised models that are
limited to specific domains (e.g., image captions). Such models can neither be
straightforwardly transferred to other domains nor generalize well, and
creating labeled training data for new domains is expensive and laborious. The
need for paraphrasing across different domains and the scarcity of labeled
training data in many such domains call for exploring unsupervised paraphrase
generation methods. We propose Progressive Unsupervised Paraphrasing (PUP): a
novel unsupervised paraphrase generation method based on deep reinforcement
learning (DRL). PUP uses a variational autoencoder (trained using a
non-parallel corpus) to generate a seed paraphrase that warm-starts the DRL
model. Then, PUP progressively tunes the seed paraphrase guided by our novel
reward function which combines semantic adequacy, language fluency, and
expression diversity measures to quantify the quality of the generated
paraphrases in each iteration without needing parallel sentences. Our extensive
experimental evaluation shows that PUP outperforms state-of-the-art
unsupervised paraphrasing techniques in terms of both automatic metrics and
user studies on four real datasets. We also show that PUP outperforms
domain-adapted supervised algorithms on several datasets. Our evaluation also
shows that PUP achieves a good trade-off between semantic similarity and
diversity of expression.
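As an illustration of how such a multi-term reward might be assembled (the weights and the diversity measure below are hypothetical stand-ins, not PUP's actual formulation):

```python
def expression_diversity(source_tokens, paraphrase_tokens):
    """1 minus Jaccard token overlap: 0.0 for a verbatim copy,
    approaching 1.0 as the wording diverges."""
    s, p = set(source_tokens), set(paraphrase_tokens)
    return 1.0 - len(s & p) / len(s | p)

def combined_reward(adequacy, fluency, diversity, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three quality signals, each assumed in [0, 1]."""
    wa, wf, wd = weights
    return wa * adequacy + wf * fluency + wd * diversity

src = "the cat sat on the mat".split()
para = "a cat was sitting on a mat".split()
d_copy = expression_diversity(src, src)   # 0.0: pure copying earns nothing
d_para = expression_diversity(src, para)  # rewarded for rewording
```

The diversity term is what keeps a reinforcement-learned paraphraser from collapsing onto the degenerate solution of copying its input.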
Visual Information Guided Zero-Shot Paraphrase Generation
Zero-shot paraphrase generation has drawn much attention as the large-scale
high-quality paraphrase corpus is limited. Back-translation, also known as the
pivot-based method, is a typical approach to this end. Several works leverage different
information as the "pivot", such as language or semantic representation. In
this paper, we explore using visual information such as images as the "pivot"
of back-translation. Unlike the pipeline back-translation method, we
propose visual information guided zero-shot paraphrase generation (ViPG) based
only on paired image-caption data. It jointly trains an image captioning model
and a paraphrasing model, and leverages the image captioning model to guide the
training of the paraphrasing model. Both automatic evaluation and human
evaluation show that our model can generate paraphrases with good relevance,
fluency, and diversity, and that images are a promising kind of pivot for
zero-shot paraphrase generation.
Comment: Accepted By COLING 202
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
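The survey's reduction of paraphrase to bidirectional entailment can be stated in a few lines. The toy `entails` predicate below, which models a sentence as a set of atomic facts, is purely illustrative; any real entailment recognizer could be plugged in:

```python
def is_paraphrase(entails, a, b):
    """Paraphrase as bidirectional entailment: a and b are paraphrases
    iff each entails the other, for any entailment predicate."""
    return entails(a, b) and entails(b, a)

def toy_entails(a, b):
    """Toy model: a sentence is a set of atomic facts, and a entails b
    when a contains every fact of b."""
    return set(b) <= set(a)

same = is_paraphrase(toy_entails, {"dog", "barks"}, {"barks", "dog"})  # True
one_way = is_paraphrase(toy_entails, {"dog", "barks"}, {"dog"})        # False
```

The second case shows why directionality matters: "the dog barks" entails "there is a dog", but not conversely, so the pair is an entailment yet not a paraphrase.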
Sentence Similarity and Machine Translation
Neural machine translation (NMT) systems encode an input sentence into an intermediate representation and then decode that representation into the output sentence. Translation requires deep understanding of language; as a result, NMT models trained on large amounts of data develop a semantically rich intermediate representation.
We leverage this rich intermediate representation of NMT systems, in particular multilingual NMT systems, which learn to map many languages into and out of a joint space, for bitext curation, paraphrasing, and automatic machine translation (MT) evaluation. At a high level, all of these tasks are rooted in similarity: sentence and document alignment requires measuring similarity of sentences and documents, respectively; paraphrasing requires producing output which is similar to an input; and automatic MT evaluation requires measuring the similarity between MT system outputs and corresponding human reference translations.
We use multilingual NMT for similarity in two ways: First, we use a multilingual NMT model with a fixed-size intermediate representation (Artetxe and Schwenk, 2018) to produce multilingual sentence embeddings, which we use in both sentence and document alignment. Second, we train a multilingual NMT model and show that it generalizes to the task of generative paraphrasing (i.e., "translating" from Russian to Russian), when used in conjunction with a simple generation algorithm to discourage copying from the input to the output. We also use this model for automatic MT evaluation, to force decode and score MT system outputs conditioned on their respective human reference translations. Since we leverage multilingual NMT models, each method works in many languages using a single model.
We show that simple methods, which leverage the intermediate representation of multilingual NMT models trained on large amounts of bitext, outperform prior work in paraphrasing, sentence alignment, document alignment, and automatic MT evaluation. This finding is consistent with recent trends in the natural language processing community, where large language models trained on huge amounts of unlabeled text have achieved state-of-the-art results on tasks such as question answering, named entity recognition, and parsing.
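With fixed-size multilingual sentence embeddings, alignment reduces to nearest-neighbour search under cosine similarity. A greedy sketch with toy vectors (a real system would use LASER-style embeddings and typically margin-based rather than raw cosine scoring):

```python
import numpy as np

def align_sentences(src_embs, tgt_embs, threshold=0.5):
    """Greedily pair each source sentence with its most similar
    target sentence by cosine similarity, dropping weak matches."""
    pairs = []
    for i, s in enumerate(src_embs):
        sims = [float(s @ t / (np.linalg.norm(s) * np.linalg.norm(t)))
                for t in tgt_embs]
        j = int(np.argmax(sims))
        if sims[j] >= threshold:
            pairs.append((i, j))
    return pairs

# Toy embeddings: source 0 matches target 1, source 1 matches target 0.
src = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
tgt = [np.array([0.0, 1.0]), np.array([1.0, 0.1])]
pairs = align_sentences(src, tgt)  # [(0, 1), (1, 0)]
```

Because the embedding space is shared across languages, the same routine serves sentence alignment in any language pair the model covers.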
Paraphrase Generation with Deep Reinforcement Learning
Automatic generation of paraphrases from a given sentence is an important yet
challenging task in natural language processing (NLP), and plays a key role in
a number of applications such as question answering, search, and dialogue. In
this paper, we present a deep reinforcement learning approach to paraphrase
generation. Specifically, we propose a new framework for the task, which
consists of a \textit{generator} and an \textit{evaluator}, both of which are
learned from data. The generator, built as a sequence-to-sequence learning
model, can produce paraphrases given a sentence. The evaluator, constructed as
a deep matching model, can judge whether two sentences are paraphrases of each
other. The generator is first trained by deep learning and then further
fine-tuned by reinforcement learning in which the reward is given by the
evaluator. For the learning of the evaluator, we propose two methods based on
supervised learning and inverse reinforcement learning respectively, depending
on the type of available training data. Empirical study shows that the learned
evaluator can guide the generator to produce more accurate paraphrases.
Experimental results demonstrate the proposed models (the generators)
outperform the state-of-the-art methods in paraphrase generation in both
automatic evaluation and human evaluation.
Comment: EMNLP 201
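The fine-tuning step can be caricatured as an advantage-weighted update on the generator's output distribution, with the evaluator's score as the reward. The toy categorical "generator" below is a sketch of the idea only, not the paper's seq2seq training procedure:

```python
import math

def rl_update(probs, rewards, lr=0.1):
    """One advantage-weighted update on a categorical distribution over
    candidate paraphrases: candidates the evaluator scores above the
    mean reward gain probability mass, the rest lose it."""
    baseline = sum(rewards) / len(rewards)          # mean-reward baseline
    logits = [math.log(p) + lr * (r - baseline)
              for p, r in zip(probs, rewards)]
    z = sum(math.exp(l) for l in logits)            # renormalize (softmax)
    return [math.exp(l) / z for l in logits]

# The evaluator prefers candidate 0; its probability rises after the update.
new = rl_update([0.5, 0.5], rewards=[1.0, 0.0])
```

Iterating this loop, with a learned evaluator supplying the rewards, is the essence of evaluator-guided fine-tuning; the full method applies the same principle token-by-token to a sequence-to-sequence generator.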