7,272 research outputs found
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
Question Paraphrase Generation for Question Answering System
The queries to a practical Question Answering (QA) system range from keywords, phrases, badly written questions, and occasionally grammatically perfect questions. Among different kinds of question analysis approaches, the pattern matching works well in analyzing such queries. It is costly to build this pattern matching module because tremendous manual labor is needed to expand its coverage to so many variations in natural language questions. This thesis proposes that the costly manual labor should be saved by the technique of paraphrase generation which can automatically generate semantically similar paraphrases of a natural language question. Previous approaches of paraphrase generation either require large scale of corpus and the dependency parser, or only deal with the relation-entity type of simple question queries. By introducing a method of inferring transformation operations between paraphrases, and a description of sentence structure, this thesis develops a paraphrase generation method and its implementation in Chinese with very limited amount of corpus. The evaluation results of this implementation show its ability to aid humans to efficiently create a pattern matching module for QA systems as it greatly outperforms the human editors in the coverage of natural language questions, with an acceptable precision in generated paraphrases
Learning Semantic Correspondences in Technical Documentation
We consider the problem of translating high-level textual descriptions to
formal representations in technical documentation as part of an effort to model
the meaning of such documentation. We focus specifically on the problem of
learning translational correspondences between text descriptions and grounded
representations in the target documentation, such as formal representation of
functions or code templates. Our approach exploits the parallel nature of such
documentation, or the tight coupling between high-level text and the low-level
representations we aim to learn. Data is collected by mining technical
documents for such parallel text-representation pairs, which we use to train a
simple semantic parsing model. We report new baseline results on sixteen novel
datasets, including the standard library documentation for nine popular
programming languages across seven natural languages, and a small collection of
Unix utility manuals.Comment: accepted to ACL-201
- …