Semantic analysis for paraphrase identification using semantic role labeling
Document reuse has become prominent as information content has been digitized, owing to the widespread use of the internet and smartphones. It takes various complex forms, such as inserting, omitting, or substituting words and changing word order. In particular, when a word in a document is replaced with a similar word, existing morphological similarity measures fail to count it as a match. To address this problem, various studies have investigated similarity measurement that takes semantic information into account. This study proposes a semantic similarity measure based on the semantic role information of sentences, obtained through semantic role labeling. To assess its performance, the proposed method was compared with substring similarity, a method used for measuring similarity between existing documents. The results show that the proposed method performed similarly to the conventional method on plagiarized documents that were rarely modified, whereas it produced improved results on paraphrased sentences whose structure was changed.
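The contrast drawn in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the role frames are hand-written rather than produced by a semantic role labeler, the `SYNONYMS` table is a hypothetical stand-in for a lexical resource, and `difflib.SequenceMatcher` is used here as a generic substring-style measure.

```python
from difflib import SequenceMatcher

# Hypothetical synonym groups; a real system would consult a lexical resource.
SYNONYMS = [{"bought", "purchased"}, {"car", "automobile"}]


def canon(word):
    """Map a word to one representative of its synonym group (lowercased)."""
    w = word.lower()
    for group in SYNONYMS:
        if w in group:
            return min(group)
    return w


def substring_similarity(a, b):
    """Character-level similarity based on matching substrings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def role_similarity(frame_a, frame_b):
    """Fraction of role labels whose fillers agree after canonicalization."""
    roles = set(frame_a) | set(frame_b)
    if not roles:
        return 0.0
    matches = sum(
        1
        for r in roles
        if r in frame_a and r in frame_b
        and canon(frame_a[r]) == canon(frame_b[r])
    )
    return matches / len(roles)


# Two sentences with different wording and structure but the same meaning.
s1 = "John bought a car yesterday"
s2 = "Yesterday an automobile was purchased by John"

# Hand-labeled frames, assumed to be what a semantic role labeler would return.
f1 = {"V": "bought", "ARG0": "John", "ARG1": "car", "ARGM-TMP": "yesterday"}
f2 = {"V": "purchased", "ARG0": "John", "ARG1": "automobile", "ARGM-TMP": "yesterday"}

print(substring_similarity(s1, s2))  # below 1.0: the surface strings differ
print(role_similarity(f1, f2))       # every role filler matches
```

In this toy setting the role-based score is unaffected by the passive construction and the word substitutions, while the surface measure is penalized by both.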
A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
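The view of paraphrasing as bidirectional textual entailment can be sketched as follows. This is only a toy under strong assumptions: each expression is represented as a set of hypothetical fact tuples, and `entails` is approximated by set inclusion, which is far cruder than the recognition methods the survey covers.

```python
def entails(premise_facts, hypothesis_facts):
    """Toy entailment: every fact in the hypothesis is stated by the premise."""
    return hypothesis_facts <= premise_facts


def is_paraphrase(a_facts, b_facts):
    """Paraphrase as entailment holding in both directions."""
    return entails(a_facts, b_facts) and entails(b_facts, a_facts)


# Hypothetical fact sets for three expressions.
a = frozenset({("buy", "john", "car")})
b = frozenset({("buy", "john", "car")})
c = frozenset({("buy", "john", "car"), ("time", "buy", "yesterday")})

print(is_paraphrase(a, b))  # entailment holds both ways
print(entails(c, a))        # c says everything a says
print(is_paraphrase(c, a))  # a does not entail c, so not a paraphrase
```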
Crowdsourcing Question-Answer Meaning Representations
We introduce Question-Answer Meaning Representations (QAMRs), which represent
the predicate-argument structure of a sentence as a set of question-answer
pairs. We also develop a crowdsourcing scheme to show that QAMRs can be labeled
with very little training, and gather a dataset with over 5,000 sentences and
100,000 questions. A detailed qualitative analysis demonstrates that the
crowd-generated question-answer pairs cover the vast majority of
predicate-argument relationships in existing datasets (including PropBank,
NomBank, QA-SRL, and AMR) along with many previously under-resourced ones,
including implicit arguments and relations. The QAMR data and annotation code
are made publicly available to enable future work on how best to model these
complex phenomena.
Comment: 8 pages, 6 figures, 2 tables
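The idea of representing a sentence as a set of question-answer pairs can be sketched as a small data structure. The class names `QAPair` and `QAMR`, the example sentence, and its questions are all hypothetical and do not reflect the released dataset's format; the check that each answer is a span of the sentence is likewise an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QAPair:
    question: str
    answer: str  # assumed here to be a span copied from the sentence


@dataclass
class QAMR:
    sentence: str
    pairs: list = field(default_factory=list)

    def answers_in_sentence(self):
        """Check that every answer occurs verbatim in the sentence."""
        return all(p.answer in self.sentence for p in self.pairs)


# Hypothetical example: each pair captures one predicate-argument relation.
example = QAMR(
    sentence="The committee approved the new budget on Friday",
    pairs=[
        QAPair("Who approved something?", "The committee"),
        QAPair("What was approved?", "the new budget"),
        QAPair("When was the budget approved?", "Friday"),
    ],
)

print(example.answers_in_sentence())
print(len(example.pairs))
```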