3 research outputs found
Determining Semantic Textual Similarity using Natural Deduction Proofs
Determining semantic textual similarity is a core research subject in natural
language processing. Since vector-based models for sentence representation
often use shallow information, capturing accurate semantics is difficult. By
contrast, logical semantic representations capture deeper levels of sentence
semantics, but their symbolic nature does not offer graded notions of textual
similarity. We propose a method for determining semantic textual similarity by
combining shallow features with features extracted from natural deduction
proofs of bidirectional entailment relations between sentence pairs. For the
natural deduction proofs, we use ccg2lambda, a higher-order automatic inference
system, which converts Combinatory Categorial Grammar (CCG) derivation trees
into semantic representations and conducts natural deduction proofs.
Experiments show that our system was able to outperform other logic-based
systems and that features derived from the proofs are effective for learning
textual similarity.
Comment: 11 pages, 5 figures, accepted as a long paper at EMNLP201
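The idea of combining shallow surface features with features extracted from bidirectional entailment proofs can be sketched as a simple feature-based scorer. The feature set, weights, and proof-feature encoding below are illustrative assumptions for the sketch, not the paper's actual model, which derives its features from ccg2lambda natural deduction proofs and learns from data:

```python
# Illustrative sketch: scoring sentence similarity from a mix of shallow
# features and features of bidirectional entailment proofs. All feature
# definitions and weights here are hypothetical.

def shallow_features(s1, s2):
    """Surface-level overlap features (illustrative)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    overlap = len(w1 & w2) / max(len(w1 | w2), 1)      # Jaccard word overlap
    len_ratio = min(len(w1), len(w2)) / max(len(w1), len(w2), 1)
    return [overlap, len_ratio]

def proof_features(forward_proved, backward_proved, proof_steps):
    """Features from the two entailment proofs (illustrative): whether each
    direction was proved, and a decreasing function of proof length."""
    return [float(forward_proved), float(backward_proved),
            1.0 / (1 + proof_steps)]

def similarity_score(s1, s2, forward_proved, backward_proved, proof_steps,
                     weights=(0.3, 0.1, 0.25, 0.25, 0.1)):
    """Linear combination of shallow and proof-derived features."""
    feats = (shallow_features(s1, s2)
             + proof_features(forward_proved, backward_proved, proof_steps))
    return sum(w * f for w, f in zip(weights, feats))
```

In the paper's setting the weights would be learned by a regressor on STS training data rather than fixed by hand; a pair proved in both directions scores higher than one with no proof.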
Acquisition of Phrase Correspondences using Natural Deduction Proofs
How to identify, extract, and use phrasal knowledge is a crucial problem for
the task of Recognizing Textual Entailment (RTE). To solve this problem, we
propose a method for detecting paraphrases via natural deduction proofs of
semantic relations between sentence pairs. Our solution relies on a graph
reformulation of partial variable unifications and an algorithm that induces
subgraph alignments between meaning representations. Experiments show that our
method can automatically detect various paraphrases that are absent from
existing paraphrase databases. In addition, the detection of paraphrases using
proof information improves the accuracy of RTE tasks.
Comment: 11 pages, 4 figures, accepted as a long paper at NAACL HLT 201
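The alignment step described above can be sketched as pairing predicates from two meaning representations that attach to variables unified during the proof. The graph encoding (predicate, variable) and the alignment rule are simplified assumptions for illustration; the paper's method operates on a graph reformulation of partial variable unifications:

```python
# Illustrative sketch: extracting candidate paraphrase pairs from two
# meaning representations, given the variables unified by a natural
# deduction proof. The encoding is hypothetical and much simplified.

def align_subgraphs(graph1, graph2, unified_vars):
    """Each graph is a list of (predicate, variable) edges. Predicates from
    the two graphs that attach to the same unified variable but differ in
    name are emitted as candidate paraphrase pairs."""
    pairs = []
    for var in unified_vars:
        preds1 = [p for p, v in graph1 if v == var]
        preds2 = [p for p, v in graph2 if v == var]
        for p1 in preds1:
            for p2 in preds2:
                if p1 != p2:
                    pairs.append((p1, p2))  # candidate paraphrase pair
    return pairs
```

For example, if "purchase" and "buy" both attach to a unified variable while "book" appears identically on both sides, only ("purchase", "buy") is proposed as a paraphrase candidate.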
Semantic Textual Similarity on Brazilian Portuguese: An approach based on language-mixture models
The literature describes Semantic Textual Similarity (STS) as a fundamental part of many Natural Language Processing (NLP) tasks. STS approaches depend on the availability of lexical-semantic resources. There have been several efforts to improve lexical-semantic resources for English, and the state of the art reports a large number of applications for that language. Brazilian Portuguese linguistic resources, compared with English ones, do not offer the same availability of relations and content, generating a loss of precision in STS tasks. Therefore, the current work presents an approach that combines Brazilian Portuguese and English lexical-semantic ontology resources, exploiting the full potential of the linguistic relations of both languages, to generate a language-mixture model for measuring STS. We evaluated the proposed approach on a well-known and respected Brazilian Portuguese STS dataset, which brought to light some considerations about mixture models and their relation to ontology language semantics.
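The language-mixture idea can be sketched as interpolating similarity estimates obtained from a Portuguese resource and from an English resource, with a fallback when one resource lacks coverage. The mixture weight and the fallback rule are illustrative assumptions, not details given in the abstract:

```python
# Illustrative sketch: a language-mixture STS score combining per-language
# similarity estimates. alpha and the coverage fallback are hypothetical.

def mixture_similarity(sim_pt, sim_en, alpha=0.5):
    """Weighted mixture of a Brazilian Portuguese estimate (sim_pt) and an
    English estimate (sim_en). When one resource has no coverage for the
    pair (None), fall back to the other."""
    if sim_pt is None:
        return sim_en
    if sim_en is None:
        return sim_pt
    return alpha * sim_pt + (1 - alpha) * sim_en
```

The point of the mixture is that a pair missing from the Portuguese resource can still receive a score via the English one, and vice versa.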