3 research outputs found
Determining Semantic Textual Similarity using Natural Deduction Proofs
Determining semantic textual similarity is a core research subject in natural
language processing. Since vector-based models for sentence representation
often use shallow information, capturing accurate semantics is difficult. By
contrast, logical semantic representations capture deeper levels of sentence
semantics, but their symbolic nature does not offer graded notions of textual
similarity. We propose a method for determining semantic textual similarity by
combining shallow features with features extracted from natural deduction
proofs of bidirectional entailment relations between sentence pairs. For the
natural deduction proofs, we use ccg2lambda, a higher-order automatic inference
system, which converts Combinatory Categorial Grammar (CCG) derivation trees
into semantic representations and conducts natural deduction proofs.
Experiments show that our system was able to outperform other logic-based
systems and that features derived from the proofs are effective for learning
textual similarity.
Comment: 11 pages, 5 figures, accepted as a long paper at EMNLP201
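The idea of combining shallow surface features with features extracted from bidirectional entailment proofs can be sketched as a simple feature-based scorer. The feature set, weights, and proof-feature encoding below are illustrative assumptions for the sketch, not the paper's actual model, which derives its features from ccg2lambda natural deduction proofs and learns from data:

```python
# Illustrative sketch: scoring sentence similarity from a mix of shallow
# features and features of bidirectional entailment proofs. All feature
# definitions and weights here are hypothetical.

def shallow_features(s1, s2):
    """Surface-level overlap features (illustrative)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    overlap = len(w1 & w2) / max(len(w1 | w2), 1)      # Jaccard word overlap
    len_ratio = min(len(w1), len(w2)) / max(len(w1), len(w2), 1)
    return [overlap, len_ratio]

def proof_features(forward_proved, backward_proved, proof_steps):
    """Features from the two entailment proofs (illustrative): whether each
    direction was proved, and a decreasing function of proof length."""
    return [float(forward_proved), float(backward_proved),
            1.0 / (1 + proof_steps)]

def similarity_score(s1, s2, forward_proved, backward_proved, proof_steps,
                     weights=(0.3, 0.1, 0.25, 0.25, 0.1)):
    """Linear combination of shallow and proof-derived features."""
    feats = (shallow_features(s1, s2)
             + proof_features(forward_proved, backward_proved, proof_steps))
    return sum(w * f for w, f in zip(weights, feats))
```

In the paper's setting the weights would be learned by a regressor on STS training data rather than fixed by hand; a pair proved in both directions scores higher than one with no proof.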
Acquisition of Phrase Correspondences using Natural Deduction Proofs
How to identify, extract, and use phrasal knowledge is a crucial problem for
the task of Recognizing Textual Entailment (RTE). To solve this problem, we
propose a method for detecting paraphrases via natural deduction proofs of
semantic relations between sentence pairs. Our solution relies on a graph
reformulation of partial variable unifications and an algorithm that induces
subgraph alignments between meaning representations. Experiments show that our
method can automatically detect various paraphrases that are absent from
existing paraphrase databases. In addition, the detection of paraphrases using
proof information improves the accuracy of RTE tasks.
Comment: 11 pages, 4 figures, accepted as a long paper at NAACL HLT 201
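The alignment step described above can be sketched as pairing predicates from two meaning representations that attach to variables unified during the proof. The graph encoding (predicate, variable) and the alignment rule are simplified assumptions for illustration; the paper's method operates on a graph reformulation of partial variable unifications:

```python
# Illustrative sketch: extracting candidate paraphrase pairs from two
# meaning representations, given the variables unified by a natural
# deduction proof. The encoding is hypothetical and much simplified.

def align_subgraphs(graph1, graph2, unified_vars):
    """Each graph is a list of (predicate, variable) edges. Predicates from
    the two graphs that attach to the same unified variable but differ in
    name are emitted as candidate paraphrase pairs."""
    pairs = []
    for var in unified_vars:
        preds1 = [p for p, v in graph1 if v == var]
        preds2 = [p for p, v in graph2 if v == var]
        for p1 in preds1:
            for p2 in preds2:
                if p1 != p2:
                    pairs.append((p1, p2))  # candidate paraphrase pair
    return pairs
```

For example, if "purchase" and "buy" both attach to a unified variable while "book" appears identically on both sides, only ("purchase", "buy") is proposed as a paraphrase candidate.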
Semantic Textual Similarity on Brazilian Portuguese: An approach based on language-mixture models
The literature describes Semantic Textual Similarity (STS) as a fundamental part of many Natural Language Processing (NLP) tasks. STS approaches depend on the availability of lexical-semantic resources. There have been several efforts to improve lexical-semantic resources for English, and the state of the art reports a large number of applications for that language. Brazilian Portuguese linguistic resources, compared with English ones, do not offer the same availability of relations and content, generating a loss of precision in STS tasks. Therefore, the current work presents an approach that combines Brazilian Portuguese and English lexical-semantic ontology resources, exploiting the full potential of the linguistic relations of both languages, to generate a language-mixture model for measuring STS. We evaluated the proposed approach on a well-known and respected Brazilian Portuguese STS dataset, which brought to light some considerations about mixture models and their relation to ontology language semantics.
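The language-mixture idea can be sketched as interpolating similarity estimates obtained from a Portuguese resource and from an English resource, with a fallback when one resource lacks coverage. The mixture weight and the fallback rule are illustrative assumptions, not details given in the abstract:

```python
# Illustrative sketch: a language-mixture STS score combining per-language
# similarity estimates. alpha and the coverage fallback are hypothetical.

def mixture_similarity(sim_pt, sim_en, alpha=0.5):
    """Weighted mixture of a Brazilian Portuguese estimate (sim_pt) and an
    English estimate (sim_en). When one resource has no coverage for the
    pair (None), fall back to the other."""
    if sim_pt is None:
        return sim_en
    if sim_en is None:
        return sim_pt
    return alpha * sim_pt + (1 - alpha) * sim_en
```

The point of the mixture is that a pair missing from the Portuguese resource can still receive a score via the English one, and vice versa.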