Search CORE

687 research outputs found

Techniques for recognizing textual entailment and semantic equivalence

Author: Herrera de la Cruz Jesús
Peñas Anselmo
Verdejo Felisa
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006
Field of study

After defining what is understood by textual entailment and semantic equivalence, the present state and the desirable future of the systems aimed at recognizing them is shown. A compilation of the currently implemented techniques in the main Recognizing Textual Entailment and Semantic Equivalence systems is given

Docta Complutense

A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge

Author: Coote Ravi
Wotzlaw Andreas
Publication venue
Publication date: 18/10/2013
Field of study

We present the architecture and the evaluation of a new system for recognizing textual entailment (RTE). In RTE we want to identify automatically the type of a logical relation between two input texts. In particular, we are interested in proving the existence of an entailment between them. We conceive our system as a modular environment allowing for a high-coverage syntactic and semantic text analysis combined with logical inference. For the syntactic and semantic analysis we combine a deep semantic analysis with a shallow one supported by statistical models in order to increase the quality and the accuracy of results. For RTE we use logical inference of first-order employing model-theoretic techniques and automated reasoning tools. The inference is supported with problem-relevant background knowledge extracted automatically and on demand from external sources like, e.g., WordNet, YAGO, and OpenCyc, or other, more experimental sources with, e.g., manually defined presupposition resolutions, or with axiomatized general and common sense knowledge. The results show that fine-grained and consistent knowledge coming from diverse sources is a necessary condition determining the correctness and traceability of results.Comment: 25 pages, 10 figure

arXiv.org e-Print Archive

CiteSeerX

DLSITE-1: lexical analysis for solving textual entailment recognition

Author: Ferrández Escámez Óscar
Micol Ponce Daniel
Muñoz Rafael
Palomar Manuel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2007
Field of study

This paper discusses the recognition of textual entailment in a text-hypothesis pair by applying a wide variety of lexical measures. We consider that the entailment phenomenon can be tackled from three general levels: lexical, syntactic and semantic. The main goals of this research are to deal with this phenomenon from a lexical point of view, and achieve high results considering only such kind of knowledge. To accomplish this, the information provided by the lexical measures is used as a set of features for a Support Vector Machine which will decide if the entailment relation is produced. A study of the most relevant features and a comparison with the best state-of-the-art textual entailment systems is exposed throughout the paper. Finally, the system has been evaluated using the Second PASCAL Recognising Textual Entailment Challenge data and evaluation methodology, obtaining an accuracy rate of 61.88%.QALL-ME consortium, 6º Programa Marco, Unión Europea, referencia del proyecto FP6-IST-033860. Gobierno de España, proyecto CICyT número TIN2006-1526-C06-01

Repositorio Institucional de la Universidad de Alicante

A Continuously Growing Dataset of Sentential Paraphrases

Author: He Hua
Lan Wuwei
Qiu Siyu
Xu Wei
Publication venue
Publication date: 01/01/2017
Field of study

A major challenge in paraphrase research is the lack of parallel corpora. In this paper, we present a new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs. The main advantage of our method is its simplicity, as it gets rid of the classifier or human in the loop needed to select data before annotation and subsequent application of paraphrase identification algorithms in the previous work. We present the largest human-labeled paraphrase corpus to date of 51,524 sentence pairs and the first cross-domain benchmarking for automatic paraphrase identification. In addition, we show that more than 30,000 new sentential paraphrases can be easily and continuously captured every month at ~70% precision, and demonstrate their utility for downstream NLP tasks through phrasal paraphrase extraction. We make our code and data freely available.Comment: 11 pages, accepted to EMNLP 201

arXiv.org e-Print Archive

Crossref

A resource-light method for cross-lingual semantic textual similarity

Author: Agirre
Agirre
Agirre
Artetxe
Aziz
Bowman
Brychcín
Bär
Castillo
Dagan
De Schryver
Dinu
Efron
Franco-Salvador
Franco-Salvador
Franco-Salvador
Goran Glavaš
Han
Hill
Hänig
Islam
Jimenez
Kashyap
Kuhn
Ljubešić
Lopez-Gazpio
Madnani
Marc Franco-Salvador
Mehdad
Mikolov
Mikolov
Navigli
Negri
Och
Oliva
Paolo Rosso
Pennington
Potthast
Potthast
Potthast
Resnik
Simone P. Ponzetto
Smith
Socher
Steinberger
Sultan
Turchi
Vulić
Yih
Šarić
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018
Field of study

[EN] Recognizing semantically similar sentences or paragraphs across languages is beneficial for many tasks, ranging from cross-lingual information retrieval and plagiarism detection to machine translation. Recently proposed methods for predicting cross-lingual semantic similarity of short texts, however, make use of tools and resources (e.g., machine translation systems, syntactic parsers or named entity recognition) that for many languages (or language pairs) do not exist. In contrast, we propose an unsupervised and a very resource-light approach for measuring semantic similarity between texts in different languages. To operate in the bilingual (or multilingual) space, we project continuous word vectors (i.e., word embeddings) from one language to the vector space of the other language via the linear translation model. We then align words according to the similarity of their vectors in the bilingual embedding space and investigate different unsupervised measures of semantic similarity exploiting bilingual embeddings and word alignments. Requiring only a limited-size set of word translation pairs between the languages, the proposed approach is applicable to virtually any pair of languages for which there exists a sufficiently large corpus, required to learn monolingual word embeddings. Experimental results on three different datasets for measuring semantic textual similarity show that our simple resource-light approach reaches performance close to that of supervised and resource-intensive methods, displaying stability across different language pairs. Furthermore, we evaluate the proposed method on two extrinsic tasks, namely extraction of parallel sentences from comparable corpora and cross-lingual plagiarism detection, and show that it yields performance comparable to those of complex resource-intensive state-of-the-art models for the respective tasks. (C) 2017 Published by Elsevier B.V.Part of the work presented in this article was performed during second author's research visit to the University of Mannheim, supported by Contact Fellowship awarded by the DAAD scholarship program "STIBET Doktoranden". The research of the last author has been carried out in the framework of the SomEMBED project (TIN2015-71147-C2-1-P). Furthermore, this work was partially funded by the Junior-professor funding programme of the Ministry of Science, Research and the Arts of the state of Baden-Wurttemberg (project "Deep semantic models for high-end NLP application").Glavas, G.; Franco-Salvador, M.; Ponzetto, SP.; Rosso, P. (2018). A resource-light method for cross-lingual semantic textual similarity. Knowledge-Based Systems. 143:1-9. https://doi.org/10.1016/j.knosys.2017.11.041S1914

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server

RiuNet

A perspective-based approach for solving textual entailment recognition

Author: Ferrández Escámez Óscar
Micol Ponce Daniel
Muñoz Rafael
Palomar Manuel
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2007
Field of study

The textual entailment recognition system that we discuss in this paper represents a perspective-based approach composed of two modules that analyze text-hypothesis pairs from a strictly lexical and syntactic perspectives, respectively. We attempt to prove that the textual entailment recognition task can be overcome by performing individual analysis that acknowledges us of the maximum amount of information that each single perspective can provide. We compare this approach with the system we presented in the previous edition of PASCAL Recognising Textual Entailment Challenge, obtaining an accuracy rate 17.98% higher.QALL-ME consortium, 6º Programa Marco, Unión Europea, referencia del proyecto FP6-IST-033860. Gobierno de España, proyecto CICyT número TIN2006-1526-C06-01. Generalitat Valenciana, proyecto ACOM06/90

Repositorio Institucional de la Universidad de Alicante

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Towards an automatic validation of answers in Question Answering

Author: Grappy Arnaud
Grau Brigitte
Ligozat Anne-Laure
Robba Isabelle
Vilnat Anne
Publication venue: HAL CCSD
Publication date: 01/10/2007
Field of study

International audienceQuestion answering (QA) aims at retrieving precise information from a large collection of documents. Different techniques can be used to find relevant information, and to compare these techniques, it is important to evaluate QA systems. The objective of an Answer Validation task is thus to judge the correctness of an answer returned by a QA system for a question, according to the text snippet given to support it. We participated in such a task in 2006. In this article, we present our strategy for deciding if the snippets justify the answers: a strategy based on our own QA system, comparing the answers it returned with the answer to judge. We discuss our results, then we point out the difficulties of this task