A Survey of Paraphrasing and Textual Entailment Methods
Paraphrasing methods recognize, generate, or extract phrases, sentences, or
longer natural language expressions that convey almost the same information.
Textual entailment methods, on the other hand, recognize, generate, or extract
pairs of natural language expressions, such that a human who reads (and trusts)
the first element of a pair would most likely infer that the other element is
also true. Paraphrasing can be seen as bidirectional textual entailment and
methods from the two areas are often similar. Both kinds of methods are useful,
at least in principle, in a wide range of natural language processing
applications, including question answering, summarization, text generation, and
machine translation. We summarize key ideas from the two areas by considering
in turn recognition, generation, and extraction methods, also pointing to
prominent articles and resources.
Comment: Technical Report, Natural Language Processing Group, Department of
Informatics, Athens University of Economics and Business, Greece, 201
A Deep Architecture for Semantic Parsing
Many successful approaches to semantic parsing build on top of the syntactic
analysis of text, and make use of distributional representations or statistical
models to match parses to ontology-specific queries. This paper presents a
novel deep learning architecture which provides a semantic parsing system
through the union of two neural models of language semantics. It allows for the
generation of ontology-specific queries from natural language statements and
questions without the need for parsing, which makes it especially suitable for
grammatically malformed or syntactically atypical text, such as tweets, as well
as permitting the development of semantic parsers for resource-poor languages.
Comment: In Proceedings of the Semantic Parsing Workshop at ACL 2014
(forthcoming)
Automatic Identification of False Friends in Parallel Corpora: Statistical and Semantic Approach
False friends are pairs of words in two languages that are perceived as
similar but have different meanings. We present an improved
algorithm for acquiring false friends from a sentence-level aligned parallel corpus,
based on statistical observations of word occurrences and co-occurrences
in the parallel sentences. The results are compared with an entirely semantic
measure of cross-lingual similarity between words, which uses the Web
as a corpus by analyzing the words’ local contexts extracted from the
text snippets returned by Google searches. The statistical and semantic
measures are then combined into an improved algorithm for identifying
false friends that achieves results almost twice as good as those of previously
known algorithms. The evaluation is performed on identifying cognates
between Bulgarian and Russian, but the proposed methods could be adopted
for other language pairs for which parallel corpora and bilingual glossaries
are available.