A Survey of Paraphrasing and Textual Entailment Methods

Author: Androutsopoulos Ion
Malakasiotis Prodromos
Publication venue: 'AI Access Foundation'
Publication date: 30/05/2010

Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources. Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201
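The remark that paraphrasing can be seen as bidirectional textual entailment can be illustrated with a small sketch. The word-overlap test below is an assumption chosen for brevity, not a method taken from the survey, and the function names `tokens`, `entails`, and `is_paraphrase` as well as the `threshold` parameter are hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens (a deliberately crude representation)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def entails(text, hypothesis, threshold=0.8):
    """Toy entailment check (assumption): the text 'entails' the hypothesis
    when at least `threshold` of the hypothesis words also occur in the text."""
    hyp = tokens(hypothesis)
    if not hyp:
        return True  # an empty hypothesis is trivially covered
    coverage = len(hyp & tokens(text)) / len(hyp)
    return coverage >= threshold


def is_paraphrase(a, b, threshold=0.8):
    """Paraphrase as bidirectional entailment: a entails b and b entails a."""
    return entails(a, b, threshold) and entails(b, a, threshold)


if __name__ == "__main__":
    long_text = "A black cat sat on the mat in the kitchen"
    short_text = "A cat sat on the mat"
    print(entails(long_text, short_text))        # one direction holds
    print(is_paraphrase(long_text, short_text))  # the reverse direction fails
```

A real recognizer would replace `entails` with a trained or logic-based component; only the two-way check in `is_paraphrase` reflects the idea stated in the abstract.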

arXiv.org e-Print Archive

Crossref

Supporting collocation learning with a digital library

Author: Franken Margaret
Witten Ian H.
Wu Shaoqun
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2010

Extensive knowledge of collocations is a key factor that distinguishes learners from fluent native speakers. Such knowledge is difficult to acquire simply because there is so much of it. This paper describes a system that exploits the facilities offered by digital libraries to provide a rich collocation-learning environment. The design is based on three processes that have been identified as leading to lexical acquisition: noticing, retrieval and generation. Collocations are automatically identified in input documents using natural language processing techniques and used to enhance the presentation of the documents and also as the basis of exercises, produced under teacher control, that amplify students' collocation knowledge. The system uses a corpus of 1.3 B short phrases drawn from the web, from which 29 M collocations have been automatically identified. It also connects to examples garnered from the live web and the British National Corpus.
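The abstract does not say which technique identifies the collocations. As a rough illustration only, the sketch below scores adjacent word pairs by pointwise mutual information (PMI), a common association measure that is swapped in here as an assumption. The function name `pmi_collocations` and the `min_count` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def pmi_collocations(text, min_count=2):
    """Score adjacent word pairs by PMI and return them, highest score first.

    PMI is an assumption for illustration; the system described above may
    use a different identification method.
    """
    words = re.findall(r"[a-z]+", text.lower())
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    n_words = len(words)
    n_bigrams = max(len(words) - 1, 1)

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_words
        p_w2 = unigrams[w2] / n_words
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = "strong tea and strong tea and heavy rain and heavy rain"
    for pair, score in pmi_collocations(sample):
        print(pair, round(score, 2))
```

On this tiny sample the pairs "strong tea" and "heavy rain" rank above pairs involving "and", because "and" is frequent on its own and so co-occurs with its neighbours less surprisingly.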

Research Commons@Waikato

Automatic Description: A Novel Approach to Documenting Character Description for Consistency in Long-Form Prose

Author: Akulick Samantha
Publication venue: SOURCE: Sheridan Institutional Repository
Publication date: 01/12/2019

Currently, continuity editing for narrative fiction is performed manually. Many hours of human effort are required to comb through written works for inconsistencies. This study investigates the use of syntactic patterns of descriptions in narrative text and subject identification techniques like named entity recognition (NER) and coreference resolution in narrative text as a step toward automated continuity analysis. This investigation involved examining natural English language to identify patterns used in descriptions and using natural language processing (NLP) techniques to identify those patterns and sentence subjects programmatically. The approach was assessed on the content of well-known works of fiction, using two algorithms developed to identify sentence subjects and descriptions, with promising results. With the fragmented, iterative cycle of writing long-form prose and the limitations of human memory and reading speed, maintaining a clear and consistent image of a character's appearance and personality is a difficult task for human authors and editors to complete manually. The results of this research provide a starting point to automate and improve the process of writing and proofreading narrative works.

SOURCE: Sheridan Scholarly Output Undergraduate Research Creative Excellence