Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization

Author: Torres-Moreno Juan-Manuel
Publication date: 14/09/2012

In Automatic Text Summarization, preprocessing is an important phase to reduce the space of textual representation. Classically, stemming and lemmatization have been widely used for normalizing words. However, even when normalization is applied to large texts, the curse of dimensionality can degrade the performance of summarizers. This paper describes a new method for normalization of words to further reduce the space of representation. We propose to reduce each word to its initial letters, as a form of Ultra-stemming. The results show that Ultra-stemming not only preserves the content of summaries produced by this representation, but can often dramatically improve the performance of the systems. Summaries on trilingual corpora were evaluated automatically with Fresa. Results confirm an increase in performance, regardless of the summarizer system used.
Comment: 22 pages, 12 figures, 9 tables
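The abstract above describes Ultra-stemming as reducing each word to its initial letters. A minimal sketch of that idea follows; it is an illustration under stated assumptions, not the paper's implementation. The function names (`ultra_stem`, `ultra_stem_text`), the prefix-length parameter `n`, the lowercasing step, and the simple regex tokenizer are all hypothetical choices made here.

```python
import re
from typing import List


def ultra_stem(word: str, n: int = 1) -> str:
    """Reduce a word to its first n letters (lowercased).

    Assumption: lowercasing and a fixed prefix length n are
    illustrative choices, not details taken from the paper.
    """
    return word.lower()[:n]


def ultra_stem_text(text: str, n: int = 1) -> List[str]:
    """Tokenize on runs of letters and ultra-stem every token."""
    # [^\W\d_]+ matches runs of Unicode letters only
    # (word characters minus digits and underscore).
    tokens = re.findall(r"[^\W\d_]+", text)
    return [ultra_stem(token, n) for token in tokens]


if __name__ == "__main__":
    sample = "Summaries summarize; summarizers keep summarizing."
    full_vocab = {t.lower() for t in re.findall(r"[^\W\d_]+", sample)}
    stemmed_vocab = set(ultra_stem_text(sample, n=1))
    # The truncated vocabulary is never larger than the original one,
    # which is the dimensionality reduction the abstract refers to.
    print(len(full_vocab), len(stemmed_vocab))
```

Shorter prefixes merge more word forms into one symbol, so the representation space shrinks as `n` decreases, at the cost of also merging unrelated words that share initial letters.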

arXiv.org e-Print Archive

CiteSeerX

The Man Who Mistook His Neuropsychologist For a Popstar: When Configural Processing Fails in Acquired Prosopagnosia

Author: Cobb S
Hanley JR
Jansari A
Miller S
Pearce L
Sagiv N
Tree J
Williams AL
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2015

We report the case of an individual with acquired prosopagnosia who experiences extreme difficulties in recognizing familiar faces in everyday life despite excellent object recognition skills. Formal testing indicates that he is also severely impaired at remembering pre-experimentally unfamiliar faces and that he takes an extremely long time to identify famous faces and to match unfamiliar faces. Nevertheless, he performs as accurately and quickly as controls at identifying inverted familiar and unfamiliar faces and can recognize famous faces from their external features. He also performs as accurately as controls at recognizing famous faces when fracturing conceals the configural information in the face. He shows evidence of impaired global processing but normal local processing of Navon figures. This case appears to reflect the clearest example yet of an acquired prosopagnosic patient whose familiar face recognition deficit is caused by a severe configural processing deficit in the absence of any problems in featural processing. These preserved featural skills together with apparently intact visual imagery for faces allow him to identify a surprisingly large number of famous faces when unlimited time is available. The theoretical implications of this pattern of performance for understanding the nature of acquired prosopagnosia are discussed.

Brunel University Research Archive

Consecutive retrieval with redundancy: an optimal linear and an optimal cyclic arrangement and their storage space requirements

Author: Hoopen J. ten
Publication venue: Elsevier
Publication date: 01/01/1980

Information retrieval, file organization, consecutive retrieval property, consecutive retrieval with redundancy, storage space requirements

CiteSeerX

University of Twente Research Information

Quantum cryptography: key distribution and beyond

Author: H. Akshata Shenoy
Pathak Anirban
Srikanth R.
Publication venue: 'Quanta'
Publication date: 15/02/2018

Uniquely among the sciences, quantum cryptography has driven both foundational research and practical real-life applications. We review the progress of quantum cryptography in the last decade, covering quantum key distribution and other applications.
Comment: It's a review on quantum cryptography and it is not restricted to QK

arXiv.org e-Print Archive

Handwriting styles: benchmarks and evaluation metrics

Author: Bailly Gerard
Mohammed Omar
Pellier Damien
Publication date: 04/09/2018

Evaluating the style of handwriting generation is a challenging problem, since it is not well defined. It is a key component in developing systems that offer more personalized experiences with humans. In this paper, we propose baseline benchmarks, in order to set anchors to estimate the relative quality of different handwriting style methods. This will be done using deep learning techniques, which have shown remarkable results in different machine learning tasks: classification, regression, and, most relevant to our work, generating temporal sequences. We discuss the challenges associated with evaluating our methods, which are related to the evaluation of generative models in general. We then propose evaluation metrics, which we find relevant to this problem, and we discuss how we evaluate the evaluation metrics. In this study, we use the IRON-OFF dataset. To the best of our knowledge, no prior work has been done on generating handwriting (either in terms of methodology or performance metrics) or on exploring styles using this dataset.
Comment: Submitted to IEEE International Workshop on Deep and Transfer Learning (DTL 2018)

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

A magnetic stimulation examination of orthographic neighborhood effects in visual word recognition

Author: Lavidor M.
Walsh V.
Publication date: 01/04/2003

The split-fovea theory proposes that visual word recognition is mediated by the splitting of the foveal image, with letters to the left of fixation projected to the right hemisphere (RH) and letters to the right of fixation projected to the left hemisphere (LH). We applied repetitive transcranial magnetic stimulation (rTMS) over the left and right occipital cortex during a lexical decision task to investigate the extent to which word recognition processes could be accounted for by the split-fovea theory. Unilateral rTMS significantly impaired lexical decision latencies to centrally presented words, supporting the suggestion that foveal representation of words is split between the cerebral hemispheres rather than bilateral. Behaviorally, we showed that words that have many orthographic neighbors sharing the same initial letters ("lead neighbors") facilitated lexical decision more than words with few lead neighbors. This effect did not apply to end neighbors (orthographic neighbors sharing the same final letters). Crucially, rTMS over the RH impaired lead-, but not end-neighborhood facilitation. The results support the split-fovea theory, where the RH has primacy in representing lead neighbors of a written word.

UCL Discovery