    Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization

    In Automatic Text Summarization, preprocessing is an important phase for reducing the space of textual representation. Classically, stemming and lemmatization have been widely used to normalize words. However, even when normalization is applied to large texts, the curse of dimensionality can degrade the performance of summarizers. This paper describes a new word-normalization method that further reduces the space of representation. We propose to reduce each word to its initial letters, as a form of Ultra-stemming. The results show that Ultra-stemming not only preserves the content of the summaries produced with this representation, but often dramatically improves the performance of the systems. Summaries on trilingual corpora were evaluated automatically with Fresa. Results confirm an increase in performance, regardless of the summarizer system used. Comment: 22 pages, 12 figures, 9 table
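
    The normalization described in this abstract, reducing each word to its initial letters, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the regex tokenizer, the lowercasing step, and the prefix length `n` are all assumptions, since the abstract does not say how many letters are kept or how text is tokenized.

```python
import re
from typing import List


def ultra_stem(word: str, n: int = 1) -> str:
    """Truncate a word to its first n letters (hypothetical helper).

    Lowercasing is an assumption; the abstract only says words are
    reduced to their initial letters.
    """
    return word.lower()[:n]


def normalize(text: str, n: int = 1) -> List[str]:
    """Split text into alphabetic tokens and ultra-stem each one."""
    # [^\W\d_]+ matches runs of Unicode letters only (no digits, no underscore).
    tokens = re.findall(r"[^\W\d_]+", text)
    return [ultra_stem(token, n) for token in tokens]


def vocabulary_size(text: str, n: int) -> int:
    """Number of distinct ultra-stems, i.e. the size of the reduced space."""
    return len(set(normalize(text, n)))


if __name__ == "__main__":
    sample = "Stemming and lemmatization normalize words; summarizers summarize."
    for n in (1, 2, 4):
        print(n, normalize(sample, n), vocabulary_size(sample, n))
```

    Smaller values of `n` merge more words into the same representation, which shrinks the vocabulary at the cost of conflating unrelated words. Which value works best is an empirical question that this sketch does not answer.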

    Consecutive retrieval with redundancy: an optimal linear and an optimal cyclic arrangement and their storage space requirements

    Information retrieval, file organization, consecutive retrieval property, consecutive retrieval with redundancy, storage space requirements

    Quantum cryptography: key distribution and beyond

    Uniquely among the sciences, quantum cryptography has driven both foundational research and practical real-life applications. We review the progress of quantum cryptography in the last decade, covering quantum key distribution and other applications. Comment: It's a review on quantum cryptography and it is not restricted to QK

    Handwriting styles: benchmarks and evaluation metrics

    Evaluating the style of handwriting generation is a challenging problem, since it is not well defined. It is a key component in developing systems that offer more personalized experiences to humans. In this paper, we propose baseline benchmarks that set anchors for estimating the relative quality of different handwriting style methods. We do this using deep learning techniques, which have shown remarkable results in different machine learning tasks, including classification, regression, and, most relevant to our work, generating temporal sequences. We discuss the challenges associated with evaluating our methods, which are related to the evaluation of generative models in general. We then propose evaluation metrics that we find relevant to this problem, and we discuss how we evaluate those metrics. In this study, we use the IRON-OFF dataset. To the best of our knowledge, no prior work has been done on generating handwriting (either in terms of methodology or performance metrics) or on exploring styles using this dataset. Comment: Submitted to IEEE International Workshop on Deep and Transfer Learning (DTL 2018

    A magnetic stimulation examination of orthographic neighborhood effects in visual word recognition

    The split-fovea theory proposes that visual word recognition is mediated by the splitting of the foveal image, with letters to the left of fixation projected to the right hemisphere (RH) and letters to the right of fixation projected to the left hemisphere (LH). We applied repetitive transcranial magnetic stimulation (rTMS) over the left and right occipital cortex during a lexical decision task to investigate the extent to which word recognition processes could be accounted for by the split-fovea theory. Unilateral rTMS significantly impaired lexical decision latencies to centrally presented words, supporting the suggestion that foveal representation of words is split between the cerebral hemispheres rather than bilateral. Behaviorally, we showed that words with many orthographic neighbors sharing the same initial letters ("lead neighbors") facilitated lexical decision more than words with few lead neighbors. This effect did not apply to end neighbors (orthographic neighbors sharing the same final letters). Crucially, rTMS over the RH impaired lead-, but not end-neighborhood facilitation. The results support the split-fovea theory, where the RH has primacy in representing lead neighbors of a written word.