65 research outputs found

    HMM-based Offline Recognition of Handwritten Words Crossed Out with Different Kinds of Strokes

    In this work, we investigate the recognition of words that have been crossed out by their writers and are thus degraded. The degradation consists of one or more ink strokes that span the whole word length and simulate the marks that writers use to cross out words. The simulated strokes are superimposed on the original clean word images. We considered two types of strokes: wave-trajectory strokes created with spline curves and line-trajectory strokes generated with the delta-lognormal model of rapid line movements. The experiments were performed using a recognition system based on hidden Markov models, and the results show that the performance decrease is moderate for single-writer data and light strokes, but severe for multiple-writer data.

    Text Line Segmentation of Historical Documents: a Survey

    There is a huge number of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade, and dedicated to documents of historical interest. Comment: 25 pages, submitted version, to appear in International Journal on Document Analysis and Recognition; online version available at http://www.springerlink.com/content/k2813176280456k3

    Double-lambda microscopic model for entangled light generation by four-wave-mixing

    Motivated by recent experiments, we study four-wave mixing in an atomic double-Λ system driven by a far-detuned pump. Using the Heisenberg-Langevin formalism, and based on the microscopic properties of the medium, we calculate the classical and quantum properties of seed and conjugate beams beyond the linear amplifier approximation. A continuous variable approach gives us access to relative-intensity noise spectra that can be directly compared to experiments. Restricting ourselves to the cold-atom regime, we predict the generation of quantum-correlated beams with a relative-intensity noise spectrum well below the standard quantum limit (down to -6 dB). Moreover, entanglement between seed and conjugate beams measured by an inseparability down to 0.25 is expected. This work opens the way to the generation of entangled beams by four-wave mixing in a cold atomic sample. Comment: 11 pages, 6 figures, submitted to PR

    Gender identification through handwriting: An online approach

    The present study was designed to identify a writer's gender through online handwriting and drawing analysis. Two groups of participants, one of 126 males (mean age 24.65, SD=2.45) and the other of 114 females (mean age 24.51, SD=2.50), were recruited for the experiment. They were asked to perform seven writing and drawing tasks using a digitizing tablet and a special writing device. Seventeen writing features grouped into five categories were considered. The results show that the set of considered features makes it possible to discriminate between male and female writers by investigating their performance while copying a house drawing (task 2), writing words in capital letters (task 3) and writing a complete sentence in cursive letters (task 7), focusing in particular on the Ductus (number of strokes) and Time categories of writing features.

    Enriching Historical Manuscripts: The Bovary Project

    In this paper we describe the Bovary Project, a project to digitize the manuscripts of the first great work of the famous French writer Gustave FLAUBERT, which should end in 2006 by providing online access to a hypertextual edition of the set of "Madame Bovary" drafts. We first develop the global context of this project and its main objectives, and then focus particularly on the document analysis problem. Finally, we propose a new approach for the segmentation of handwritten documents.

    Photoionisation loading of large Sr+ ion clouds with ultrafast pulses

    This paper reports on photoionisation loading, based on ultrafast pulses, of singly-ionised strontium ions in a linear Paul trap. We take advantage of an autoionising resonance of neutral Sr atoms to form Sr+ by two-photon absorption of femtosecond pulses at a wavelength of 431 nm. We compare this technique to electron-bombardment ionisation and observe several advantages of photoionisation. In particular, it allows the loading of a pure Sr+ ion cloud in a low radio-frequency voltage amplitude regime. Under these conditions, up to 4×10^4 laser-cooled Sr+ ions were trapped.

    Interactive handwriting recognition with limited user effort

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10032-013-0204-5. Transcription of handwritten text in (old) documents is an important, time-consuming task for digital libraries. Although post-editing automatic recognition of handwritten text is feasible, it is not clearly better than simply ignoring it and transcribing the document from scratch. A more effective alternative is an interactive approach in which the system is guided by the user and the user is assisted by the system, so that the transcription task is completed as efficiently as possible. Nevertheless, in some applications, the user effort available to transcribe documents is limited and full supervision of the system output is not realistic. To circumvent these problems, we propose a novel interactive approach which efficiently employs user effort to transcribe a document by improving three different aspects. Firstly, the system employs a limited amount of effort to supervise only those recognised words that are likely to be incorrect. Thus, user effort is efficiently focused on the supervision of words for which the system is not confident enough. Secondly, it refines the initial transcription provided to the user by recomputing it constrained to user supervisions. In this way, incorrect words in unsupervised parts can be automatically amended without user supervision. Finally, it improves the underlying system models by retraining the system from partially supervised transcriptions. In order to prove these statements, empirical results are presented on two real databases showing that the proposed approach can notably reduce user effort in the transcription of handwritten text in (old) documents.
    The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under Grant Agreement No 287755 (transLectures). Also supported by the Spanish Government (MICINN, MITyC, "Plan E") under Grants MIPRCV "Consolider Ingenio 2010", MITTRAL (TIN2009-14633-C03-01), erudito.com (TSI-020110-2009-439), iTrans2 (TIN2009-14511), and FPU (AP2007-02867), and by the Generalitat Valenciana (Grants Prometeo/2009/014 and GV/2010/067).
    Serrano Martinez Santos, N.; Giménez Pastor, A.; Civera Saiz, J.; Sanchis Navarro, JA.; Juan Císcar, A. (2014). Interactive handwriting recognition with limited user effort. International Journal on Document Analysis and Recognition. 17(1):47-59. https://doi.org/10.1007/s10032-013-0204-5

    In2Se3
