5 research outputs found

    A Web-Based Demo to Interactive Multimodal Transcription of Historic Text images

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-04346-8_58

    Paleography experts spend many hours transcribing historic documents, and state-of-the-art handwritten text recognition systems are not suitable for performing this task automatically. In this paper we present the modifications on a previously developed interactive framework for the transcription of handwritten text. Rather than full automation, this system aims at assisting the user with the recognition-transcription process.

    This work has been supported by the EC (FEDER), the Spanish MEC under grant TIN2006-15694-C02-01, the research programme Consolider Ingenio 2010 MIPRCV (CSD2007-00018), and the UPV (FPI fellowship 2006-04).

    Romero Gómez, V.; Leiva Torres, L.A.; Alabau Gonzalvo, V.; Toselli, A.H.; Vidal Ruiz, E. (2009). A Web-Based Demo to Interactive Multimodal Transcription of Historic Text Images. In Research and Advanced Technology for Digital Libraries: 13th European Conference, ECDL 2009, Corfu, Greece, September 27 - October 2, 2009. Proceedings. Springer Verlag (Germany). 459-460. https://doi.org/10.1007/978-3-642-04346-8_58

    Escritoire: A Multi-touch Desk with e-Pen Input for Capture, Management and Multimodal Interactive Transcription of Handwritten Documents

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19390-8_53

    A large quantity of the documents used every day are still handwritten. However, it is often desirable to transform each of these documents into its digital version for managing, archiving and sharing. Here we present Escritoire, a multi-touch desk that allows the user to capture, transcribe and work with handwritten documents. The desktop is continuously monitored using two cameras. Whenever the user makes a specific hand gesture over a paper, Escritoire proceeds to take an image. The capture is then automatically preprocessed, obtaining an improved representation as a result. Finally, the text image is transcribed using automatic techniques and the transcription is displayed on Escritoire.

    This work was partially supported by the Spanish MEC under an FPU scholarship (AP2010-0575), the STraDA research project (TIN2012-37475-C02-01) and the MITTRAL research project (TIN2009-14633-C03-01), and by the EU's 7th Framework Programme under the tranScriptorium grant agreement (FP7/2007-2013/600707).

    Martín-Albo Simón, D.; Romero Gómez, V.; Vidal Ruiz, E. (2015). Escritoire: A Multi-touch Desk with e-Pen Input for Capture, Management and Multimodal Interactive Transcription of Handwritten Documents. In Pattern Recognition and Image Analysis. Springer. 471-478. https://doi.org/10.1007/978-3-319-19390-8_53
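    The capture chain described in the Escritoire abstract (gesture-triggered capture, preprocessing, automatic transcription, display) can be sketched as a simple pipeline. All stage functions below are hypothetical placeholders, not the actual Escritoire implementation, which relies on two overhead cameras and an HTR engine.

```python
# Minimal sketch of the Escritoire processing chain: a hand gesture
# over a page triggers capture -> preprocess -> transcribe -> display.
# Every stage here is a hypothetical stand-in passed in by the caller.

def on_hand_gesture(frame, stages):
    """Run the full chain once a capture gesture has been detected."""
    image = stages["capture"](frame)        # photograph the page
    cleaned = stages["preprocess"](image)   # e.g. deskew, binarize, crop
    text = stages["transcribe"](cleaned)    # automatic HTR
    stages["display"](text)                 # show the result on the desk
    return text

# Usage with trivial stand-in stages:
log = []
stages = {
    "capture": lambda f: f + "|img",
    "preprocess": lambda im: im + "|clean",
    "transcribe": lambda im: "hello world",
    "display": log.append,
}
result = on_hand_gesture("frame0", stages)
```

    Keeping the stages as injectable callables mirrors the paper's separation between capture, preprocessing and recognition, and makes each step swappable in isolation.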

    Multimodal Interactive Transcription of Handwritten Text Images

    This thesis presents a new interactive and multimodal framework for the transcription of handwritten documents. Rather than providing the complete transcription automatically, this approach aims to assist the expert in the hard task of transcribing. To date, the available handwritten text recognition systems do not produce transcriptions that are acceptable to users, and human intervention is generally required to correct the transcriptions obtained. These systems have proven genuinely useful in restricted applications with limited vocabularies (such as the recognition of postal addresses or of numerical amounts on bank cheques), achieving acceptable results in such tasks. However, when working with unconstrained handwritten documents (such as old manuscripts or spontaneous text), current technology only achieves unacceptable results. The interactive scenario studied in this thesis allows a more effective solution. In this scenario, the recognition system and the user cooperate to generate the final transcription of the text image. The system uses the text image and a previously validated part of the transcription (the prefix) to propose a possible continuation. The user then finds and corrects the next error produced by the system, thereby generating a new, longer prefix, which the system uses to suggest a new hypothesis. The underlying technology is based on hidden Markov models and n-grams, used here in the same way as in automatic speech recognition. Some modifications to the conventional definition of n-grams have been necessary in order to take the user's feedback into account.

    Romero Gómez, V. (2010). Multimodal Interactive Transcription of Handwritten Text Images [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8541
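    The prefix-based interaction loop described in this abstract can be sketched as follows. This is a toy simulation, not the thesis system: `recognize` stands in for the real HMM/n-gram recognizer, and the "user" is simulated against a known reference transcription.

```python
# Toy sketch of the interactive prefix loop: the system proposes a
# transcription conditioned on the validated prefix; the (simulated)
# user corrects the first wrong word, extending the prefix by one token.

def interactive_transcription(recognize, reference):
    prefix = []
    corrections = 0
    while True:
        hyp = recognize(prefix)
        if hyp == reference:                  # user accepts the proposal
            return hyp, corrections
        i = len(prefix)                       # scan the unvalidated suffix
        while i < min(len(hyp), len(reference)) and hyp[i] == reference[i]:
            i += 1
        # the corrected word joins the validated prefix
        prefix = reference[:min(i + 1, len(reference))]
        corrections += 1

# Hypothetical error-prone recognizer for a toy 6-word sentence; it
# only has to honour the validated prefix it is given.
reference = "la carta fue escrita en 1802".split()

def toy_recognizer(prefix):
    mistakes = {1: "carla", 4: "e"}
    return list(prefix) + [mistakes.get(i, reference[i])
                           for i in range(len(prefix), len(reference))]

hyp, corrections = interactive_transcription(toy_recognizer, reference)
# two user corrections suffice instead of retyping the whole line
```

    The point of the protocol is that each correction also validates every word before it, so the effort per sentence is the number of recognition errors, not the sentence length.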

    Contextual and Assisted Interpretation of Digitized Archive Collections (Application to 18th-Century Sales Registers)

    Fonds, also called historical document collections, are large amounts of digitized documents which are difficult to interpret automatically: usual approaches require a lot of work during design, yet do not manage to avoid producing many errors which have to be corrected after processing. To cope with those limitations, our work aimed at improving the interpretation process by making use of information extracted from the fond, or provided by human operators, while keeping page-by-page processing. We proposed a targeted extension of the page description language which makes it possible to systematically generate information exchanges between the interpretation process and its environment. A global iterative mechanism progressively brings contextual information to this process, which improves interpretation. Experiments and the application of these new tools to the processing of documents from the 18th century showed that our propositions were easy to integrate into an existing system, that its design remains simple, and that the required manual corrections were reduced.