47 research outputs found

    A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks

    Deep LSTM is an ideal candidate for text recognition. However, text recognition involves initial image-processing steps, such as segmentation of lines and words, which can introduce errors into the recognition system. Without segmentation, learning very long-range context is difficult and becomes computationally intractable. Therefore, alternative soft decisions are needed at the pre-processing level. This paper proposes a hybrid text recognizer that uses a deep recurrent neural network with multiple layers of abstraction and long-range context, along with a language model to verify the performance of the deep neural network. We construct a multi-hypothesis tree architecture whose branches hold candidate segments of line sequences produced by different segmentation algorithms. The deep neural network is trained on perfectly segmented data and is tested on each of the candidate segments, generating Unicode sequences. In the verification step, these Unicode sequences are validated using a sub-string match with the language model, and best-first search is used to find the best possible combination of alternative hypotheses from the tree structure. Thus the verification framework using language models eliminates wrong segmentation outputs and filters recognition errors.
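
The verification step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the trigram-based sub-string score, and the use of a plain reference string in place of a real language model are all assumptions made for illustration. Each line has several candidate transcriptions (one per segmentation algorithm), and a best-first search with an additive penalty picks the combination that matches the reference text best.

```python
import heapq
from itertools import count


def substring_score(text, lexicon, n=3):
    """Fraction of length-n substrings of `text` found in `lexicon`.

    `lexicon` is a plain string standing in for the language model
    (an assumption of this sketch).
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    return sum(g in lexicon for g in grams) / len(grams)


def best_first(hypotheses, lexicon, n=3):
    """Pick one candidate per line so that the total penalty is minimal.

    hypotheses[i] lists the recognised strings for line i, one per
    (hypothetical) segmentation algorithm. The penalty of a candidate
    is 1 minus its sub-string score.
    """
    tie = count()  # tie-breaker so the heap never compares lists
    heap = [(0.0, next(tie), [])]
    while heap:
        cost, _, chosen = heapq.heappop(heap)
        if len(chosen) == len(hypotheses):
            return chosen, cost  # first complete path popped is the cheapest
        for cand in hypotheses[len(chosen)]:
            penalty = 1.0 - substring_score(cand, lexicon, n)
            heapq.heappush(heap, (cost + penalty, next(tie), chosen + [cand]))
    return [], 0.0  # some line had no candidates
```

Because the penalties here are non-negative and additive, the first complete path taken from the heap has the lowest total penalty. A real system would replace `substring_score` with a score from an actual language model.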

    Character-Based Handwritten Text Recognition of Multilingual Documents

    An effective approach to transcribing handwritten text documents is to follow a sequential interactive approach. During the supervision phase, user corrections are incorporated into the system through an ongoing retraining process. In the case of multilingual documents with a high percentage of out-of-vocabulary (OOV) words, two principal issues arise. On the one hand, a minor yet important matter for this interactive approach is to identify the language of the current text line image to be transcribed, as a language-dependent recogniser typically performs better than a monolingual recogniser. On the other hand, word-based language models suffer from data scarcity in the presence of a large number of OOV words, which degrades their estimation and affects the performance of the transcription system. In this paper, we successfully tackle both issues by deploying character-based language models combined with language identification techniques on an entire 764-page multilingual document. The results significantly reduce the transcription error previously reported on the same task, but show that a language-dependent approach is not effective on top of character-based recognition of similar languages.
    The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 287755. Also supported by the Spanish Government (MIPRCV "Consolider Ingenio 2010", iTrans2 TIN2009-14511, MITTRAL TIN2009-14633-C03-01 and FPU AP2007-0286) and the Generalitat Valenciana (Prometeo/2009/014).
    Del Agua Teba, MA.; Serrano Martinez Santos, N.; Civera Saiz, J.; Juan Císcar, A. (2012). Character-Based Handwritten Text Recognition of Multilingual Documents. Communications in Computer and Information Science. 328:187-196.
    https://doi.org/10.1007/978-3-642-35292-8_20
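
To make the idea of character-based language models and language identification in the entry above more concrete, here is a small sketch. It is not the system described in the paper: the class `CharNgramLM`, the add-one smoothing, the `^` padding symbol, and the toy training strings are all assumptions chosen for brevity. It shows that a character n-gram model still assigns a finite probability to a word it has never seen, and that the language of a line can be guessed by picking the model with the highest log-probability.

```python
import math
from collections import Counter


class CharNgramLM:
    """Character n-gram model with add-one smoothing (illustrative only)."""

    def __init__(self, text, n=3):
        self.n = n
        padded = "^" * (n - 1) + text  # "^" marks the start of the text
        spans = range(len(padded) - n + 1)
        self.ngrams = Counter(padded[i:i + n] for i in spans)
        self.contexts = Counter(padded[i:i + n - 1] for i in spans)
        self.vocab = set(padded)

    def log_prob(self, text):
        """Natural-log probability of `text` under the model."""
        padded = "^" * (self.n - 1) + text
        v = len(self.vocab) + 1  # +1 reserves mass for unseen characters
        total = 0.0
        for i in range(len(padded) - self.n + 1):
            gram = padded[i:i + self.n]
            total += math.log(
                (self.ngrams[gram] + 1) / (self.contexts[gram[:-1]] + v)
            )
        return total


def identify_language(line, models):
    """Return the key of the model that scores `line` highest."""
    return max(models, key=lambda lang: models[lang].log_prob(line))
```

For example, with one model trained on a short English sentence and another on a short Spanish sentence, `identify_language` can be called on each text line before choosing which recogniser to apply. A word-based model would give an OOV word zero probability unless it were smoothed separately, whereas the character model degrades gracefully.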

    Human or Computer Assisted Interactive Transcription: Automated Text Recognition, Text Annotation, and Scholarly Edition in the Twenty-First Century

    Computer-assisted transcription tools can speed up the initial process of reading and transcribing texts. At the same time, new annotation tools open new ways of accessing the text in its graphical form. The balance and value of each method still need to be explored. STATE, a complete assisted transcription system for ancient documents, was presented to the audience of the 2013 International Medieval Congress at Leeds. The system offers a multimodal interaction environment to assist humans in transcribing ancient documents: the user can type, write on the screen with a stylus, or utter a word. When one of these actions is used to correct an erroneous word, the system uses this new information to look for other mistakes in the rest of the line. The system is modular and composed of three parts: one creates projects from a set of images of documents, another controls an automatic transcription system, and the third allows the user to interact with the transcriptions and easily correct them as needed. This division of labour allows great flexibility for organising the work in a team of transcribers.

    Human Transcription or Computer-Assisted Interactive Assistance: Automatic Recognition, Annotation, and Scholarly Editing in the Twenty-First Century

    Computer-assisted transcription tools can speed up the initial process of reading and transcribing texts. At the same time, new annotation tools open new ways of accessing the text in its graphical form. The balance and value of each method still need to be explored. STATE, a complete assisted transcription system for ancient documents, was presented to the audience of the 2013 International Medieval Congress at Leeds. The system offers a multimodal interaction environment to assist humans in transcribing ancient documents: the user can type, write on the screen with a stylus, or utter a word. When one of these actions is used to correct an erroneous word, the system uses this new information to look for other mistakes in the rest of the line. The system is modular and composed of three parts: one creates projects from a set of images of documents, another controls an automatic transcription system, and the third allows the user to interact with the transcriptions and easily correct them as needed. This division of labour allows great flexibility for organising the work in a team of transcribers.
    Work supported by the Spanish Government (TIN2010-18958) and the Generalitat Valenciana (Prometeo/2010/028).