Language identification for interactive handwriting transcription of multilingual documents
An effective approach to handwriting transcription of (old) documents is to follow a sequential, line-by-line transcription of the whole document, in which a continuously retrained system interacts with the user. In the case of multilingual documents, however, a minor yet important issue for this interactive approach is to first identify the language of the current text line image to be transcribed. In this paper, we propose a probabilistic framework and three techniques for this purpose. Empirical results are reported on an entire 764-page multilingual document for which previous empirical tests were limited to its first 180 pages, written only in Spanish. © 2011 Springer-Verlag.
Work supported by the EC (FEDER, FSE), the Spanish Government (MICINN, MITyC, “Plan E”, under grants MIPRCV “Consolider Ingenio 2010”, MITTRAL TIN2009-14633-C03-01 and FPU AP2007-02867), the Generalitat Valenciana (grants Prometeo/2009/014 and ACOMP/2010/051) and the UPV (grant 20080033).
Del Agua Teba, MA.; Serrano Martinez Santos, N.; Juan Císcar, A. (2011). Language identification for interactive handwriting transcription of multilingual documents. In Pattern Recognition and Image Analysis. Springer Verlag (Germany). 6669:596-603. https://doi.org/10.1007/978-3-642-21257-4_74
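The abstract above does not spell out its probabilistic framework or its three techniques. As a rough illustration only, language identification for a text line image can be framed as a Bayes decision over candidate languages. The sketch below assumes that each language has its own recogniser returning a log-likelihood for the line, and that a prior over languages is available; the function name, language labels and scores are all hypothetical.

```python
import math


def identify_language(log_likelihoods, priors):
    """Return the language l that maximises log p(l) + log p(x | l).

    log_likelihoods: dict mapping language -> log p(x | l) for one line image x
                     (assumed to come from a language-dependent recogniser).
    priors:          dict mapping language -> p(l), all strictly positive.
    """
    return max(
        log_likelihoods,
        key=lambda lang: math.log(priors[lang]) + log_likelihoods[lang],
    )


# Hypothetical scores for a single text line under two recognisers.
scores = {"es": -1520.4, "la": -1523.0}
priors = {"es": 0.6, "la": 0.4}
best = identify_language(scores, priors)
```

With a uniform prior the decision reduces to picking the recogniser with the highest likelihood; a non-uniform prior lets previously transcribed lines bias the choice.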
Character-Based Handwritten Text Recognition of Multilingual Documents
An effective approach to transcribing handwritten text documents is to follow a sequential interactive approach. During the supervision phase, user corrections are incorporated into the system through an ongoing retraining process. In the case of multilingual documents with a high percentage of out-of-vocabulary (OOV) words, two principal issues arise. On the one hand, a minor yet important matter for this interactive approach is to identify the language of the current text line image to be transcribed, as a language-dependent recogniser typically performs better than a monolingual recogniser. On the other hand, word-based language models suffer from data scarcity in the presence of a large number of OOV words, which degrades their estimation and affects the performance of the transcription system. In this paper, we successfully tackle both issues by deploying character-based language models combined with language identification techniques on an entire 764-page multilingual document. The results obtained significantly improve on previously reported results in terms of transcription error on the same task, but show that a language-dependent approach is not effective on top of character-based recognition of similar languages.
The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 287755. Also supported by the Spanish Government (MIPRCV “Consolider Ingenio 2010”, iTrans2 TIN2009-14511, MITTRAL TIN2009-14633-C03-01 and FPU AP2007-0286) and the Generalitat Valenciana (Prometeo/2009/014).
Del Agua Teba, MA.; Serrano Martinez Santos, N.; Civera Saiz, J.; Juan Císcar, A. (2012). Character-Based Handwritten Text Recognition of Multilingual Documents. Communications in Computer and Information Science. 328:187-196. https://doi.org/10.1007/978-3-642-35292-8_20
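To illustrate why character-based language models cope with OOV words, the following minimal sketch trains an add-one smoothed character bigram model and scores a word that never appeared in training. It is not the paper's actual model or toolkit; the smoothing scheme, the function names and the toy training lines are assumptions made for the example.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # sequence boundary symbols (illustrative choice)


def train_char_bigram(lines):
    """Collect context counts, bigram counts and the symbol vocabulary."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), {EOS}
    for line in lines:
        symbols = [BOS] + list(line) + [EOS]
        vocab.update(symbols[1:])
        for prev, cur in zip(symbols, symbols[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def log_prob(text, model):
    """Add-one smoothed log-probability of a character sequence.

    Assumes the characters of `text` belong to the training alphabet;
    unseen characters still get a finite score, but the distribution is
    then no longer properly normalised.
    """
    context_counts, bigram_counts, vocab = model
    symbols = [BOS] + list(text) + [EOS]
    total = 0.0
    for prev, cur in zip(symbols, symbols[1:]):
        numerator = bigram_counts[(prev, cur)] + 1
        denominator = context_counts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total


model = train_char_bigram(["la casa", "la cosa"])
oov_score = log_prob("saca", model)  # word unseen in training, seen characters
```

A closed-vocabulary word model would assign zero probability to the unseen word, whereas the character model gives it a finite score built from character transitions it has already observed.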
Contributions to Adaptation on Automatic Speech Recognition and Multilingual Handwritten Text Recognition
In this work we have studied several adaptation methods for the automatic transcription of multilingual handwritten documents, and we have also proposed several language identification techniques. Regarding adaptation in speech recognition, a well-known speaker adaptation technique (MLLR) has been applied.
Finally, we present an application developed within the transLectures project, whose aim is to integrate an interactive automatic transcription system into a media player.
Del Agua Teba, MA. (2012). Contributions to Adaptation on Automatic Speech Recognition and Multilingual Handwritten Text Recognition. http://hdl.handle.net/10251/19115
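For readers unfamiliar with MLLR, its mean adaptation replaces each Gaussian mean mu of the acoustic model with an affine transform A·mu + b shared across many Gaussians. The sketch below only applies a given transform; estimating A and b from adaptation data is omitted, and the matrix, bias and means are made-up values rather than anything taken from the thesis.

```python
def mllr_adapt_mean(mean, A, b):
    """Return the adapted mean A @ mean + b for one Gaussian component."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


# Hypothetical 2-dimensional transform shared by all Gaussians of one class.
A = [[1.0, 0.1],
     [0.0, 0.9]]
b = [0.5, -0.2]

speaker_independent_means = [[0.0, 1.0], [2.0, -1.0]]
adapted_means = [mllr_adapt_mean(m, A, b) for m in speaker_independent_means]
```

Because one transform is shared by many Gaussians, a small amount of speaker data can shift the whole model, which is what makes this family of techniques attractive for adaptation.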
Interactive Transcription of Old Text Documents
Nowadays, there are huge collections of handwritten text documents in libraries all over the world. The high demand for these resources has led to the creation of digital libraries in order to facilitate their preservation and provide electronic access to these documents. However, text transcriptions of these document images are not always available to allow users to quickly search for information, or computers to process the information, search for patterns or draw out statistics. The problem is that manual transcription of these documents is an expensive task in terms of both cost and time. This thesis presents a novel approach for efficient Computer Assisted Transcription (CAT) of handwritten text documents using state-of-the-art Handwriting Text Recognition (HTR) systems.
The objective of CAT approaches is to efficiently complete a transcription task through human-machine collaboration, as the effort required to generate a manual transcription is high, and automatically generated transcriptions from state-of-the-art systems still do not reach the accuracy required. This thesis is centered on a special application of CAT, that is, the transcription of old text documents when the quantity of user effort available is limited, and thus the entire document cannot be revised. In this approach, the objective is to generate the best possible transcription by means of the user effort available. This thesis provides a comprehensive view of the CAT process from feature extraction to user interaction.
First, a statistical approach to generalise interactive transcription is proposed. As its direct application is unfeasible, some assumptions are made to apply it to two different tasks: first, the interactive transcription of handwritten text documents, and next, the interactive detection of the document layout.
Next, the digitisation and annotation process of two real old text documents is described. This process was carried out because of the scarcity of similar resources and the need for annotated data to thoroughly test all the tools and techniques developed in this thesis. These two documents were carefully selected to represent the general difficulties that are encountered when dealing with HTR. Baseline results are presented on these two documents to establish a benchmark with a standard HTR system. Finally, these annotated documents were made freely available to the community. It must be noted that all the techniques and methods developed in this thesis have been assessed on these two real old text documents.
Then, a CAT approach for HTR when user effort is limited is studied and extensively tested. The ultimate goal of applying CAT is achieved by putting together three processes, given a recognised transcription from an HTR system. The first process consists in locating (possibly) incorrect words and employs the user effort available to supervise them (if necessary). As most words are not expected to be supervised due to the limited user effort available, only a few are selected to be revised. The system presents to the user a small subset of these words according to an estimation of their correctness, or to be more precise, according to their confidence level. Next, the second process starts once these low-confidence words have been supervised. This process updates the recognition of the document taking user corrections into consideration, which improves the quality of those words that were not revised by the user. Finally, the last process adapts the system from the partially revised (and possibly not perfect) transcription obtained so far. In this adaptation, the system intelligently selects the correct words of the transcription. As a result, the adapted system will better recognise future transcriptions. Transcription experiments using this CAT approach show that it is effective mainly when user effort is low.
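The first of the three processes, choosing which recognised words to show the user under a limited effort budget, can be sketched as follows. This is a minimal illustration rather than the thesis's actual procedure: it assumes one confidence value in [0, 1] per recognised word and a budget expressed as a number of words, and the function name and sample data are hypothetical.

```python
def select_for_supervision(confidences, budget):
    """Return the indices of the `budget` least confident words.

    Indices are returned in reading order so that the user can revise
    the selected words as they appear in the line.
    """
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return sorted(ranked[:budget])


# Hypothetical recognised line with per-word confidence estimates.
words = ["en", "el", "nonbre", "de", "dios"]
confidences = [0.98, 0.95, 0.41, 0.99, 0.62]
to_supervise = select_for_supervision(confidences, budget=2)
selected_words = [words[i] for i in to_supervise]
```

Once the user has corrected the selected words, the second and third processes described above would re-recognise the line with those corrections fixed and retrain the system on the words judged correct.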
The last contribution of this thesis is a method for balancing the final transcription quality and the supervision effort applied using our previously described CAT approach. In other words, this method allows the user to control the amount of errors in the transcriptions obtained from a CAT approach. The motivation of this method is to let users decide on the final quality of the desired documents, as partially erroneous transcriptions can be sufficient to convey the meaning, and the user effort required to transcribe them might be significantly lower when compared to obtaining a totally manual transcription. Consequently, the system estimates the minimum user effort required to reach the amount of error defined by the user. Error estimation is performed by computing separately the error produced by each recognised word, and thus asking the user to revise only the ones in which most errors occur.
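One simple way to picture this balancing step is to treat 1 minus a word's confidence as its expected error, and to supervise words from least to most confident until the expected error rate of the remaining words falls below the user's threshold. The sketch below rests on that assumption, which the abstract does not confirm; it also assumes a supervised word ends up correct. All names and numbers are illustrative.

```python
def min_supervision(confidences, max_error_rate):
    """Return indices of words to supervise, in reading order.

    The expected residual error rate is estimated as the sum of
    (1 - confidence) over unsupervised words divided by the total number
    of words. Words are supervised from least to most confident until
    that estimate is no larger than max_error_rate.
    """
    n = len(confidences)
    order = sorted(range(n), key=lambda i: confidences[i])
    expected_errors = sum(1.0 - c for c in confidences)
    chosen = []
    for i in order:
        if expected_errors / n <= max_error_rate:
            break
        # A supervised word is assumed to be correct afterwards.
        expected_errors -= 1.0 - confidences[i]
        chosen.append(i)
    return sorted(chosen)


# Hypothetical per-word confidences for one recognised line.
confidences = [0.98, 0.95, 0.41, 0.99, 0.62]
needed_for_10_percent = min_supervision(confidences, max_error_rate=0.10)
needed_for_5_percent = min_supervision(confidences, max_error_rate=0.05)
```

Tightening the error threshold increases the number of words the user is asked to revise, which is the trade-off between quality and effort that the paragraph above describes.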
Additionally, an interactive prototype is presented, which integrates most of the interactive techniques presented in this thesis. This prototype has been developed to be used by palaeographic experts, who do not have any background in HTR technologies. After slight fine-tuning by an HTR expert, the prototype lets transcribers manually annotate the document or employ the CAT approach presented. All automatic operations, such as recognition, are performed in the background, detaching the transcriber from the details of the system. The prototype was assessed by an expert transcriber and was shown to be adequate and efficient for its purpose. The prototype is freely available under the GNU General Public License (GPL).
Serrano Martínez-Santos, N. (2014). Interactive Transcription of Old Text Documents [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37979