44,657 research outputs found
Text Line Segmentation of Historical Documents: a Survey
There is a huge amount of historical documents in libraries and in various
National Archives that have not been exploited electronically. Although
automatic reading of complete pages remains, in most cases, a long-term
objective, tasks such as word spotting, text/image alignment, authentication
and extraction of specific fields are in use today. For all these tasks, a
major step is document segmentation into text lines. Because of the low quality
and the complexity of these documents (background noise, artifacts due to
aging, interfering lines),automatic text line segmentation remains an open
research field. The objective of this paper is to present a survey of existing
methods, developed during the last decade, and dedicated to documents of
historical interest.Comment: 25 pages, submitted version, To appear in International Journal on
Document Analysis and Recognition, On line version available at
http://www.springerlink.com/content/k2813176280456k3
Transcribing a 17th-century botanical manuscript: Longitudinal evaluation of document layout detection and interactive transcription
[EN] We present a process for cost-effective transcription of cursive handwritten text
images that has been tested on a 1,000-page 17th-century book about botanical
species. The process comprised two main tasks, namely: (1) preprocessing: page
layout analysis, text line detection, and extraction; and (2) transcription of the
extracted text line images. Both tasks were carried out with semiautomatic pro-
cedures, aimed at incrementally minimizing user correction effort, by means of
computer-assisted line detection and interactive handwritten text recognition
technologies. The contribution derived from this work is three-fold. First, we
provide a detailed human-supervised transcription of a relatively large historical
handwritten book, ready to be searchable, indexable, and accessible to cultural
heritage scholars as well as the general public. Second, we have conducted the
first longitudinal study to date on interactive handwriting text recognition, for
which we provide a very comprehensive user assessment of the real-world per-
formance of the technologies involved in this work. Third, as a result of this
process, we have produced a detailed transcription and document layout infor-
mation (i.e. high-quality labeled data) ready to be used by researchers working on
automated technologies for document analysis and recognition.This work is supported by the European Commission through the EU projects HIMANIS (JPICH program, Spanish, grant Ref. PCIN-2015-068) and READ (Horizon-2020 program, grant Ref. 674943); and the Universitat Politecnica de Valencia (grant number SP20130189). This work was also part of the Valorization and I+D+i Resources program of VLC/CAMPUS and has been funded by the Spanish MECD as part of the International Excellence Campus program.Toselli, AH.; Leiva, LA.; Bordes-Cabrera, I.; Hernández-Tornero, C.; Bosch Campos, V.; Vidal, E. (2018). Transcribing a 17th-century botanical manuscript: Longitudinal evaluation of document layout detection and interactive transcription. Digital Scholarship in the Humanities. 33(1):173-202. https://doi.org/10.1093/llc/fqw064S173202331Bazzi, I., Schwartz, R., & Makhoul, J. (1999). An omnifont open-vocabulary OCR system for English and Arabic. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(6), 495-504. doi:10.1109/34.771314Causer, T., Tonra, J., & Wallace, V. (2012). Transcription maximized; expense minimized? Crowdsourcing and editing The Collected Works of Jeremy Bentham*. Literary and Linguistic Computing, 27(2), 119-137. doi:10.1093/llc/fqs004Ramel, J. Y., Leriche, S., Demonet, M. L., & Busson, S. (2007). User-driven page layout analysis of historical printed books. International Journal of Document Analysis and Recognition (IJDAR), 9(2-4), 243-261. doi:10.1007/s10032-007-0040-6Romero, V., Fornés, A., Serrano, N., Sánchez, J. A., Toselli, A. H., Frinken, V., … Lladós, J. (2013). The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition. Pattern Recognition, 46(6), 1658-1669. doi:10.1016/j.patcog.2012.11.024Romero, V., Toselli, A. H., & Vidal, E. (2012). Multimodal Interactive Handwritten Text Transcription. Series in Machine Perception and Artificial Intelligence. doi:10.1142/8394Toselli, A. H., Romero, V., Pastor, M., & Vidal, E. (2010). Multimodal interactive transcription of text images. Pattern Recognition, 43(5), 1814-1825. doi:10.1016/j.patcog.2009.11.019Toselli, A. H., Vidal, E., Romero, V., & Frinken, V. (2016). HMM word graph based keyword spotting in handwritten document images. Information Sciences, 370-371, 497-518. doi:10.1016/j.ins.2016.07.063Bunke, H., Bengio, S., & Vinciarelli, A. (2004). Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6), 709-720. doi:10.1109/tpami.2004.1
Recognizing Degraded Handwritten Characters
In this paper, Slavonic manuscripts from the 11th
century written in Glagolitic script are
investigated. State-of-the-art optical character recognition methods produce poor results
for degraded handwritten document images. This is largely due to a lack of suitable
results from basic pre-processing steps such as binarization and image segmentation.
Therefore, a new, binarization-free approach will be presented that is independent of
pre-processing deficiencies. It additionally incorporates local information in order to
recognize also fragmented or faded characters. The proposed algorithm consists of
two steps: character classification and character localization. Firstly scale invariant
feature transform features are extracted and classified using support vector machines.
On this basis interest points are clustered according to their spatial information. Then,
characters are localized and eventually recognized by a weighted voting scheme of
pre-classified local descriptors. Preliminary results show that the proposed system can
handle highly degraded manuscript images with background noise, e.g. stains, tears,
and faded characters
On-the-fly Historical Handwritten Text Annotation
The performance of information retrieval algorithms depends upon the
availability of ground truth labels annotated by experts. This is an important
prerequisite, and difficulties arise when the annotated ground truth labels are
incorrect or incomplete due to high levels of degradation. To address this
problem, this paper presents a simple method to perform on-the-fly annotation
of degraded historical handwritten text in ancient manuscripts. The proposed
method aims at quick generation of ground truth and correction of inaccurate
annotations such that the bounding box perfectly encapsulates the word, and
contains no added noise from the background or surroundings. This method will
potentially be of help to historians and researchers in generating and
correcting word labels in a document dynamically. The effectiveness of the
annotation method is empirically evaluated on an archival manuscript collection
from well-known publicly available datasets
Image and interpretation using artificial intelligence to read ancient Roman texts
The ink and stylus tablets discovered at the Roman Fort of Vindolanda are a unique resource for scholars of ancient history. However, the stylus tablets have proved particularly difficult to read. This paper describes a system that assists expert papyrologists in the interpretation of the Vindolanda writing tablets. A model-based approach is taken that relies on models of the written form of characters, and statistical modelling of language, to produce plausible interpretations of the documents. Fusion of the contributions from the language, character, and image feature models is achieved by utilizing the GRAVA agent architecture that uses Minimum Description Length as the basis for information fusion across semantic levels. A system is developed that reads in image data and outputs plausible interpretations of the Vindolanda tablets
Exploring manuscripts: sharing ancient wisdoms across the semantic web
Recent work in digital humanities has seen researchers in-creasingly producing online editions of texts and manuscripts, particularly in adoption of the TEI XML format for online publishing. The benefits of semantic web techniques are un-derexplored in such research, however, with a lack of sharing and communication of research information. The Sharing Ancient Wisdoms (SAWS) project applies linked data prac-tices to enhance and expand on what is possible with these digital text editions. Focussing on Greek and Arabic col-lections of ancient wise sayings, which are often related to each other, we use RDF to annotate and extract seman-tic information from the TEI documents as RDF triples. This allows researchers to explore the conceptual networks that arise from these interconnected sayings. The SAWS project advocates a semantic-web-based methodology, en-hancing rather than replacing current workflow processes, for digital humanities researchers to share their findings and collectively benefit from each other’s work
A Multiple-Expert Binarization Framework for Multispectral Images
In this work, a multiple-expert binarization framework for multispectral
images is proposed. The framework is based on a constrained subspace selection
limited to the spectral bands combined with state-of-the-art gray-level
binarization methods. The framework uses a binarization wrapper to enhance the
performance of the gray-level binarization. Nonlinear preprocessing of the
individual spectral bands is used to enhance the textual information. An
evolutionary optimizer is considered to obtain the optimal and some suboptimal
3-band subspaces from which an ensemble of experts is then formed. The
framework is applied to a ground truth multispectral dataset with promising
results. In addition, a generalization to the cross-validation approach is
developed that not only evaluates generalizability of the framework, it also
provides a practical instance of the selected experts that could be then
applied to unseen inputs despite the small size of the given ground truth
dataset.Comment: 12 pages, 8 figures, 6 tables. Presented at ICDAR'1
- …