Persian Heritage Image Binarization Competition (PHIBC 2012)
The first competition on the binarization of historical Persian documents and
manuscripts (PHIBC 2012) was organized in conjunction with the first
Iranian conference on pattern recognition and image analysis (PRIA 2013). The
main objective of PHIBC 2012 is to evaluate the performance of binarization
methodologies when applied to Persian heritage images. This paper reports on
the methodology and performance of the three submitted algorithms, based on
the evaluation measures that were used.
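The evaluation measures referred to above typically include a pixel-level F-measure. As an illustration (not necessarily the exact measure set used in PHIBC 2012), a minimal F-measure computation between a predicted binary image and its ground truth might look like:

```python
import numpy as np

def binarization_f_measure(pred, gt):
    """F-measure between a predicted binary image and ground truth.

    Both arrays are boolean masks where True marks foreground (ink).
    This is a common measure in binarization contests such as DIBCO;
    the exact measures used in PHIBC 2012 may differ.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = np.logical_and(pred, gt).sum()   # correctly detected ink pixels
    fp = np.logical_and(pred, ~gt).sum()  # background marked as ink
    fn = np.logical_and(~pred, gt).sum()  # ink missed by the method
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```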
COCO_TS Dataset: Pixel-level Annotations Based on Weak Supervision for Scene Text Segmentation
The absence of large scale datasets with pixel-level supervisions is a
significant obstacle for the training of deep convolutional networks for scene
text segmentation. For this reason, synthetic data generation is normally
employed to enlarge the training dataset. Nonetheless, synthetic data cannot
reproduce the complexity and variability of natural images. In this paper, a
weakly supervised learning approach is used to reduce the shift between
training on real and synthetic data. Pixel-level supervisions for a text
detection dataset (i.e. where only bounding-box annotations are available) are
generated. In particular, the COCO-Text-Segmentation (COCO_TS) dataset, which
provides pixel-level supervisions for the COCO-Text dataset, is created and
released. The generated annotations are used to train a deep convolutional
neural network for semantic segmentation. Experiments show that the proposed
dataset can be used instead of synthetic data, allowing us to use only a
fraction of the training samples while significantly improving performance.
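As a rough illustration of how box-level annotations can be turned into pixel-level supervision, the sketch below simply labels every pixel inside a bounding box as text. The actual COCO_TS pipeline refines these regions with a weakly supervised segmentation network; the function name and box convention here are hypothetical:

```python
import numpy as np

def boxes_to_supervision(shape, boxes):
    """Coarse pixel-level supervision from bounding boxes.

    Simplified stand-in for the COCO_TS generation procedure: pixels
    inside any box are labelled text (1), all others background (0).
    The real dataset refines the box interiors with a segmentation
    network rather than marking every box pixel as text.

    shape: (height, width); boxes: iterable of (y0, x0, y1, x1),
    with end coordinates exclusive.
    """
    mask = np.zeros(shape, dtype=np.uint8)
    for y0, x0, y1, x1 in boxes:
        mask[y0:y1, x0:x1] = 1
    return mask
```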
Automatic Document Image Binarization using Bayesian Optimization
Document image binarization is often a challenging task due to various forms
of degradation. Although several binarization techniques exist in the
literature, the binarized image is typically sensitive to the control
parameter settings of the employed technique. This paper presents an automatic document
image binarization algorithm to segment the text from heavily degraded document
images. The proposed technique uses a two band-pass filtering approach for
background noise removal, and Bayesian optimization for automatic
hyperparameter selection for optimal results. The effectiveness of the proposed
binarization technique is empirically demonstrated on the Document Image
Binarization Competition (DIBCO) and the Handwritten Document Image
Binarization Competition (H-DIBCO) datasets.
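The paper's exact objective function and parameterisation are not given here, but the general idea of Bayesian hyperparameter selection can be sketched as a 1-D Gaussian-process loop with an expected-improvement acquisition. Everything below (function names, kernel lengthscale, grid size) is an illustrative assumption, not the authors' algorithm; `score` would be a binarization-quality objective over one control parameter:

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel between 1-D point sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def expected_improvement(mu, sigma, best):
    """EI acquisition for maximisation of the objective."""
    z = np.where(sigma > 0, (mu - best) / np.maximum(sigma, 1e-12), 0.0)
    cdf = 0.5 * (1 + np.vectorize(erf)(z / sqrt(2)))
    pdf = np.exp(-0.5 * z ** 2) / sqrt(2 * pi)
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(score, bounds=(0.0, 1.0), n_init=4, n_iter=12, seed=0):
    """Toy 1-D Bayesian optimisation of a black-box score function."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(*bounds, size=n_init)
    y = np.array([score(x) for x in X])
    grid = np.linspace(*bounds, 200)        # candidate parameter values
    for _ in range(n_iter):
        K = rbf(X, X) + 1e-6 * np.eye(len(X))
        Ks = rbf(grid, X)
        mu = Ks @ np.linalg.solve(K, y)     # GP posterior mean
        v = np.linalg.solve(K, Ks.T)
        var = np.clip(1.0 - np.sum(Ks * v.T, axis=1), 0, None)
        ei = expected_improvement(mu, np.sqrt(var), y.max())
        x_next = grid[np.argmax(ei)]        # most promising candidate
        X = np.append(X, x_next)
        y = np.append(y, score(x_next))
    return X[np.argmax(y)]                  # best parameter observed
```

In practice the score would wrap the full binarization pipeline (e.g. evaluating F-measure on a validation image), and libraries such as scikit-optimize provide production-quality versions of this loop.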
ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset
© 2016 IEEE. This paper describes the Handwritten Text Recognition (HTR) competition on the READ dataset, held in the context of the International Conference on Frontiers in Handwriting Recognition 2016. This competition
aims to bring together researchers working on off-line HTR and provide them with a suitable benchmark to compare their techniques on the task of transcribing typical historical handwritten documents. Two tracks with different conditions on
the use of training data were proposed. Ten research groups registered for the competition, but only five submitted results. The handwritten images for this competition were drawn from the German Ratsprotokolle collection, composed of minutes of council meetings held from 1470 to 1805 and used
in the READ project. The selected dataset was written by several hands and entails significant variability and difficulty. The five participants achieved good results, with word error rates ranging from 21% to 47% and character error rates ranging from 5% to 19%. This work has been partially supported through the European Union's H2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943) and the MINECO/FEDER UE project TIN2015-70924-C2-1-R. Sánchez Peiró, JA.; Romero Gómez, V.; Toselli, AH.; Vidal, E. (2016). ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset. IEEE. https://doi.org/10.1109/ICFHR.2016.0120
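The character error rates quoted above are conventionally computed as Levenshtein edit distance divided by the reference length; word error rate follows by passing token lists instead of strings. A minimal sketch:

```python
def character_error_rate(reference, hypothesis):
    """Character error rate: Levenshtein edit distance divided by the
    reference length, the usual scoring for HTR competition entries.
    Works on any sequence, so word lists give word error rate."""
    m, n = len(reference), len(hypothesis)
    prev = list(range(n + 1))           # edit distances for empty prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution / match
        prev = cur
    return prev[n] / max(m, 1)
```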
An Efficient Phase-Based Binarization Method for Degraded Historical Documents
Document image binarization is the first essential step in digitizing document images and a core technique in both document image analysis applications and optical character recognition. The binarization process produces a binary image from the original image; the binary image is the proper representation for segmentation, recognition, and restoration, and several studies underline that the subsequent steps of document image analysis depend on the binarization result. However, old and historical document images typically suffer from several types of degradation, such as bleed-through, blur, uneven illumination, and others, which make binarization a difficult task. Extracting the foreground from a degraded background therefore depends on the degradation, as well as on the type of paper used and the age of the document, so binarization methods must be developed to reduce the impact of background degradation. To address this difficulty, this paper proposes an effective, enhanced binarization technique for degraded and historical document images. The proposed method enhances an existing binarization method by modifying its parameters and adding a post-processing stage, thus improving the resulting binary images. The proposed technique is also robust, as no parameter tuning is needed. Evaluated on the Document Image Binarization Contest (DIBCO) datasets, the proposed method is promising, producing better results than those obtained by some of the DIBCO winners.
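The specific phase-based enhancement and its parameters are not detailed here. As a generic sketch of the "binarize, then post-process" idea only, the hypothetical function below applies a global threshold and then removes isolated foreground pixels; the real method's post-processing stage may do something quite different:

```python
import numpy as np

def binarize_with_cleanup(image, threshold=128):
    """Global threshold followed by a simple post-processing pass.

    Illustrative sketch, not the paper's phase-based method: dark
    pixels become foreground, then any foreground pixel with no
    8-connected foreground neighbour is dropped as speckle noise.
    """
    fg = np.asarray(image) < threshold          # dark pixels are ink
    padded = np.pad(fg, 1, mode="constant")
    neighbours = np.zeros(fg.shape, dtype=int)
    for dy in (-1, 0, 1):                       # count 8-neighbours
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbours += padded[1 + dy: 1 + dy + fg.shape[0],
                                 1 + dx: 1 + dx + fg.shape[1]]
    return fg & (neighbours > 0)                # keep supported pixels
```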
Scene text segmentation based on thresholding
This research deals with the problem of text segmentation in scene images. The introduction covers the information contained in an image and the different properties that are useful for image segmentation. After that, the process of extracting textual information is explained step by step. Furthermore, the problem of scene text segmentation is described more precisely and an overview of the most popular existing methods is given.
A text segmentation method is designed and implemented in the C++ programming language with the OpenCV library. Finally, the algorithm is evaluated on images from the ICDAR 2013 test dataset.
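A standard baseline for such thresholding-based segmentation is Otsu's method, which picks the grey level maximising between-class variance (OpenCV exposes the same idea via `cv::threshold` with `THRESH_OTSU`). A self-contained sketch in Python rather than the thesis's C++, and not its exact pipeline:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's global threshold over an 8-bit greyscale image:
    choose the level maximising between-class variance of the
    resulting foreground/background split."""
    hist = np.bincount(np.asarray(gray, dtype=np.uint8).ravel(),
                       minlength=256)
    probs = hist / hist.sum()
    omega = np.cumsum(probs)                 # class-0 probability mass
    mu = np.cumsum(probs * np.arange(256))   # class-0 cumulative mean
    mu_t = mu[-1]                            # global mean
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan               # skip degenerate splits
    sigma_b = (mu_t * omega - mu) ** 2 / denom
    return int(np.nanargmax(sigma_b))        # best threshold level
```

Pixels above the returned level would be taken as one class (e.g. background) and the rest as text, or vice versa depending on polarity.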
A selectional auto-encoder approach for document image binarization
Binarization plays a key role in automatic information retrieval from document images. This process is usually performed in the first stages of document analysis systems and serves as a basis for subsequent steps, so it has to be robust in order for the full analysis workflow to succeed. Several methods for document image binarization have been proposed so far, most of which are based on hand-crafted image processing strategies. Recently, Convolutional Neural Networks have shown impressive performance on many disparate tasks related to computer vision. In this paper we discuss the use of convolutional auto-encoders devoted to learning an end-to-end map from an input image to its selectional output, in which activations indicate the likelihood of each pixel being either foreground or background. Once trained, documents can therefore be binarized by passing them through the model and applying a global threshold. This approach has proven to outperform existing binarization strategies on a number of document types. This work was partially supported by the Social Sciences and Humanities Research Council of Canada, the Spanish Ministerio de Ciencia, Innovación y Universidades through a Juan de la Cierva - Formación grant (Ref. FJCI-2016-27873), and the Universidad de Alicante through grant GRE-16-04.
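Once such a model is trained, the final step described above reduces to a global threshold over the per-pixel activations. A trivial sketch with a hypothetical activation map standing in for the auto-encoder's output:

```python
import numpy as np

def binarize_activations(activations, threshold=0.5):
    """Turn the selectional output of an already-trained auto-encoder
    (here just a plain array of per-pixel foreground likelihoods in
    [0, 1]) into a binary image with one global threshold."""
    act = np.asarray(activations, dtype=float)
    return (act >= threshold).astype(np.uint8)  # 1 = foreground (ink)
```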
CT-Net: Cascade T-shape deep fusion networks for document binarization
Document binarization is a key step in most document analysis tasks. However, historical-document images usually suffer from various degradations, making this a very challenging processing stage. The performance of document image binarization has improved dramatically in recent years through the use of Convolutional Neural Networks (CNNs). In this paper, a dual-task, T-shaped neural network is proposed that has the main task of binarization and an auxiliary task of image enhancement. The enhancement branch learns the degradations in document images, and the specific CNN-kernel features can be adapted towards the binarization task during training. In addition, the enhanced image can be considered an improved version of the input image, which can be fed back into the network for fine-tuning, making it possible to design a chained-cascade network (CT-Net). Experimental results on the Document Image Binarization Competition datasets (DIBCO) and the MCS dataset show that the proposed method outperforms competing state-of-the-art methods in most cases.