Automatic Document Image Binarization using Bayesian Optimization
Document image binarization is often a challenging task due to various forms
of degradation. Although there exist several binarization techniques in the
literature, the binarized image is typically sensitive to control parameter
settings of the employed technique. This paper presents an automatic document
image binarization algorithm to segment the text from heavily degraded document
images. The proposed technique uses a two band-pass filtering approach for
background noise removal, and Bayesian optimization for automatic
hyperparameter selection for optimal results. The effectiveness of the proposed
binarization technique is empirically demonstrated on the Document Image
Binarization Competition (DIBCO) and the Handwritten Document Image
Binarization Competition (H-DIBCO) datasets.
Learning Surrogate Models of Document Image Quality Metrics for Automated Document Image Processing
Computation of document image quality metrics often depends upon the
availability of a ground truth image corresponding to the document. This limits
the applicability of quality metrics in applications such as hyperparameter
optimization of image processing algorithms that operate on-the-fly on unseen
documents. This work proposes the use of surrogate models to learn the behavior
of a given document quality metric on existing datasets where ground truth
images are available. The trained surrogate model can later be used to predict
the metric value on previously unseen document images without requiring access
to ground truth images. The surrogate model is empirically evaluated on the
Document Image Binarization Competition (DIBCO) and the Handwritten Document
Image Binarization Competition (H-DIBCO) datasets.
DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning
This paper presents a novel iterative deep learning framework and applies it
for document enhancement and binarization. Unlike the traditional methods which
predict the binary label of each pixel on the input image, we train the neural
network to learn the degradations in document images and produce the uniform
images of the degraded input images, which allows the network to refine the
output iteratively. Two different iterative methods have been studied in this
paper: recurrent refinement (RR) which uses the same trained neural network in
each iteration for document enhancement and stacked refinement (SR) which uses
a stack of different neural networks for iterative output refinement. Given the
learned uniform and enhanced image, the binarization map can be easily obtained
by a global or local threshold. The experimental results on several public
benchmark data sets show that our proposed methods provide a new clean version
of the degraded image which is suitable for visualization and promising results
of binarization using the global Otsu's threshold based on the enhanced images
learned iteratively by the neural network. Comment: Accepted by Pattern Recognitio
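The final step mentioned in the abstract above, a global Otsu threshold applied to an already enhanced image, can be sketched as follows. This is a minimal pure-Python illustration of the standard Otsu criterion (maximizing between-class variance), not the DeepOtsu authors' code; the function names `otsu_threshold` and `binarize` and the flat list-of-integers input are assumptions made for the example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    `pixels` is a flat list of integers in [0, levels). Pixels <= t form one
    class (dark, typically ink) and pixels > t the other (background).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels, t):
    """Map pixels to 0 (value <= t) or 1 (value > t)."""
    return [1 if p > t else 0 for p in pixels]


if __name__ == "__main__":
    image = [10] * 50 + [200] * 50   # toy bimodal "image"
    t = otsu_threshold(image)
    print(t, sum(binarize(image, t)))
```

On a clean bimodal histogram many thresholds tie for the maximum; this sketch keeps the first one it encounters.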
Computationally Efficient Implementation of Convolution-based Locally Adaptive Binarization Techniques
One of the most important steps of document image processing is binarization.
The computational requirements of locally adaptive binarization techniques make
them unsuitable for devices with limited computing facilities. In this paper,
we present a computationally efficient implementation of convolution-based
locally adaptive binarization techniques, keeping the performance
comparable to the original implementation. The computational complexity has
been reduced from O(W^2 N^2) to O(W N^2), where W x W is the window size and N x N is the
image size. Experiments over benchmark datasets show that the computation time
has been reduced by 5 to 15 times depending on the window size while memory
consumption remains the same with respect to the state-of-the-art algorithmic
implementation.
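One standard way to obtain a reduction of this kind, from O(W^2 N^2) to O(W N^2), is to split a W x W window sum into a horizontal pass followed by a vertical pass. The sketch below illustrates that idea only; the abstract does not say this is the paper's exact method, and the function names, the zero-padded borders, and the odd window size are assumptions of the example.

```python
def box_sum_naive(img, w):
    """Sum over a w x w window centred on each pixel: O(w^2) work per pixel."""
    n_rows, n_cols = len(img), len(img[0])
    r = w // 2  # assumes w is odd
    out = [[0] * n_cols for _ in range(n_rows)]
    for y in range(n_rows):
        for x in range(n_cols):
            s = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    # Pixels outside the image contribute zero.
                    if 0 <= yy < n_rows and 0 <= xx < n_cols:
                        s += img[yy][xx]
            out[y][x] = s
    return out


def box_sum_separable(img, w):
    """Same window sum as two 1-D passes: O(w) work per pixel."""
    n_rows, n_cols = len(img), len(img[0])
    r = w // 2  # assumes w is odd

    # Horizontal pass: sum each row over a 1 x w window.
    horiz = [
        [sum(row[max(0, x - r):x + r + 1]) for x in range(n_cols)]
        for row in img
    ]
    # Vertical pass: sum the row sums over a w x 1 window.
    return [
        [
            sum(horiz[yy][x] for yy in range(max(0, y - r), min(n_rows, y + r + 1)))
            for x in range(n_cols)
        ]
        for y in range(n_rows)
    ]


if __name__ == "__main__":
    img = [[(3 * y + 7 * x) % 11 for x in range(6)] for y in range(5)]
    assert box_sum_naive(img, 3) == box_sum_separable(img, 3)
    print(box_sum_separable(img, 3)[2])
```

Local means and variances, which locally adaptive thresholds are typically built from, follow from window sums of the pixel values and of their squares, so the same two-pass scheme applies to both.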
Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images
In recent years, a large number of binarization methods have been developed,
with varying performance generalization and strength against different
benchmarks. In this work, to leverage these methods, an ensemble of experts
(EoE) framework is introduced, to efficiently combine the outputs of various
methods. The proposed framework offers a new selection process of the
binarization methods, which are actually the experts in the ensemble, by
introducing three concepts: confidentness, endorsement and schools of experts.
The framework, which is highly objective, is built based on two general
principles: (i) consolidation of saturated opinions and (ii) identification of
schools of experts. After building the endorsement graph of the ensemble for an
input document image based on the confidentness of the experts, the saturated
opinions are consolidated, and then the schools of experts are identified by
thresholding the consolidated endorsement graph. A variation of the framework,
in which no selection is made, is also introduced that combines the outputs of
all experts using endorsement-dependent weights. The EoE framework is evaluated
on the set of participating methods in the H-DIBCO'12 contest and also on an
ensemble generated from various instances of grid-based Sauvola method with
promising performance. Comment: 6-page version, Accepted to be presented in ICDAR'1
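The no-selection variant described in the abstract above combines all expert outputs with weights. A generic pixel-wise weighted vote gives a rough picture of what such a combination looks like. This sketch is not the EoE framework: the abstract does not give the formula for the endorsement-dependent weights, so here the weights are supplied by the caller, and the function name `weighted_vote` and the 0/1 mask convention (1 = foreground) are hypothetical.

```python
def weighted_vote(masks, weights):
    """Fuse binary masks with a weighted majority vote per pixel.

    `masks` is a list of equally sized 2-D lists of 0/1 values, one per
    expert; `weights` holds one non-negative weight per expert. A pixel is
    marked foreground (1) when the weight voting for foreground strictly
    exceeds half of the total weight.
    """
    total = sum(weights)
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    fused = []
    for y in range(n_rows):
        row = []
        for x in range(n_cols):
            score = sum(w * m[y][x] for m, w in zip(masks, weights))
            row.append(1 if 2 * score > total else 0)
        fused.append(row)
    return fused


if __name__ == "__main__":
    experts = [
        [[1, 0]],   # expert A
        [[1, 1]],   # expert B
        [[0, 1]],   # expert C
    ]
    print(weighted_vote(experts, [5, 3, 2]))
```

With equal weights this reduces to a plain majority vote; exact ties resolve to background under the strict inequality.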
Persian Heritage Image Binarization Competition (PHIBC 2012)
The first competition on the binarization of historical Persian documents and
manuscripts (PHIBC 2012) has been organized in conjunction with the first
Iranian conference on pattern recognition and image analysis (PRIA 2013). The
main objective of PHIBC 2012 is to evaluate the performance of binarization
methodologies when applied to Persian heritage images. This paper provides
a report on the methodology and performance of the three submitted algorithms
based on the evaluation measures used. Comment: 4 pages, 2 figures, conferenc
CT-Net: Cascade T-shape deep fusion networks for document binarization
Document binarization is a key step in most document analysis tasks. However, historical-document images usually suffer from various degradations, making this a very challenging processing stage. The performance of document image binarization has improved dramatically in recent years through the use of Convolutional Neural Networks (CNNs). In this paper, a dual-task, T-shaped neural network is proposed that has the main task of binarization and an auxiliary task of image enhancement. The neural network for enhancement learns the degradations in document images, and the specific CNN-kernel features can be adapted towards the binarization task in the training process. In addition, the enhanced image can be considered an improved version of the input image, which can be fed into the network for fine-tuning, making it possible to design a chained-cascade network (CT-Net). Experimental results on document binarization competition datasets (DIBCO datasets) and the MCS dataset show that our proposed method outperforms competing state-of-the-art methods in most cases.