Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images
In recent years, a large number of binarization methods have been developed,
which differ in how well they generalize and in how they perform on different
benchmarks. In this work, to leverage these methods, an ensemble of experts
(EoE) framework is introduced to efficiently combine their outputs.
The proposed framework offers a new selection process of the
binarization methods, which are actually the experts in the ensemble, by
introducing three concepts: confidentness, endorsement and schools of experts.
The framework, which is highly objective, is built based on two general
principles: (i) consolidation of saturated opinions and (ii) identification of
schools of experts. After building the endorsement graph of the ensemble for an
input document image based on the confidentness of the experts, the saturated
opinions are consolidated, and then the schools of experts are identified by
thresholding the consolidated endorsement graph. A variation of the framework,
in which no selection is made, is also introduced that combines the outputs of
all experts using endorsement-dependent weights. The EoE framework is evaluated
on the set of participating methods in the H-DIBCO'12 contest and also on an
ensemble generated from various instances of the grid-based Sauvola method, with
promising performance.
Comment: 6-page version, accepted to be presented in ICDAR'1
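
The abstract does not define confidentness or endorsement, so the following is only a minimal sketch under stated assumptions: endorsement between two experts is taken to be their pixel-wise agreement rate, a "school" is a connected component of the endorsement graph after thresholding, and each expert's weight in the no-selection variant is its mean endorsement by the other experts. All function names, the threshold value, and the toy masks are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def agreement(a, b):
    """Fraction of pixels on which two flat binary masks agree (assumed endorsement)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def endorsement_matrix(masks):
    """Symmetric matrix of pairwise agreement rates; the diagonal is left at 0."""
    n = len(masks)
    E = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        E[i][j] = E[j][i] = agreement(masks[i], masks[j])
    return E

def schools(E, threshold):
    """Connected components of the graph that keeps edges with E >= threshold."""
    n = len(E)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            group.append(u)
            for v in range(n):
                if v != u and v not in seen and E[u][v] >= threshold:
                    seen.add(v)
                    stack.append(v)
        groups.append(sorted(group))
    return groups

def weighted_vote(masks, E):
    """Combine all experts; each weight is the expert's mean endorsement by the others."""
    n = len(masks)
    weights = [sum(E[i]) / (n - 1) for i in range(n)]
    total = sum(weights)
    combined = []
    for p in range(len(masks[0])):
        score = sum(w * m[p] for w, m in zip(weights, masks))
        # A pixel is text (1) when at least half of the total weight votes for it.
        combined.append(1 if 2 * score >= total else 0)
    return combined

# Toy ensemble: three experts that mostly agree and one outlier (hypothetical data).
masks = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
]
E = endorsement_matrix(masks)
groups = schools(E, threshold=0.8)
combined = weighted_vote(masks, E)
```

On this toy input the outlier ends up in a school of its own and receives a small weight, so it has little influence on the combined mask.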
Automatic Document Image Binarization using Bayesian Optimization
Document image binarization is often a challenging task due to various forms
of degradation. Although several binarization techniques exist in the
literature, the binarized image is typically sensitive to the control parameter
settings of the employed technique. This paper presents an automatic document
image binarization algorithm to segment the text from heavily degraded document
images. The proposed technique uses a two band-pass filtering approach for
background noise removal, and Bayesian optimization for automatic
hyperparameter selection for optimal results. The effectiveness of the proposed
binarization technique is empirically demonstrated on the Document Image
Binarization Competition (DIBCO) and the Handwritten Document Image
Binarization Competition (H-DIBCO) datasets.
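
To illustrate why parameter selection matters, the sketch below scores a binarizer against a ground-truth mask and picks the best control parameter. It rests on several assumptions that are not from the paper: the binarizer is a plain Sauvola-style local threshold (not the band-pass filtering approach described above), the objective is the F-measure on a toy image, and the search is an exhaustive grid search standing in for Bayesian optimization, which would instead choose the next candidate from a surrogate model of the same objective. All names and values are hypothetical.

```python
from statistics import mean, pstdev

def sauvola_threshold(m, s, k, R=128.0):
    """Sauvola local threshold from the window mean m and standard deviation s."""
    return m * (1.0 + k * (s / R - 1.0))

def binarize(image, k, radius=1):
    """Mark a pixel as text (1) when it is darker than its local threshold."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            t = sauvola_threshold(mean(window), pstdev(window), k)
            row.append(1 if image[y][x] < t else 0)
        out.append(row)
    return out

def f_measure(pred, truth):
    """Harmonic mean of precision and recall over text pixels."""
    tp = fp = fn = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def grid_search(image, truth, candidates):
    """Return (best score, best k); ties are broken in favour of the larger k."""
    return max((f_measure(binarize(image, k), truth), k) for k in candidates)

# Toy grayscale image: bright background with a short dark stroke (hypothetical data).
image = [
    [200, 200, 200, 200],
    [200, 40, 40, 200],
    [200, 200, 200, 200],
]
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
candidates = [0.05, 0.2, 0.35, 0.5]
best_score, best_k = grid_search(image, truth, candidates)
```

A grid search evaluates every candidate, which is affordable only for a cheap objective and a small grid; the appeal of Bayesian optimization is that it needs far fewer evaluations of the objective.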