DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning
This paper presents a novel iterative deep learning framework and applies it
for document enhancement and binarization. Unlike the traditional methods which
predict the binary label of each pixel on the input image, we train the neural
network to learn the degradations in document images and produce the uniform
images of the degraded input images, which allows the network to refine the
output iteratively. Two different iterative methods have been studied in this
paper: recurrent refinement (RR) which uses the same trained neural network in
each iteration for document enhancement and stacked refinement (SR) which uses
a stack of different neural networks for iterative output refinement. Given the
learned uniform and enhanced image, the binarization map can be easily obtained
by a global or local threshold. The experimental results on several public
benchmark data sets show that our proposed methods provide a new clean version
of the degraded image which is suitable for visualization, and achieve promising
binarization results using the global Otsu's threshold on the enhanced images
learned iteratively by the neural network.
Comment: Accepted by Pattern Recognition
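The two refinement schemes and the final global Otsu step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `enhance` callables are hypothetical stand-ins for trained networks, images are plain nested lists of 8-bit gray values, and the function names are chosen here for illustration only.

```python
from typing import Callable, List, Sequence

Image = List[List[int]]              # grayscale image, values in 0..255
Enhancer = Callable[[Image], Image]  # hypothetical stand-in for a trained network


def otsu_threshold(image: Image) -> int:
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image: Image, t: int) -> Image:
    """Map pixels above t to 255 (background) and the rest to 0 (ink)."""
    return [[255 if v > t else 0 for v in row] for row in image]


def recurrent_refinement(image: Image, enhance: Enhancer, iterations: int) -> Image:
    """RR: apply the same enhancer repeatedly."""
    for _ in range(iterations):
        image = enhance(image)
    return image


def stacked_refinement(image: Image, enhancers: Sequence[Enhancer]) -> Image:
    """SR: apply a sequence of different enhancers, one per stage."""
    for enhance in enhancers:
        image = enhance(image)
    return image
```

In this sketch the enhanced image returned by either refinement function would be passed to `otsu_threshold` and then `binarize`; a local threshold could be substituted at that point.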
CCDWT-GAN: Generative Adversarial Networks Based on Color Channel Using Discrete Wavelet Transform for Document Image Binarization
Efficiently extracting textual information from degraded color document
images is an important research topic. Long-term imperfect preservation of
ancient documents has led to various types of degradation such as page
staining, paper yellowing, and ink bleeding; these degradations badly impact
the image processing for information extraction. In this paper, we present
CCDWT-GAN, a generative adversarial network (GAN) that utilizes the discrete
wavelet transform (DWT) on RGB (red, green, blue) channel-split images. The
proposed method comprises three stages: image preprocessing, image enhancement,
and image binarization. This work conducts comparative experiments in the image
preprocessing stage to determine the optimal selection of DWT with
normalization. Additionally, we perform an ablation study on the results of the
image enhancement stage and the image binarization stage to validate their
positive effect on the model performance. This work compares the performance of
the proposed method with other state-of-the-art (SOTA) methods on DIBCO and
H-DIBCO ((Handwritten) Document Image Binarization Competition) datasets. The
experimental results demonstrate that CCDWT-GAN achieves top-two performance
on multiple benchmark datasets, and outperforms other SOTA methods.
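The preprocessing idea in the abstract above (a DWT applied to each RGB channel, followed by normalization) can be illustrated with a small sketch. The abstract does not say which wavelet or which normalization is used, so the one-level Haar transform, the choice to keep only the LL subband, and the min-max scaling to [0, 255] are all assumptions made for this example; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Channel = List[List[float]]
RGBImage = Sequence[Sequence[Tuple[int, int, int]]]


def split_rgb(image: RGBImage) -> Tuple[Channel, Channel, Channel]:
    """Split rows of (r, g, b) pixels into three single-channel images."""
    r, g, b = ([[px[c] for px in row] for row in image] for c in range(3))
    return r, g, b


def haar_ll(channel: Channel) -> Channel:
    """One-level 2D Haar DWT, returning only the LL (approximation) subband.

    With the orthonormal Haar filter, each LL coefficient is the sum of a
    2x2 block divided by 2. Height and width must be even.
    """
    h, w = len(channel), len(channel[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return [
        [
            (channel[y][x] + channel[y][x + 1]
             + channel[y + 1][x] + channel[y + 1][x + 1]) / 2.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def min_max_normalize(channel: Channel, top: float = 255.0) -> Channel:
    """Linearly rescale values to [0, top]; a constant channel maps to 0."""
    flat = [v for row in channel for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in channel]
    scale = top / (hi - lo)
    return [[(v - lo) * scale for v in row] for row in channel]


def preprocess(image: RGBImage) -> Tuple[Channel, ...]:
    """Assumed pipeline: split channels, take the Haar LL subband, normalize."""
    return tuple(min_max_normalize(haar_ll(ch)) for ch in split_rgb(image))
```

Each returned channel has half the height and width of the input, since one DWT level downsamples by two in both directions.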
CT-Net: Cascade T-shape deep fusion networks for document binarization
Document binarization is a key step in most document analysis tasks. However, historical-document images usually suffer from various degradations, making this a very challenging processing stage. The performance of document image binarization has improved dramatically in recent years through the use of Convolutional Neural Networks (CNNs). In this paper, a dual-task, T-shaped neural network is proposed that has the main task of binarization and an auxiliary task of image enhancement. The neural network for enhancement learns the degradations in document images, and the specific CNN-kernel features can be adapted towards the binarization task in the training process. In addition, the enhanced image can be considered as an improved version of the input image, which can be fed into the network for fine-tuning, making it possible to design a chained-cascade network (CT-Net). Experimental results on document binarization competition datasets (DIBCO datasets) and the MCS dataset show that our proposed method outperforms competing state-of-the-art methods in most cases.
Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks
The efficient segmentation of foreground text information from the background
in degraded color document images is an active research topic. Due to the imperfect
preservation of ancient documents over a long period of time, various types of
degradation, including staining, yellowing, and ink seepage, have seriously
affected the results of image binarization. In this paper, a three-stage method
is proposed for image enhancement and binarization of degraded color document
images by using discrete wavelet transform (DWT) and generative adversarial
network (GAN). In Stage-1, we use DWT and retain the LL subband images to
achieve image enhancement. In Stage-2, the original input image is split
into four (Red, Green, Blue and Gray) single-channel images, each of which
is used to train an independent adversarial network. The trained adversarial
network models are used to extract the color foreground information from the
images. In Stage-3, in order to combine global and local features, the output
image from Stage-2 and the original input image are used to train independent
adversarial networks for document binarization. The experimental results
demonstrate that our proposed method outperforms many classical and
state-of-the-art (SOTA) methods on the Document Image Binarization Contest
(DIBCO) dataset. We release our implementation code at
https://github.com/abcpp12383/ThreeStageBinarization
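The Stage-2 input split described in the abstract above (four single-channel images: Red, Green, Blue and Gray) can be sketched as below. The abstract does not state how the gray channel is derived, so the ITU-R BT.601 luma weights used here are an assumption, and `split_four_channels` is a hypothetical name that is not taken from the linked repository.

```python
from typing import Dict, List, Sequence, Tuple

Channel = List[List[int]]
RGBImage = Sequence[Sequence[Tuple[int, int, int]]]


def split_four_channels(image: RGBImage) -> Dict[str, Channel]:
    """Split an RGB image into red, green, blue and gray single-channel images.

    Assumption: gray is computed with BT.601 luma weights
    (0.299 R + 0.587 G + 0.114 B), rounded to the nearest integer.
    """
    red = [[px[0] for px in row] for row in image]
    green = [[px[1] for px in row] for row in image]
    blue = [[px[2] for px in row] for row in image]
    gray = [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]
    return {"red": red, "green": green, "blue": blue, "gray": gray}
```

In the method as described, each of the four resulting images would feed its own adversarial network; that training step is outside the scope of this sketch.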