1,277 research outputs found
Automatic Document Image Binarization using Bayesian Optimization
Document image binarization is often a challenging task due to various forms
of degradation. Although there exist several binarization techniques in
literature, the binarized image is typically sensitive to control parameter
settings of the employed technique. This paper presents an automatic document
image binarization algorithm to segment the text from heavily degraded document
images. The proposed technique uses a two band-pass filtering approach for
background noise removal, and Bayesian optimization for automatic
hyperparameter selection for optimal results. The effectiveness of the proposed
binarization technique is empirically demonstrated on the Document Image
Binarization Competition (DIBCO) and the Handwritten Document Image
Binarization Competition (H-DIBCO) datasets.
Computationally Efficient Implementation of Convolution-based Locally Adaptive Binarization Techniques
One of the most important steps of document image processing is binarization.
The computational requirements of locally adaptive binarization techniques make
them unsuitable for devices with limited computing facilities. In this paper,
we have presented a computationally efficient implementation of convolution
based locally adaptive binarization techniques keeping the performance
comparable to the original implementation. The computational complexity has
been reduced from O(W^2 N^2) to O(W N^2), where WxW is the window size and NxN is the
image size. Experiments over benchmark datasets show that the computation time
has been reduced by 5 to 15 times depending on the window size while memory
consumption remains the same with respect to the state-of-the-art algorithmic
implementation.
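The abstract does not say how the reduction from O(W^2 N^2) to O(W N^2) is achieved. One standard route to that bound, shown here only as an illustrative sketch and not as the authors' method, is to split the WxW averaging window into a horizontal and a vertical 1-D pass. The function names, the edge-clamping rule, and the simple "mean minus offset" threshold are all assumptions made for this example; W is assumed to be odd.

```python
def box_mean_naive(img, w):
    """Local mean over a w x w window (edges clamped): O(w^2) work per pixel."""
    n_rows, n_cols = len(img), len(img[0])
    r = w // 2
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for y in range(n_rows):
        for x in range(n_cols):
            total = 0
            for dy in range(-r, r + 1):
                yy = min(max(y + dy, 0), n_rows - 1)
                for dx in range(-r, r + 1):
                    xx = min(max(x + dx, 0), n_cols - 1)
                    total += img[yy][xx]
            out[y][x] = total / (w * w)
    return out


def box_mean_separable(img, w):
    """Same local mean computed as two 1-D passes: O(w) work per pixel."""
    n_rows, n_cols = len(img), len(img[0])
    r = w // 2
    # Horizontal pass: sum each row over a 1 x w window.
    row_sums = [
        [sum(row[min(max(x + dx, 0), n_cols - 1)] for dx in range(-r, r + 1))
         for x in range(n_cols)]
        for row in img
    ]
    # Vertical pass: sum the row sums over a w x 1 window, then normalise.
    return [
        [sum(row_sums[min(max(y + dy, 0), n_rows - 1)][x]
             for dy in range(-r, r + 1)) / (w * w)
         for x in range(n_cols)]
        for y in range(n_rows)
    ]


def binarize_local_mean(img, w=3, offset=0.0):
    """Hypothetical local rule: a pixel is background (1) if it is at least
    as bright as its local mean minus an offset, otherwise text (0)."""
    mean = box_mean_separable(img, w)
    return [
        [1 if img[y][x] >= mean[y][x] - offset else 0
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]
```

Both mean functions return the same values; only the amount of work per pixel differs, which is where the factor of W is saved.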
DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning
This paper presents a novel iterative deep learning framework and applies it
for document enhancement and binarization. Unlike the traditional methods which
predict the binary label of each pixel on the input image, we train the neural
network to learn the degradations in document images and produce the uniform
images of the degraded input images, which allows the network to refine the
output iteratively. Two different iterative methods have been studied in this
paper: recurrent refinement (RR) which uses the same trained neural network in
each iteration for document enhancement and stacked refinement (SR) which uses
a stack of different neural networks for iterative output refinement. Given the
learned uniform and enhanced image, the binarization map can be easily obtained
by a global or local threshold. The experimental results on several public
benchmark data sets show that our proposed methods provide a new clean version
of the degraded image which is suitable for visualization and promising results
of binarization using the global Otsu's threshold based on the enhanced images
learned iteratively by the neural network.
Comment: Accepted by Pattern Recognition
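The final step mentioned in the abstract, a global Otsu threshold applied to the enhanced image, can be sketched as follows. This is a minimal plain-Python version of the textbook Otsu criterion (maximising between-class variance over an 8-bit histogram); the neural enhancement stage is not reproduced, and the function names are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.
    Pixels <= t form one class, pixels > t the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of grey levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize_global(pixels, threshold):
    """Map each pixel to 1 (above threshold) or 0 (at or below it)."""
    return [1 if p > threshold else 0 for p in pixels]
```

On a cleanly enhanced image with dark text on a uniform bright background, the histogram is close to bimodal, which is the case where a single global threshold of this kind works well.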
Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
Recognizing scene text is a challenging problem, even more so than the
recognition of scanned documents. This problem has gained significant attention
from the computer vision community in recent years, and several methods based
on energy minimization frameworks and deep learning approaches have been
proposed. In this work, we focus on the energy minimization framework and
propose a model that exploits both bottom-up and top-down cues for recognizing
cropped words extracted from street images. The bottom-up cues are derived from
individual character detections from an image. We build a conditional random
field model on these detections to jointly model the strength of the detections
and the interactions between them. These interactions are top-down cues
obtained from a lexicon-based prior, i.e., language statistics. The optimal
word represented by the text image is obtained by minimizing the energy
function corresponding to the random field model. We evaluate our proposed
algorithm extensively on a number of cropped scene text benchmark datasets,
namely Street View Text, ICDAR 2003, 2011 and 2013 datasets, and IIIT 5K-word,
and show better performance than comparable methods. We perform a rigorous
analysis of all the steps in our approach and analyze the results. We also show
that state-of-the-art convolutional neural network features can be integrated
in our framework to further improve the recognition performance.
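To illustrate how bottom-up detection scores and a top-down language prior can be combined in a single energy function, here is a deliberately simplified sketch. It assumes a chain of character positions, unary costs from hypothetical character detections, and pairwise costs from a hypothetical bigram prior, and it minimises the energy exactly by dynamic programming. The paper's actual random field, cost definitions, and inference method are not given in the abstract and may differ.

```python
def min_energy_word(unary, pairwise, default_pair=5.0):
    """Minimise sum of unary and pairwise costs over a chain of characters.

    unary:    list of dicts, one per position, mapping char -> detection cost
    pairwise: dict mapping (prev_char, char) -> language-prior cost
    default_pair: assumed cost for character pairs missing from the prior
    """
    # best[c] = (lowest energy of any labelling ending in c, that labelling)
    best = {c: (cost, c) for c, cost in unary[0].items()}
    for scores in unary[1:]:
        new_best = {}
        for c, u in scores.items():
            energy, prefix = min(
                (e + pairwise.get((p[-1], c), default_pair), p)
                for e, p in best.values()
            )
            new_best[c] = (energy + u, prefix + c)
        best = new_best
    energy, word = min(best.values())
    return word, energy


# Hypothetical costs: the detector slightly prefers 'o' in the first
# position, but the bigram prior makes "cat" the lowest-energy word.
unary = [
    {"c": 1.0, "o": 0.8},
    {"a": 0.5, "o": 0.6},
    {"t": 0.3, "l": 0.4},
]
pairwise = {("c", "a"): 0.1, ("a", "t"): 0.1, ("o", "a"): 2.0}
word, energy = min_energy_word(unary, pairwise)
```

In this toy case the unary costs alone would choose 'o' for the first character; the pairwise term is what overrides that choice.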
Computationally efficient vessel classification using shallow neural networks on SAR data
Synthetic aperture radar (SAR) is an active radar mounted on a moving platform that simulates an antenna longer than the real physical antenna. As with conventional radar, electromagnetic waves are transmitted sequentially and the backscattered echoes are collected by the radar. With proper signal processing, this kind of system can provide high-resolution microwave images of a desired target area by synthesising a larger antenna aperture, in virtually all weather conditions. SAR systems are now used extensively for remote sensing, with applications such as Earth surface monitoring, charting, and military applications. Since it is weather independent and can operate by day or by night, SAR can be a more reliable source than optical imagery [1]. Ship detection and recognition in SAR images has become an important research topic in recent years. This thesis presents a computationally efficient algorithm for the classification of vessels in SAR images using Neural Networks (NN) with a reduced number of hidden layers, also called Shallow Neural Networks (SNN). The use of SNNs for vessel classification is divided into two main steps: feature extraction and classification. Feature extraction aims to lessen the burden that deep neural networks place on computational resources by extracting key features from the SAR image beforehand. The low computational requirements make this implementation compatible with onboard vessel systems and real-time applications. The classification is implemented using an SNN that uses parameters obtained from feature extraction algorithms to classify the vessel present in the radar image.
In this thesis, feature extraction processes data from the Open SAR Ship dataset [2] to obtain various features of the vessel, such as ship length, width, mean, standard deviation, and the number of scatter points present on the vessel.
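As a rough illustration of the feature-extraction step, the sketch below computes the listed features from a small intensity chip. The definitions used here are assumptions made for the example, not the thesis's own: "scatter points" are taken to be pixels above a user-chosen threshold, length and width are the sides of their axis-aligned bounding box in pixels, and mean and standard deviation are taken over the whole chip.

```python
import math


def ship_features(chip, threshold):
    """Compute simple per-chip features (all definitions are illustrative)."""
    values = [v for row in chip for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

    # Pixels brighter than the threshold stand in for scatter points.
    bright = [(y, x) for y, row in enumerate(chip)
              for x, v in enumerate(row) if v > threshold]
    if bright:
        ys = [y for y, _ in bright]
        xs = [x for _, x in bright]
        height = max(ys) - min(ys) + 1
        span = max(xs) - min(xs) + 1
        length, width = max(height, span), min(height, span)
    else:
        length = width = 0

    return {
        "length": length,
        "width": width,
        "mean": mean,
        "std": std,
        "scatter_points": len(bright),
    }
```

A feature vector of this kind is small enough to feed a network with very few layers, which is the motivation the abstract gives for extracting features before classification.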