1,277 research outputs found

    Automatic Document Image Binarization using Bayesian Optimization

    Document image binarization is often a challenging task due to various forms of degradation. Although several binarization techniques exist in the literature, the binarized image is typically sensitive to the control parameter settings of the employed technique. This paper presents an automatic document image binarization algorithm to segment the text from heavily degraded document images. The proposed technique uses a two band-pass filtering approach for background noise removal, and Bayesian optimization for automatic hyperparameter selection for optimal results. The effectiveness of the proposed binarization technique is empirically demonstrated on the Document Image Binarization Competition (DIBCO) and the Handwritten Document Image Binarization Competition (H-DIBCO) datasets.

    Computationally Efficient Implementation of Convolution-based Locally Adaptive Binarization Techniques

    One of the most important steps of document image processing is binarization. The computational requirements of locally adaptive binarization techniques make them unsuitable for devices with limited computing facilities. In this paper, we present a computationally efficient implementation of convolution-based locally adaptive binarization techniques that keeps the performance comparable to the original implementation. The computational complexity is reduced from $O(W^2 N^2)$ to $O(W N^2)$, where $W \times W$ is the window size and $N \times N$ is the image size. Experiments over benchmark datasets show that the computation time is reduced by 5 to 15 times depending on the window size, while memory consumption remains the same with respect to the state-of-the-art algorithmic implementation.
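
    The abstract does not say how the reduction is achieved. One generic way to make the per-pixel cost linear rather than quadratic in the window width is to split a square box filter into a horizontal and a vertical 1-D pass. The sketch below illustrates that idea for the local mean only; the function names, the replicate-border handling and the odd window width are assumptions made here, not details taken from the paper.

```python
def clamp(i, n):
    """Replicate-border indexing (an assumption of this sketch)."""
    return min(max(i, 0), n - 1)


def local_mean_naive(img, w):
    """Mean over a w x w window around each pixel: about w*w reads per pixel."""
    rows, cols = len(img), len(img[0])
    r = w // 2  # w is assumed to be odd
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    total += img[clamp(y + dy, rows)][clamp(x + dx, cols)]
            out[y][x] = total / (w * w)
    return out


def box_sum_1d(line, w):
    """Sum over a 1-D window of width w around each element: about w reads each."""
    n, r = len(line), w // 2
    return [sum(line[clamp(i + d, n)] for d in range(-r, r + 1)) for i in range(n)]


def local_mean_separable(img, w):
    """Same local mean via a horizontal pass followed by a vertical pass."""
    horizontal = [box_sum_1d(row, w) for row in img]
    # Transpose, filter each former column, then transpose back.
    vertical = [box_sum_1d(list(col), w) for col in zip(*horizontal)]
    return [[v / (w * w) for v in row] for row in zip(*vertical)]
```

    Both functions return the same values, but the separable version performs roughly 2W reads per pixel instead of W squared. A locally adaptive threshold such as a mean-based rule could then be computed from the resulting map.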

    DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning

    This paper presents a novel iterative deep learning framework and applies it to document enhancement and binarization. Unlike traditional methods, which predict the binary label of each pixel in the input image, we train the neural network to learn the degradations in document images and produce uniform versions of the degraded input images, which allows the network to refine the output iteratively. Two different iterative methods are studied in this paper: recurrent refinement (RR), which uses the same trained neural network in each iteration for document enhancement, and stacked refinement (SR), which uses a stack of different neural networks for iterative output refinement. Given the learned uniform and enhanced image, the binarization map is easy to obtain with a global or local threshold. The experimental results on several public benchmark data sets show that our proposed methods provide a new, clean version of the degraded image that is suitable for visualization, and promising binarization results using the global Otsu's threshold on the enhanced images learned iteratively by the neural network. Comment: Accepted by Pattern Recognitio
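
    The final step named in the abstract, a global Otsu threshold applied to the enhanced image, can be sketched independently of the network. The code below is a minimal textbook version of Otsu's method on a flat list of 8-bit grey values; the function name and the input layout are assumptions made here, and the enhancement network itself is not reproduced.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes the between-class variance.

    Pixels with value <= t form one class and pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of grey values in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, t):
    """Map each pixel to 1 if it is brighter than t, else 0."""
    return [1 if p > t else 0 for p in pixels]
```

    On an enhanced image whose background is nearly uniform, the histogram tends to be close to bimodal, which is the setting where a single global threshold of this kind works well.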

    Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues

    Recognizing scene text is a challenging problem, even more so than the recognition of scanned documents. This problem has gained significant attention from the computer vision community in recent years, and several methods based on energy minimization frameworks and deep learning approaches have been proposed. In this work, we focus on the energy minimization framework and propose a model that exploits both bottom-up and top-down cues for recognizing cropped words extracted from street images. The bottom-up cues are derived from individual character detections from an image. We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them. These interactions are top-down cues obtained from a lexicon-based prior, i.e., language statistics. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We evaluate our proposed algorithm extensively on a number of cropped scene text benchmark datasets, namely Street View Text, ICDAR 2003, 2011 and 2013 datasets, and IIIT 5K-word, and show better performance than comparable methods. We perform a rigorous analysis of all the steps in our approach and analyze the results. We also show that state-of-the-art convolutional neural network features can be integrated in our framework to further improve the recognition performance.
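
    The idea of combining detection strength with a language prior in one energy function can be illustrated with a deliberately simplified model. The sketch below assumes a chain of character positions, a unary cost per candidate character (lower means a stronger detection) and a pairwise cost per adjacent character pair (lower means a more likely bigram), and finds the minimum-energy word exactly by dynamic programming. The chain structure, the cost values and all names are assumptions made here; the paper's random field and its inference procedure may differ.

```python
def min_energy_word(unary, pairwise, default_pair=1.0):
    """Minimize sum_i unary[i][c_i] + sum_i pairwise[(c_i, c_{i+1})] over a chain.

    unary:    list of dicts, one per position, mapping character -> cost
    pairwise: dict mapping (left_char, right_char) -> cost
    Pairs missing from `pairwise` receive `default_pair`.
    """
    # best[c] = lowest energy of any prefix that ends in character c
    best = dict(unary[0])
    backpointers = []
    for costs in unary[1:]:
        new_best, pointers = {}, {}
        for label, u in costs.items():
            prev, energy = min(
                ((p, best[p] + pairwise.get((p, label), default_pair)) for p in best),
                key=lambda item: item[1],
            )
            new_best[label] = energy + u
            pointers[label] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace the optimal labelling back from the cheapest final character.
    last = min(best, key=best.get)
    energy = best[last]
    chars = [last]
    for pointers in reversed(backpointers):
        chars.append(pointers[chars[-1]])
    return "".join(reversed(chars)), energy


# Toy costs, invented for illustration only.
toy_unary = [{"c": 0.2, "e": 0.3}, {"a": 0.4, "o": 0.3}, {"t": 0.1, "l": 0.2}]
toy_pairwise = {("c", "a"): 0.1, ("a", "t"): 0.1}
word, energy = min_energy_word(toy_unary, toy_pairwise)
```

    In the toy example the detector slightly prefers "o" at the second position, but the bigram costs make the labelling with "a" cheaper overall, which is the kind of correction a lexicon-based prior is meant to supply.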

    Computationally efficient vessel classification using shallow neural networks on SAR data

    Synthetic aperture radar (SAR) is an active radar mounted on a moving platform, which simulates an antenna longer than the real physical antenna. As with conventional radar, electromagnetic waves are transmitted sequentially and the backscattered echoes are collected by the radar. With proper signal processing, this kind of system can provide high-resolution microwave images of a desired target area by synthesising a larger antenna aperture, in virtually all weather conditions. SAR systems are now used extensively for remote sensing, with applications such as Earth surface monitoring, charting and military applications. Because it is weather independent and can operate by day or night, SAR can be a more reliable source than optical imagery [1]. Ship detection and recognition in SAR images has become an important research topic in recent years. This thesis presents a computationally efficient algorithm for classifying vessels in SAR images using neural networks (NN) with a reduced number of hidden layers, also called shallow neural networks (SNN). The use of SNNs for vessel classification is divided into two main steps: feature extraction and classification. Feature extraction aims to lessen the burden that deep neural networks place on computational resources by extracting key features from the SAR image beforehand. The low computational requirements make this implementation compatible with onboard vessel systems and real-time applications. Classification is implemented using an SNN that takes parameters obtained from the feature extraction algorithms to classify the vessel present in the radar image. In this thesis, feature extraction processes data from the Open SAR Ship dataset [2] to obtain various features of the vessel, such as ship length, width, mean, standard deviation and the number of scatter points present on the vessel.
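
    The abstract lists the extracted features but not how they are computed. As a rough illustration only, the sketch below derives comparable quantities from a small intensity chip given as a list of rows. Everything in it is an assumption made here rather than the thesis method: the ship is taken to be the pixels above a fixed threshold, length and width are the sides of the axis-aligned bounding box in pixels, and a scatter point is counted as a strict local maximum among its four neighbours.

```python
import math


def extract_features(chip, threshold):
    """Return a dict of simple vessel features, or None if nothing exceeds threshold."""
    rows, cols = len(chip), len(chip[0])
    ship = [(y, x, v) for y, row in enumerate(chip)
            for x, v in enumerate(row) if v > threshold]
    if not ship:
        return None

    ys = [y for y, _, _ in ship]
    xs = [x for _, x, _ in ship]
    values = [v for _, _, v in ship]

    # Axis-aligned bounding box of the thresholded pixels (assumed definition).
    height = max(ys) - min(ys) + 1
    breadth = max(xs) - min(xs) + 1

    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

    def is_peak(y, x, v):
        # Strict local maximum with respect to the 4-connected neighbours.
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            yy, xx = y + dy, x + dx
            if 0 <= yy < rows and 0 <= xx < cols and chip[yy][xx] >= v:
                return False
        return True

    return {
        "length": max(height, breadth),
        "width": min(height, breadth),
        "mean": mean,
        "std": std,
        "scatter_points": sum(1 for y, x, v in ship if is_peak(y, x, v)),
    }
```

    A feature vector of this size is small enough to feed into a network with one or two hidden layers, which is the motivation the abstract gives for extracting features before classification.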