
    A Review on Improve Handwritten character recognition by using Convolutional Neural Network

    Get PDF
    CNN is the most popular learning model for image recognition. Features such as its weight-sharing strategy and its exploitation of the strong correlations among neighbouring image pixels make CNN the best choice for image recognition. In deep learning models, feature extraction and classification can be performed simultaneously, which has proved very valuable compared to traditional methods. Promising recognition results can be obtained with CNN provided certain issues are addressed. This review therefore examines a CNN-based framework for handwritten character recognition that gives better performance than other CNN-based recognition methods.
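The weight-sharing property highlighted above can be illustrated with a minimal NumPy sketch: one small kernel (the shared weights) is slid over every position of the image, as in a CNN's convolutional layer. The toy image and kernel are illustrative assumptions, not taken from the review.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation form, as used in CNNs):
    the SAME kernel is applied at every spatial position, which is the
    weight-sharing property that suits character images."""
    h, w = kernel.shape
    H, W = image.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

# Toy 5x5 "character" with a vertical stroke, probed by an edge kernel.
image = np.zeros((5, 5))
image[:, 2] = 1.0                          # the vertical stroke
kernel = np.array([[1.0, 0.0, -1.0]] * 3)  # 3x3 vertical-edge detector
response = conv2d(image, kernel)
print(response.shape)  # (3, 3)
```

The single 3x3 kernel produces a response at every location; a CNN would learn many such kernels rather than hand-crafting them.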

    CharFormer: A Glyph Fusion based Attentive Framework for High-precision Character Image Denoising

    Full text link
    Degraded images commonly exist in the general sources of character images, leading to unsatisfactory character recognition results. Existing methods have dedicated efforts to restoring degraded character images. However, the denoising results obtained by these methods do not appear to improve character recognition performance. This is mainly because current methods only focus on pixel-level information and ignore critical features of a character, such as its glyph, resulting in character-glyph damage during the denoising process. In this paper, we introduce a novel generic framework based on glyph fusion and attention mechanisms, i.e., CharFormer, for precisely recovering character images without changing their inherent glyphs. Unlike existing frameworks, CharFormer introduces a parallel target task for capturing additional information and injecting it into the image denoising backbone, which will maintain the consistency of character glyphs during character image denoising. Moreover, we utilize attention-based networks for global-local feature interaction, which will help to deal with blind denoising and enhance denoising performance. We compare CharFormer with state-of-the-art methods on multiple datasets. The experimental results show the superiority of CharFormer quantitatively and qualitatively. Comment: Accepted by ACM MM 202
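The attention-based global-local feature interaction the abstract mentions builds on scaled dot-product attention, sketched below in plain NumPy. The toy token counts and dimensions are assumptions; this is the generic mechanism, not CharFormer's actual backbone.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: each query attends to ALL keys,
    giving the global feature interaction that attention-based
    denoising backbones rely on."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (n_q, n_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query tokens, dim 8 (toy sizes)
K = rng.normal(size=(6, 8))   # 6 key tokens
V = rng.normal(size=(6, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Each output row is a convex combination of all value rows, so information from every image region can influence every other region.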

    Container Number Recognition Method Based on SSD_MobileNet and SVM

    Get PDF
    To recognize container numbers on container surfaces at port entrances and exits, a method based on image affine transformation and an SVM classifier is proposed. The main process comprises truck target detection, box number area detection, text correction, image preprocessing, and segmentation, detection, and recognition stages. Firstly, a container truck detection procedure based on the frame difference method and a decreasing sequence of connected domains is proposed. Secondly, a container number area detection method based on SSD_MobileNet is proposed. In the box number recognition stage, a text correction method based on image affine transformation is proposed, and different processing methods are proposed for vertically and horizontally arranged box numbers in the image preprocessing stage. In the segmentation, detection, and recognition stage, a character segmentation algorithm based on connected-domain segmentation and a detection and recognition algorithm based on an SVM classifier are proposed. In tests on container images from field monitoring video, the accuracy of region detection reaches 97%, the accuracy of character recognition reaches 95%, and the method achieves good real-time performance.
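The frame-difference step of the truck detection stage can be sketched as follows: pixels whose intensity changes by more than a threshold between consecutive frames are flagged as moving. The threshold and toy frames are illustrative assumptions, not values from the paper.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Frame-difference motion detection: flag pixels whose absolute
    intensity change exceeds a threshold, a simple way to localise a
    passing truck before box-number area detection."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Toy 4x4 grayscale frames: a bright "truck" enters the centre region.
prev_frame = np.full((4, 4), 100, dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[1:3, 1:3] = 200
mask = frame_difference_mask(prev_frame, curr_frame)
print(int(mask.sum()))  # 4 moving pixels
```

In a real pipeline the mask would then be cleaned up and grouped into connected domains before locating the box-number region.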

    A study of holistic strategies for the recognition of characters in natural scene images

    Get PDF
    Recognition and understanding of text in scene images is an important and challenging task. The importance can be seen in the context of tasks such as assisted navigation for the blind and providing directions to driverless cars, e.g. the Google car. Other applications include automated document archival services, mining text from images, and so on. The challenge comes from a variety of factors, like variable typefaces, uncontrolled imaging conditions, and various sources of noise corrupting the captured images. In this work, we study and address the fundamental problem of recognition of characters extracted from natural scene images, and contribute three holistic strategies to deal with this challenging task. Scene text recognition (STR) has been a known problem in the computer vision and pattern recognition community for over two decades, and is still an active area of research owing to the fact that recognition performance still has considerable room for improvement. Recognition of characters lies at the heart of STR and is a crucial component for a reliable STR system. Most of the current methods heavily rely on the discriminative power of local features, such as histograms of oriented gradients (HOG), the scale-invariant feature transform (SIFT), shape contexts (SC), geometric blur (GB), etc. One of the problems with such methods is that the local features are rasterized in an ad hoc manner to get a single vector for subsequent use in recognition. This rearrangement of features clearly perturbs the spatial correlations that may carry crucial information vis-à-vis recognition. Moreover, such approaches, in general, do not take into account the rotational invariance property, which often leads to failed recognition in cases where characters in scene images do not occur in an upright position.
To eliminate this local feature dependency and the associated problems, we propose the following three holistic solutions: The first one is based on modelling character images of a class as a 3-mode tensor and then factoring it into a set of rank-1 matrices and the associated mixing coefficients. Each set of rank-1 matrices spans the solution subspace of a specific image class and enables us to capture the required holistic signature for each character class along with the mixing coefficients associated with each character image. During recognition, we project each test image onto the candidate subspaces to derive its mixing coefficients, which are eventually used for final classification. The second approach we study in this work lets us form a novel holistic feature for character recognition based on active contour model, also known as snakes. Our feature vector is based on two variables, direction and distance, cumulatively traversed by each point as the initial circular contour evolves under the force field induced by the character image. The initial contour design in conjunction with cross-correlation based similarity metric enables us to account for rotational variance in the character image. Our third approach is based on modelling a 3-mode tensor via rotation of a single image. This is different from our tensor based approach described above in that we form the tensor using a single image instead of collecting a specific number of samples of a particular class. In this case, to generate a 3D image cube, we rotate an image through a predefined range of angles. This enables us to explicitly capture rotational variance and leads to better performance than various local approaches. Finally, as an application, we use our holistic model to recognize word images extracted from natural scenes. Here we first use our novel word segmentation method based on image seam analysis to split a scene word into individual character images. 
We then apply our holistic model to recognize individual letters and use a spell-checker module to obtain the final word prediction. Throughout this work, we employ popular scene text datasets, such as Chars74K-Font, Chars74K-Image, SVT, and ICDAR03, which include synthetic and natural image sets, to test the performance of our strategies. We compare the results of our recognition models with several baseline methods and show comparable or better performance than several local feature-based methods, thus justifying the importance of holistic strategies.
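The subspace-projection idea behind the first strategy can be sketched with an SVD-based stand-in: the leading left singular vectors of each class's vectorised samples span an approximate class subspace, and a test image is assigned to the class whose subspace it projects onto with the smallest residual. The toy stroke images and rank are assumptions, not the authors' exact rank-1 tensor factorization.

```python
import numpy as np

def class_subspace(samples, rank=2):
    """Leading left singular vectors of the vectorised class samples;
    a simple stand-in for the paper's sets of rank-1 factors."""
    X = samples.reshape(samples.shape[0], -1).T   # (pixels, n_samples)
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]

def residual(x, U):
    """Distance from x to the subspace spanned by U's columns."""
    x = x.ravel()
    return float(np.linalg.norm(x - U @ (U.T @ x)))

rng = np.random.default_rng(1)
vert = np.zeros((8, 8)); vert[:, 3] = 1.0    # class A: vertical stroke
horiz = np.zeros((8, 8)); horiz[3, :] = 1.0  # class B: horizontal stroke
cls_a = np.stack([vert + 0.05 * rng.normal(size=(8, 8)) for _ in range(5)])
cls_b = np.stack([horiz + 0.05 * rng.normal(size=(8, 8)) for _ in range(5)])
U_a, U_b = class_subspace(cls_a), class_subspace(cls_b)

test_img = vert + 0.05 * rng.normal(size=(8, 8))
pred = 'A' if residual(test_img, U_a) < residual(test_img, U_b) else 'B'
print(pred)  # 'A'
```

The projection step mirrors the recognition phase described above: deriving mixing coefficients by projecting the test image onto each candidate subspace and classifying by fit.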

    Template Neural Particle Optimization For Vehicle License Plate Recognition

    Get PDF
    The need for vehicle recognition has emerged from applications such as security, smart toll collection, and traffic monitoring systems. These applications place high demands, especially on the accuracy of license plate recognition (LPR). The challenge of LPR is to select the best method for recognizing characters. As the importance of LPR grows over time, there is a need to find the best alternative to overcome the problem. The detection and extraction of license plates is conventionally based on image processing methods. The image processing approach to license plate recognition generally comprises five stages: pre-processing, morphological operations, feature extraction, segmentation, and character recognition. Pre-processing is an initial step that improves image quality for visual perception or computational processing, while filtering addresses contrast enhancement, noise suppression, blur, and data reduction. Feature extraction is applied to locate the license plate position accurately, and segmentation is used to find and segment the isolated characters on the plate without losing character features. Finally, character recognition identifies each character and displays it in machine-readable form. This study examines five character recognition methods, namely template matching (TM), back-propagation neural network (BPNN), particle swarm optimization neural network (PSONN), a hybrid of TM with BPNN (TM-BPNN), and a hybrid of TM with PSONN (TM-PSONN). PSONN is proposed as an alternative way to train a feed-forward neural network, while TM-BPNN and TM-PSONN are proposed to produce better recognition results. The performance evaluation is based on mean squared error, processing time, number of training iterations, correlation value, and percentage accuracy. The performance of the selected methods was analyzed using real images of 300 vehicles.
The TM-BPNN hybrid gives the highest recognition accuracy at 94%, followed by TM-PSONN with 91.3%, TM with 77.3%, BPNN with 61.7%, and lastly PSONN with 37.7%.
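Template matching, as used by TM and its hybrids, reduces to picking the stored template with the highest normalized cross-correlation against the segmented character. A minimal sketch (the 5x5 toy templates are illustrative, not from the study):

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation between two equal-size patches;
    the score a template-matching recogniser maximises."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def tm_recognise(char_img, templates):
    """Return the label of the best-matching template."""
    return max(templates, key=lambda lbl: ncc(char_img, templates[lbl]))

# Toy 5x5 templates for '1' and '7' (illustrative, not a real font).
t1 = np.zeros((5, 5)); t1[:, 2] = 1.0
t7 = np.zeros((5, 5)); t7[0, :] = 1.0; t7[:, 4] = 1.0
templates = {'1': t1, '7': t7}

noisy_one = t1 + 0.1 * np.random.default_rng(2).normal(size=(5, 5))
print(tm_recognise(noisy_one, templates))  # '1'
```

In the hybrids, a score like this can pre-select candidate classes before the neural network (BPNN or PSONN) makes the final decision, which plausibly explains their higher accuracy.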

    CG-DIQA: No-reference Document Image Quality Assessment Based on Character Gradient

    Full text link
    Document image quality assessment (DIQA) is an important and challenging problem in real applications. In order to predict the quality scores of document images, this paper proposes a novel no-reference DIQA method based on character gradient, where OCR accuracy is used as a ground-truth quality metric. Character gradient is computed on character patches detected with a maximally stable extremal regions (MSER) based method. Character patches are essential to character recognition and therefore suitable for use in estimating document image quality. Experiments on a benchmark dataset show that the proposed method outperforms the state-of-the-art methods in estimating the quality score of document images. Comment: To be published in Proc. of ICPR 201
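The character-gradient cue can be sketched with a simple sharpness statistic: blur spreads a character's edges out, lowering the gradient energy of its patch. The mean squared gradient magnitude below is a hedged stand-in for the paper's exact gradient feature, applied to toy patches.

```python
import numpy as np

def sharpness_score(patch):
    """Mean squared gradient magnitude of a character patch; sharper
    (higher-quality) characters concentrate their intensity change
    into large gradients, so blur lowers this score."""
    gy, gx = np.gradient(patch.astype(float))
    return float(np.mean(gx ** 2 + gy ** 2))

# A crisp step edge vs. the same edge smeared into a ramp.
sharp = np.zeros((8, 8)); sharp[:, 4:] = 1.0
blurred = np.zeros((8, 8))
blurred[:, 3:7] = [0.25, 0.5, 0.75, 1.0]
blurred[:, 7] = 1.0
print(sharpness_score(sharp) > sharpness_score(blurred))  # True
```

A full method would aggregate such scores over MSER-detected character patches into one document-level quality estimate.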