107 research outputs found

    EFFICIENT IMAGE COMPRESSION AND DECOMPRESSION ALGORITHMS FOR OCR SYSTEMS

    This paper presents efficient new image compression and decompression methods for document images, intended for use in the pre-processing stage of an OCR system designed for the needs of the “Nikola Tesla Museum” in Belgrade. The proposed compression methods exploit the Run-Length Encoding (RLE) algorithm and an algorithm based on document character contour extraction, while an iterative scanline fill algorithm is used for decompression. The methods are compared with the JBIG2 and JPEG2000 image compression standards, and segmentation accuracy results on ground-truth documents are obtained to evaluate them. The results show that the proposed methods outperform JBIG2 in time complexity, providing up to 25 times lower processing time at the expense of a worse compression ratio, and outperform the JPEG2000 standard, providing up to a 4-fold improvement in compression ratio. Finally, the time complexity results show that the presented methods are fast enough for a real-time character segmentation system.
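
    As a concrete illustration of the RLE building block this method starts from, here is a minimal sketch of run-length encoding and decoding for one row of a binary document image. The function names and pure-Python representation are illustrative assumptions, not the authors' implementation.

```python
def rle_encode_row(row):
    """Encode a row of 0/1 pixels as (value, run_length) pairs."""
    runs = []
    if not row:
        return runs
    current, length = row[0], 1
    for pixel in row[1:]:
        if pixel == current:
            length += 1
        else:
            runs.append((current, length))
            current, length = pixel, 1
    runs.append((current, length))
    return runs

def rle_decode_row(runs):
    """Expand (value, run_length) pairs back into a pixel row."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row

# Round-trip check on a short binary row.
row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
assert rle_decode_row(rle_encode_row(row)) == row
```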

    OCR for TIFF Compressed Document Images Directly in Compressed Domain Using Text segmentation and Hidden Markov Model

    In today's technological era, document images play an important and integral part in our day-to-day lives, and with the surge of Covid-19 in particular, digitally scanned documents have become a key source of communication that avoids infection through physical contact. Storing and transmitting scanned document images is very memory intensive, so compression techniques are used to reduce image size before archival and transmission. There are two ways to extract information from, or operate on, compressed images. The first is to decompress the image, operate on it, and then compress it again for efficient storage and transmission. The other is to exploit the characteristics of the underlying compression algorithm to process the images directly in their compressed form, without decompression and re-compression. In this paper, we propose a novel idea: an OCR for CCITT (The International Telegraph and Telephone Consultative Committee) compressed, machine-printed TIFF document images that operates directly in the compressed domain. After segmenting text regions into lines and words, an HMM is applied for recognition using the three CCITT coding modes: horizontal, vertical and pass mode. Experimental results show that OCR on the pass mode gives promising results.
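
    For orientation, the sketch below parses the three CCITT Group 4 (T.6) mode codewords the paper relies on (pass, horizontal, vertical) out of a bitstream. The codeword table follows the T.6 recommendation; the parser itself is a toy illustration, not the paper's system, and it ignores the run-length codewords that follow horizontal mode in a real stream.

```python
# Standard T.6 two-dimensional mode codewords (prefix-free).
MODE_CODES = {
    "1": "V0",         # vertical, same transition position
    "011": "VR1",      # vertical, one pixel right
    "010": "VL1",      # vertical, one pixel left
    "001": "H",        # horizontal mode (followed by run lengths in T.6)
    "0001": "P",       # pass mode
    "000011": "VR2",
    "000010": "VL2",
    "0000011": "VR3",
    "0000010": "VL3",
}

def parse_modes(bits):
    """Turn a string of '0'/'1' bits into a list of T.6 mode symbols."""
    modes, buf = [], ""
    for b in bits:
        buf += b
        if buf in MODE_CODES:
            modes.append(MODE_CODES[buf])
            buf = ""
    return modes

# '1' -> V0, '0001' -> P, '001' -> H, '011' -> VR1
print(parse_modes("10001001011"))  # ['V0', 'P', 'H', 'VR1']
```

    A mode sequence recovered this way is the kind of compressed-domain observation stream that could be handed to an HMM recognizer.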

    Application of Stochastic Diffusion for Hiding High Fidelity Encrypted Images

    Cryptography coupled with information hiding has received increased attention in recent years and has become a major research theme because of the importance of protecting encrypted information in any Electronic Data Interchange system in a way that is both discreet and covert. One essential limitation of any cryptographic system is that the encrypted data itself signals its importance, which arouses suspicion and makes it vulnerable to attack. Information hiding, or Steganography, provides a potential solution to this issue by making the data imperceptible; the security of the hidden information is threatened only if its existence is detected through Steganalysis. This paper focuses on methods for hiding encrypted information, specifically methods that encrypt data before embedding it in host data, where the ‘data’ is a full colour digital image. Such methods provide a greater level of data security, especially when the information is to be transmitted over the Internet, since a potential attacker must first detect, then extract, and then decrypt the embedded data in order to recover the original information. After an extensive survey of the current methods available, we present a new method of encrypting and then hiding full colour images in three full colour host images without loss of fidelity following data extraction and decryption. The applications of this technique, which is based on an approach called ‘Stochastic Diffusion’, are wide-ranging and include covert image information interchange, digital image authentication, video authentication, copyright protection and digital rights management of image data in general.
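
    To make the encrypt-then-embed principle concrete, here is a minimal sketch: the secret bytes are XOR-masked with a key-derived stream (a simple stand-in cipher, not the authors' Stochastic Diffusion) and the resulting bits are hidden in the least significant bits of a host image. All names and parameters are illustrative assumptions.

```python
import hashlib
import numpy as np

def keystream(key, n):
    """Derive n pseudo-random bytes from a key via iterated SHA-256."""
    out, block = b"", key.encode()
    while len(out) < n:
        block = hashlib.sha256(block).digest()
        out += block
    return np.frombuffer(out[:n], dtype=np.uint8)

def embed(host, secret, key):
    """Encrypt the secret bytes, then place their bits in the host's LSBs."""
    cipher = np.frombuffer(secret, dtype=np.uint8) ^ keystream(key, len(secret))
    bits = np.unpackbits(cipher)
    flat = host.reshape(-1).copy()
    assert bits.size <= flat.size, "host too small for payload"
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(host.shape)

def extract(stego, n_bytes, key):
    """Recover and decrypt n_bytes of payload from the stego image."""
    bits = stego.reshape(-1)[:n_bytes * 8] & 1
    cipher = np.packbits(bits)
    return (cipher ^ keystream(key, n_bytes)).tobytes()

# Round trip on a random 8-bit host image.
host = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
stego = embed(host, b"covert message", "shared-key")
assert extract(stego, 14, "shared-key") == b"covert message"
```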

    MINHLP: Module to Identify New Hampshire License Plates

    A license plate, referred to simply as a plate or vehicle registration plate, is a small plastic or metal plate attached to a motor vehicle for official identification purposes. Most governments require a registration plate to be attached to both the front and rear of a vehicle, although certain jurisdictions or vehicle types, such as motorcycles, require only one plate, usually attached to the rear of the vehicle. We present an analysis of Automatic License Plate Recognition (ALPR) of New Hampshire (NH) plates using open source products. This thesis contains an implementation of a demonstration model and an analysis of the results. OpenCV (a computer vision library) and Tesseract (an open source optical character reader) are presented as the core intelligent infrastructure. The thesis explains the mathematical principles and algorithms used for number plate detection, along with the processes of character segmentation, normalization and recognition. A description of the challenges involved in detecting and reading license plates in NH, previous studies done by others, and the strategies adopted to solve them is also given.
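
    A minimal sketch of an OpenCV + Tesseract pipeline of the kind this thesis builds on: locate a plate-shaped contour, crop and binarize it, and pass the crop to Tesseract. The thresholds, aspect-ratio bounds and image path are illustrative assumptions, not values from the thesis.

```python
import cv2
import pytesseract

def read_plate(path):
    """Return the OCR'd text of the first plate-like region, or None."""
    image = cv2.imread(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Smooth while keeping edges, then find edge contours.
    edges = cv2.Canny(cv2.bilateralFilter(gray, 11, 17, 17), 30, 200)
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in sorted(contours, key=cv2.contourArea, reverse=True):
        x, y, w, h = cv2.boundingRect(c)
        if 2.0 < w / float(h) < 6.0:  # plausible plate aspect ratio
            crop = gray[y:y + h, x:x + w]
            _, binary = cv2.threshold(crop, 0, 255,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            # --psm 7: treat the crop as a single line of text.
            return pytesseract.image_to_string(
                binary, config="--psm 7").strip()
    return None

print(read_plate("car.jpg"))
```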

    Application and Theory of Multimedia Signal Processing Using Machine Learning or Advanced Methods

    This Special Issue is a book that collects peer-reviewed papers on various advanced technologies related to the applications and theory of signal processing for multimedia systems using machine learning or other advanced methods. Multimedia signals include image, video, audio, character recognition and the optimization of communication channels for networks. The specific topics covered in this book are data hiding, encryption, object detection, image classification, and character recognition. Academics and colleagues who are interested in these topics will find it interesting to read.

    Text detection and recognition in images and video sequences

    Text characters embedded in images and video sequences represent a rich source of information for content-based indexing and retrieval applications. However, these text characters are difficult to detect and recognize due to their varying sizes, grayscale values and complex backgrounds. This thesis investigates methods for building an efficient application system for detecting and recognizing text of any grayscale value embedded in images and video sequences. Both empirical image processing methods and statistical machine learning and modeling approaches are studied in two sub-problems: text detection and text recognition. Applying machine learning methods to text detection is difficult because of character size and grayscale variations and the heavy computational cost. To overcome these problems, we propose a two-step localization/verification approach. The first step quickly localizes candidate text lines, enabling the normalization of characters to a unique size. In the verification step, a trained support vector machine or multi-layer perceptron is applied to background-independent features to remove false alarms. Text recognition, even from detected text lines, remains challenging due to the variety of fonts and colors, the presence of complex backgrounds and the short length of the text strings. Two schemes are investigated to address the text recognition problem: a bi-modal enhancement scheme and a multi-modal segmentation scheme. In the bi-modal scheme, we propose a set of filters to enhance the contrast of black and white characters and produce a better binarization before recognition. For more general cases, text recognition is addressed by a text segmentation step followed by a traditional optical character recognition (OCR) algorithm within a multi-hypotheses framework. In the segmentation step, we model the distribution of the grayscale values of pixels using a Gaussian mixture model or a Markov random field. The resulting multiple segmentation hypotheses are post-processed by connected component analysis and a grayscale consistency constraint algorithm, and finally processed by OCR software. A selection algorithm based on language modeling and OCR statistics chooses the final result from all the produced text strings. Additionally, methods for using the temporal information of video text are investigated. A Monte Carlo video text segmentation method is proposed for adapting the segmentation parameters across temporal text frames. Furthermore, a ROVER (Recognizer Output Voting Error Reduction) algorithm is studied for improving the final recognized text string by voting on the characters across temporal frames.
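
    The multi-hypotheses segmentation step can be illustrated with a short sketch: model pixel grayscale values with a Gaussian mixture and, for each hypothesis, treat the pixels assigned to one mixture component as the text layer. scikit-learn's GaussianMixture stands in here for the thesis's own GMM/MRF estimation; all names are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def segmentation_hypotheses(gray, n_layers=3):
    """Return one binary mask per mixture component (one per hypothesis)."""
    pixels = gray.reshape(-1, 1).astype(np.float64)
    gmm = GaussianMixture(n_components=n_layers, random_state=0).fit(pixels)
    labels = gmm.predict(pixels).reshape(gray.shape)
    # Each component is, in turn, hypothesised to be the text layer;
    # downstream, an OCR engine scores every hypothesis and a selection
    # step keeps the best-reading result.
    return [(labels == k).astype(np.uint8) * 255 for k in range(n_layers)]

gray = np.random.randint(0, 256, (32, 120), dtype=np.uint8)
for mask in segmentation_hypotheses(gray):
    print(mask.mean())
```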

    Building the big picture : enhanced resolution from coding

    Thesis (M.S.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1994. By Roger George Kermode. Includes bibliographical references (leaves 105-107).