Techniques for document image processing in compressed domain

Author: Deng Shulan
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/2008

The main objective of image compression is usually taken to be the minimization of storage space. However, as the need to access images frequently increases, it is becoming more important to process the compressed representation directly. In this work, techniques that can be applied directly and efficiently to digital information encoded by a given compression algorithm are investigated. Lossless compression schemes and information processing algorithms for binary document images and text data are two closely related areas, bridged by the fast processing of coded data. The compressed domains addressed in this work, the ITU fax standards and the JBIG standard, are two major schemes used for document compression. Based on ITU Group IV, a modified coding scheme, MG4, which exploits the 2-dimensional correlation between scan lines, is developed. The MG4 coding principle and its feature-preserving behavior in the compressed domain are examined from the viewpoints of compression efficiency and processing flexibility of image operations. Two popular coding schemes for bi-level image compression, run-length and Group IV, are studied and compared with MG4 in three respects: compression complexity, compression ratio, and feasibility of compressed-domain algorithms. In particular, for connected component extraction, skew detection, and rotation, MG4 shows a significant speed advantage over conventional algorithms. Some useful techniques for processing JBIG-encoded images directly in the compressed domain, or concurrently while they are being decoded, are proposed and generalized.
In the second part of this work, the possibility of facilitating image processing in the wavelet transform domain is investigated. Textured images can be distinguished from each other by examining their wavelet transforms. The basic idea is that highly textured regions can be segmented using feature vectors extracted from high-frequency bands. This rests on the observation that textured images have large energies in both high and middle frequencies, while images in which the grey level varies smoothly are heavily dominated by the low-frequency channels in the wavelet transform domain. As a result, a new method is developed and implemented to detect textures and abnormalities in document images using polynomial wavelets. Segmentation experiments indicate that this approach is superior to other traditional methods in terms of memory space and processing time.
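
The band-energy observation in the abstract above can be illustrated with a minimal sketch. It assumes a one-level Haar transform rather than the polynomial wavelets the thesis uses, and the function names (`haar2d`, `detail_energy_ratio`) are hypothetical. It only shows that a textured block puts a larger share of its energy into the detail bands than a smooth block does.

```python
def haar2d(block):
    """One-level 2D orthonormal Haar transform (assumed stand-in wavelet).

    `block` is a list of rows with even height and width.
    Returns (approx, d_cols, d_rows, d_diag), each a list of rows.
    """
    approx, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, len(block), 2):
        row_a, row_c, row_r, row_d = [], [], [], []
        for c in range(0, len(block[0]), 2):
            a, b = block[r][c], block[r][c + 1]
            d, e = block[r + 1][c], block[r + 1][c + 1]
            row_a.append((a + b + d + e) / 2.0)  # low-frequency average
            row_c.append((a - b + d - e) / 2.0)  # difference across columns
            row_r.append((a + b - d - e) / 2.0)  # difference across rows
            row_d.append((a - b - d + e) / 2.0)  # diagonal difference
        approx.append(row_a)
        d_cols.append(row_c)
        d_rows.append(row_r)
        d_diag.append(row_d)
    return approx, d_cols, d_rows, d_diag


def energy(band):
    """Sum of squared coefficients in one sub-band."""
    return sum(v * v for row in band for v in row)


def detail_energy_ratio(block):
    """Share of total energy that falls in the three detail bands."""
    approx, d_cols, d_rows, d_diag = haar2d(block)
    detail = energy(d_cols) + energy(d_rows) + energy(d_diag)
    total = detail + energy(approx)
    return detail / total if total else 0.0


smooth = [[100] * 4 for _ in range(4)]
checker = [[255 * ((r + c) % 2) for c in range(4)] for r in range(4)]
print(detail_energy_ratio(smooth), detail_energy_ratio(checker))
```

For the flat block the ratio is 0.0, and for the checkerboard it is 0.5. A segmentation rule would compare such a ratio (or a vector of per-band energies) against a threshold chosen for the data, which this sketch does not attempt.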

University of Nevada, Las Vegas Repository

Probabilistic framework for image understanding applications using Bayesian Networks

Author: Jaber Mustafa
Publication venue: RIT Scholar Works
Publication date: 01/12/2011

Machine learning algorithms have been successfully utilized in various systems and devices. They can improve the usability and quality of such systems in terms of intelligent user interfaces, fast performance, and, more importantly, high accuracy. In this research, machine learning techniques are used in the field of image understanding, a research area shared by image analysis and computer vision, to apply higher-level processing to a target image and make sense of the scene captured in it. A general probabilistic framework for image understanding is discussed, covering (i) collection of images to generate a comprehensive and valid database, (ii) generation of an unbiased ground truth for that database, (iii) selection of classification features and elimination of redundant ones, and (iv) use of this information to test a new sample set. Two research projects have been developed as examples of the general image understanding framework: identification of region(s) of interest, and image segmentation evaluation. These techniques, in addition to others, are combined in an object-oriented rendering system for printing applications. The discussion in this doctoral dissertation explores the means of developing such a system from an image understanding/processing perspective. Note that this work does not aim to develop a printing system. It only proposes adding some essential features to current printing pipelines to achieve better visual quality when printing images and photos. Hence, we assume that image regions have been successfully extracted from the printed document. These images are used as input to the proposed object-oriented rendering algorithm, where methodologies for color image segmentation, region-of-interest identification, and semantic feature extraction are employed. Probabilistic approaches based on Bayesian statistics have been utilized to develop the proposed image understanding techniques.

RIT Scholar Works

Use of Perceptive Vision for Ruling Recognition in Ancient Documents

Author: Camillerapp Jean
Coüasnon Bertrand
Lemaitre Aurélie
Publication venue: Springer Berlin/Heidelberg
Publication date: 01/01/2010

Rulings are graphical primitives that are essential for document structure recognition. However, in the case of ancient documents, bad printing techniques or bad conditions of conservation cause problems for their efficient recognition. Consequently, usual line segment extractors are not powerful enough to properly extract all the rulings of a heterogeneous document. In this paper, we propose a new method for ruling recognition, based on perceptive vision: we show that combining several levels of vision improves ruling recognition. Thus, it is possible to put forward hypotheses on the nature of the rulings at a given resolution, and to confirm or infirm their presence and find their exact position at higher resolutions. We propose an original strategy of cooperation between resolutions and present tools to set up a correspondence between the elements extracted at each resolution. We validate this approach on images of ancient newspaper pages (dated between 1848 and 1944). We also propose to use the extracted rulings for the structure analysis of newspaper pages. We show that using more reliable extracted rulings simplifies and improves document structure recognition.

HAL-Rennes 1

A new approach to face recognition using Curvelet Transform

Author: Mandal Tanaya
Publication venue: University of Windsor Leddy Library
Publication date: 01/01/2008

Multiresolution tools have been widely employed in face recognition. The Wavelet Transform is the best known of these tools and is widely used for identification of human faces. Recently, following the success of wavelets, a number of new multiresolution tools have been developed. The Curvelet Transform is a recent addition to that list. It has better directional ability and an effective capability for representing curved edges. These two properties make the curvelet transform a powerful tool for extracting edge information from facial images. Our work aims at exploring the possibilities of the curvelet transform for feature extraction from human faces in order to introduce a new alternative approach to face recognition.

Scholarship at UWindsor

DWT-CompCNN: Deep Image Classification Network for High Throughput JPEG 2000 Compressed Documents

Author: Bisen Tejasvee
Javed Mohammed
Kirtania Shashank
Nagabhushan P.
Publication date: 02/06/2023

For any digital application involving document images, such as retrieval, the classification of document images is an essential stage. Conventionally, the full versions of the documents, that is, the uncompressed document images, make up the input dataset, which is a burden because of the large volume needed to hold them. Therefore, it would be novel if the same classification task could be accomplished directly (with some partial decompression) on the compressed representation of documents, in order to make the whole process computationally more efficient. In this research work, a novel deep learning model, DWT-CompCNN, is proposed for classification of documents that are compressed using the High Throughput JPEG 2000 (HTJ2K) algorithm. The proposed DWT-CompCNN comprises five convolutional layers with filter sizes of 16, 32, 64, 128, and 256 consecutively for each increasing layer, to improve learning from the wavelet coefficients extracted from the compressed images. Experiments are performed on two benchmark datasets, Tobacco-3482 and RVL-CDIP, which demonstrate that the proposed model is time and space efficient and also achieves better classification accuracy in the compressed domain.
Comment: In Springer Journal - Pattern Analysis and Applications under Minor Revision
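
To give a rough sense of scale for the five-layer stack described above, here is a minimal sketch that counts trainable parameters for a plain sequence of 2D convolutions with those filter counts. Only the list 16, 32, 64, 128, 256 comes from the abstract. The 3x3 kernel, the single input channel, and the function name `conv_stack_params` are assumptions made for illustration, and the actual DWT-CompCNN may include other layers.

```python
# Filter counts per convolutional layer, as listed in the abstract.
FILTERS = [16, 32, 64, 128, 256]


def conv_stack_params(in_channels=1, kernel=3, filters=FILTERS):
    """Count weights plus biases for a plain stack of 2D convolutions.

    `in_channels` and `kernel` are assumed values, not taken from the paper.
    Returns (per_layer_counts, total).
    """
    per_layer = []
    for out_channels in filters:
        weights = kernel * kernel * in_channels * out_channels
        biases = out_channels
        per_layer.append(weights + biases)
        in_channels = out_channels  # next layer consumes this layer's output
    return per_layer, sum(per_layer)


per_layer, total = conv_stack_params()
print(per_layer, total)
```

Under these assumptions the convolutional layers hold 392,320 parameters in total, most of them in the last two layers.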

arXiv.org e-Print Archive

Development of a Recognizer for Bangla Text: Present Status and Future Challenges

Author: Hasan Sarwar
Mofizur Rahman
Nasreen Akter
Saima Hossain
Publication venue: IntechOpen
Publication date: 17/08/2010

IntechOpen

Wavelet-based image registration and segmentation framework for the quantitative evaluation of hydrocephalus

Author: Luo Fan
Publication venue: Halifax, N.S. : Saint Mary's University
Publication date: 01/01/2006

xi, 100 leaves : ill. (some col.) ; 29 cm. Includes abstract. Includes bibliographical references (leaves 94-100).
Hydrocephalus, a condition of increased fluid in the brain, is traditionally diagnosed by a visual assessment of CT scans. This thesis developed a quantitative measure of the change in ventricular volume over time. The framework includes adaptive registration based on mutual information and wavelet multiresolution analysis, adaptive segmentation with a novel feature extraction method based on Dual-Tree Complex Wavelet Transform (DT-CWT) coefficients, and a volume calculation. When tested on physical phantoms, the framework had a volume calculation accuracy of 1.0%. When tested on 8 clinical cases, the results reflected and predicted the doctors' diagnoses, with less than 5% calculated volume change for cases where the diagnosis indicated the patient was stable, and more than 20% calculated volume change for cases in which hydrocephalus had been diagnosed. The outcome illustrated that the framework has good potential for development as a tool to aid in the diagnosis of hydrocephalus.
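
The final step of the framework, the volume-change figure, can be sketched as follows. This is a minimal illustration that assumes the change is reported as an absolute percentage of the earlier volume. The function names are hypothetical, and the 5% and 20% bounds are simply the ranges reported for the 8 clinical cases above, not a validated diagnostic rule.

```python
def percent_volume_change(v_before, v_after):
    """Absolute change in ventricular volume as a percentage of `v_before`."""
    if v_before <= 0:
        raise ValueError("v_before must be positive")
    return 100.0 * abs(v_after - v_before) / v_before


def compare_with_reported_ranges(change_pct):
    """Place a change next to the ranges reported in the abstract.

    Illustrative only: the bounds describe the 8 reported cases.
    """
    if change_pct < 5.0:
        return "within the range reported for stable cases"
    if change_pct > 20.0:
        return "within the range reported for hydrocephalus cases"
    return "between the reported ranges"


change = percent_volume_change(40.0, 50.0)  # hypothetical volumes, any unit
print(change, compare_with_reported_ranges(change))
```

The middle branch is kept separate because the abstract reports nothing about changes between 5% and 20%.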

Saint Mary's University, Halifax: Institutional Repository

Normal and abnormal tissue identification system and method for medical images such as digital mammograms

Author: Clarke Laurence P.
Cullers David Kent
Deans Stanley R.
Heine John J.
Stauduhar Richard Paul
Publication date: 30/10/2001

A system and method for analyzing a medical image to determine whether an abnormality is present, for example, in digital mammograms, includes the application of a wavelet expansion to a raw image to obtain subspace images of varying resolution. At least one subspace image is selected that has a resolution commensurate with a desired predetermined detection resolution range. A functional form of a probability distribution function is determined for each selected subspace image, and an optimal statistical normal image region test is determined for each selected subspace image. A threshold level for the probability distribution function is established from the optimal statistical normal image region test for each selected subspace image. A region size comprising at least one sector is defined, and an output image is created that includes a combination of all regions for each selected subspace image. Each region has a first value when the region intensity level is above the threshold and a second value when the region intensity level is below the threshold. This permits the localization of a potential abnormality within the image
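
The last step described above, assigning each region one of two values by comparing its intensity with a threshold, can be sketched as below. This is a minimal illustration under stated assumptions: the region intensity is taken to be the mean over a square sector, the threshold is passed in directly rather than derived from a probability distribution function, and the name `region_map` is hypothetical.

```python
def region_map(image, size, threshold, first=1, second=0):
    """Label each size-by-size region of a sub-band image.

    A region gets `first` when its mean intensity is above `threshold`
    and `second` otherwise. Using the mean is an assumption made here.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, size):
        out_row = []
        for c in range(0, cols, size):
            cells = [
                image[i][j]
                for i in range(r, min(r + size, rows))
                for j in range(c, min(c + size, cols))
            ]
            mean = sum(cells) / len(cells)
            out_row.append(first if mean > threshold else second)
        out.append(out_row)
    return out


# Toy sub-band image with one bright 2x2 region in the top-right corner.
subband = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
print(region_map(subband, size=2, threshold=5))
```

On the toy image only the top-right region is flagged, giving `[[0, 1], [0, 0]]`. Maps computed for several sub-band images could then be combined into one output image, which this sketch leaves out.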

USFSP Digital Archive

NASA Technical Reports Server

Scholar Commons - University of South Florida