21 research outputs found

    Ridgelet-based signature for natural image classification

    This paper presents an approach to grouping natural scenes into semantically meaningful categories. The proposed approach exploits the statistics of natural scenes to define relevant image categories. A ridgelet-based signature is used to represent images. This signature feeds a support vector classifier, which is well suited to high-dimensional features, resulting in an effective recognition system. To illustrate the potential of the approach, several binary classification experiments (e.g. city/landscape or indoor/outdoor) are conducted on databases of natural scenes.
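    As a rough illustration of how an orientation-sensitive signature can feed such a classifier, the sketch below pools 2-D FFT magnitude into angular wedges. This is a simplification for illustration only, not the paper's actual ridgelet transform; the function name and bin count are assumptions, and the support vector classification stage is omitted.

```python
import numpy as np

def orientation_signature(img, n_bins=8):
    """Pool 2-D FFT magnitude into angular wedges.

    A crude stand-in for a ridgelet-style signature: both summarize how
    image energy is distributed across orientations.
    """
    h, w = img.shape
    mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    yy, xx = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    theta = np.mod(np.arctan2(yy, xx), np.pi)      # orientation in [0, pi)
    mag[np.hypot(yy, xx) < 2] = 0.0                # discard the DC region
    bins = np.minimum((theta / np.pi * n_bins).astype(int), n_bins - 1)
    sig = np.array([mag[bins == b].sum() for b in range(n_bins)])
    return sig / sig.sum()                         # normalize to unit mass

# A pattern varying along x puts its energy at orientation 0 in this
# convention; its transpose puts it at pi/2 (bin 4 of 8).
x = np.arange(64)
vertical = np.tile(np.sin(2 * np.pi * 8 * x / 64), (64, 1))
print(np.argmax(orientation_signature(vertical)))    # 0
print(np.argmax(orientation_signature(vertical.T)))  # 4
```

    A classifier then operates on these fixed-length signatures instead of raw pixels.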

    The aceToolbox: low-level audiovisual feature extraction for retrieval and classification

    In this paper we present an overview of a software platform developed within the aceMedia project, termed the aceToolbox, that provides global and local low-level feature extraction from audio-visual content. The toolbox is based on the MPEG-7 eXperimental Model (XM), with extensions to provide descriptor extraction from arbitrarily shaped image segments, thereby supporting local descriptors reflecting real image content. We describe the architecture of the toolbox and provide an overview of the descriptors supported to date. We also briefly describe the segmentation algorithm provided. We then demonstrate the usefulness of the toolbox in the context of two different content processing scenarios: similarity-based retrieval in large collections and scene-level classification of still images.

    Directional edge and texture representations for image processing

    An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets are not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns, while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the case in which the image is corrupted by noise. The problem is tackled by combining a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) are also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, on the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature. Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.
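    The frequency-filtering idea can be sketched as a plain global Gaussian low-pass in the 2-D Fourier domain. Note the thesis matches the filter to the dispersion of the local MFT magnitude spectrum; the single fixed global Gaussian and the `sigma` value below are simplifying assumptions.

```python
import numpy as np

def gaussian_frequency_filter(img, sigma):
    """Attenuate high frequencies with a Gaussian transfer function.

    The thesis adapts the filter to the local MFT magnitude spectrum;
    this sketch applies one fixed global Gaussian instead.
    """
    fy = np.fft.fftfreq(img.shape[0])[:, None]   # cycles per sample
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    H = np.exp(-(fx**2 + fy**2) / (2 * sigma**2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * H))

# Denoising a smooth pattern: the filtered image is closer to the original.
rng = np.random.default_rng(0)
n = 64
yy, xx = np.mgrid[0:n, 0:n]
clean = np.sin(2 * np.pi * 2 * (xx + yy) / n)
noisy = clean + 0.5 * rng.standard_normal((n, n))
denoised = gaussian_frequency_filter(noisy, sigma=0.08)
print(np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2))  # True
```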

    Pattern detection and recognition using over-complete and sparse representations

    Recent research in harmonic analysis and mammalian vision systems has revealed that over-complete and sparse representations play an important role in visual information processing. Applying such representations to pattern recognition and detection problems has become an interesting field of study. The main contribution of this thesis is to propose two feature extraction strategies - the global strategy and the local strategy - to make use of these representations. In the global strategy, over-complete and sparse transformations are applied to the input pattern as a whole and features are extracted in the transformed domain. This strategy has been applied to the problems of rotation-invariant texture classification and script identification, using the Ridgelet transform. Experimental results show better performance than the Gabor multi-channel filtering method and wavelet-based methods. The local strategy is divided into two stages. The first is to analyze the local over-complete and sparse structure, where the input 2-D patterns are divided into patches and the local over-complete and sparse structure is learned from these patches using sparse approximation techniques. The second stage concerns the application of the local over-complete and sparse structure. For an object detection problem, we propose a sparsity testing technique, where a local over-complete and sparse structure is built to give sparse representations to the text patterns and non-sparse representations to other patterns. Object detection is achieved by identifying patterns that can be sparsely represented by the learned structure. This technique has been applied to detect text in scene images with a recall rate of 75.23% (about 6% improvement compared with other works) and a precision rate of 67.64% (about 12% improvement). 
    For applications like character or shape recognition, the learned over-complete and sparse structure is combined with a Convolutional Neural Network (CNN). A second text detection method is proposed based on such a combination to further improve the accuracy of text detection in scene images (about 11% higher than our first method based on sparsity testing). Finally, this method has been applied to handwritten Farsi numeral recognition, obtaining a 99.22% recognition rate on the CENPARMI Database and a 99.5% recognition rate on the HODA Database. Meanwhile, an SVM with gradient features achieves recognition rates of 98.98% and 99.22% on these databases respectively.
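    The sparsity testing idea can be sketched with a greedy Orthogonal Matching Pursuit: a pattern is accepted if a few dictionary atoms reconstruct it within tolerance. The tiny hand-built dictionary below is an illustrative assumption; the thesis learns its over-complete structure from text patches.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal Matching Pursuit: greedily pick up to k atoms of D for x."""
    residual, support = x.astype(float), []
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return support, residual

def looks_sparse(D, x, k, tol=1e-3):
    """Sparsity test: accept x if <= k atoms reconstruct it within tolerance."""
    _, r = omp(D, x, k)
    return np.linalg.norm(r) <= tol * np.linalg.norm(x)

# Toy over-complete dictionary: the standard basis plus one extra atom.
D = np.column_stack([np.eye(4), np.full(4, 0.5)])
u = np.full(4, 0.5)                    # exactly one atom: very sparse
v = np.array([0.5, -0.5, 0.5, -0.5])   # needs all four basis atoms
print(looks_sparse(D, u, 1), looks_sparse(D, v, 1))  # True False
```

    Detection then amounts to sliding this test over image patches and keeping those that pass.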

    Design and analysis of a content-based image retrieval system

    The automatic retrieval of images according to the similarity of their content is a challenging task with many fields of application. In this book, the automatic retrieval of images according to human spontaneous perception, without further effort or knowledge, is considered. A system is therefore designed and analyzed. Methods for the detection and extraction of regions and for the extraction and comparison of color, shape, and texture features are also investigated.

    Color image quality measures and retrieval

    The focus of this dissertation is mainly on color images, especially images with lossy compression. Issues related to color quantization, color correction, color image retrieval and color image quality evaluation are addressed. A no-reference color image quality index is proposed. A novel color correction method applied to low bit-rate JPEG images is developed. A novel method for content-based image retrieval based upon combined feature vectors of shape, texture, and color similarities is suggested. In addition, an image-specific color reduction method is introduced, which allows a 24-bit JPEG image to be shown on an 8-bit color monitor with a 256-color display. The reduction in download and decode time mainly comes from the smart encoder incorporating the proposed color reduction method after the color-space conversion stage. To summarize, the methods that have been developed can be divided into two categories: one is visual representation, and the other is image quality measurement. Three algorithms are designed for visual representation: (1) An image-based visual representation for color correction on low bit-rate JPEG images. Previous studies on color correction focus mainly on color calibration among devices; little attention has been paid to compressed images, whose color distortion is evident at low JPEG bit rates. In this dissertation, a lookup table algorithm is designed based on the loss of PSNR at different compression ratios. (2) A feature-based representation for content-based image retrieval: a concatenated vector of color, shape, and texture features from a region of interest (ROI). (3) An image-specific 256-color (8-bit) reproduction for color reduction from 16 million colors (24 bits). By inserting the proposed color reduction method into a JPEG encoder, the image size and the transmission time are both reduced. This smart encoder enables its decoder to spend less time decoding.
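    A generic 256-color reduction can be sketched as uniform 3-3-2 quantization (3 bits red, 3 green, 2 blue). This is not the dissertation's image-specific method, which adapts the palette to each image; it only illustrates the 24-bit to 8-bit mapping and its reconstruction palette.

```python
import numpy as np

def quantize_332(rgb):
    """Map 24-bit RGB to an 8-bit index: 3 bits red, 3 green, 2 blue."""
    r = rgb[..., 0] >> 5
    g = rgb[..., 1] >> 5
    b = rgb[..., 2] >> 6
    return (r << 5) | (g << 2) | b

def palette_332():
    """Reconstruction palette: one 24-bit color per 8-bit index."""
    idx = np.arange(256)
    r = ((idx >> 5) & 0b111) * 255 // 7
    g = ((idx >> 2) & 0b111) * 255 // 7
    b = (idx & 0b11) * 255 // 3
    return np.stack([r, g, b], axis=-1).astype(np.uint8)

img = np.array([[[255, 0, 0], [255, 255, 255]]], dtype=np.uint8)
print(quantize_332(img))                       # indices 224 and 255
print(palette_332()[quantize_332(img)][0, 0])  # pure red survives exactly
```

    An image-specific method would instead choose the 256 palette entries from the image's own color distribution, trading this fixed mapping for lower quantization error.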
    Three algorithms are designed for image quality measures (IQM): (1) A referenced IQM based upon image representation in very low dimension. Previous studies on IQMs operate in high-dimensional domains, including the spatial and frequency domains. In this dissertation, a low-dimensional IQM based on random projection is designed, preserving the accuracy of the IQM computed in the high-dimensional domain. (2) A no-reference image blurring metric: based on the edge gradient, the degree of image blur can be measured. (3) A no-reference color IQM based upon colorfulness, contrast and sharpness.
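    The distance-preservation property that makes a low-dimensional referenced IQM possible can be sketched with a Gaussian random projection: by the Johnson-Lindenstrauss lemma, pairwise distances approximately survive the projection. The dimensions below are arbitrary assumptions for illustration.

```python
import numpy as np

def random_project(X, d, seed=0):
    """Project row vectors to d dimensions with a scaled Gaussian matrix.

    The Johnson-Lindenstrauss lemma guarantees pairwise distances are
    approximately preserved, which is what lets a distance-based quality
    measure be evaluated in the low-dimensional domain.
    """
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], d)) / np.sqrt(d)
    return X @ R

# Two "images" as 4096-dimensional vectors, projected to 256 dimensions.
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 4096))
Y = random_project(X, 256)
d_hi = np.linalg.norm(X[0] - X[1])
d_lo = np.linalg.norm(Y[0] - Y[1])
print(abs(d_lo - d_hi) / d_hi)   # small relative distortion
```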

    Instance segmentation and material classification in X-ray computed tomography

    Over the past thirty years, X-Ray Computed Tomography (CT) has been widely used in security screening thanks to its high resolution and full 3-D reconstruction. Designing object segmentation and classification algorithms based on reconstructed CT intensity data will help accurately locate and classify potentially hazardous articles in luggage. Proposal-based deep networks have recently been successful in segmentation and recognition tasks; however, they require large amounts of labeled training images, which are hard to obtain in CT research. This thesis develops a non-proposal 3-D instance segmentation and classification structure based on smoothed fully convolutional networks (FCNs), graph-based spatial clustering and an ensemble of kernel SVMs using volumetric texture features, which can be trained on limited and highly unbalanced CT intensity data. Our structure not only significantly accelerates training convergence in the FCN, but also efficiently detects and removes outlier voxels in the training data and guarantees high and stable material classification performance. We demonstrate the performance of our approach on experimental volumetric images of containers obtained using a medical CT scanner.
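    The spatial clustering step can be sketched as connected-component labeling of the voxels a network marks as object: 6-connected voxels are grouped into one instance. The BFS flood fill below is a generic stand-in, not the thesis's graph-based clustering algorithm.

```python
import numpy as np
from collections import deque

def label_components_3d(mask):
    """Group 6-connected True voxels of a boolean volume into instances."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue                      # voxel already assigned to an instance
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:                      # BFS flood fill of one component
            z, y, x = queue.popleft()
            for dz, dy, dx in steps:
                n = (z + dz, y + dy, x + dx)
                if all(0 <= n[i] < mask.shape[i] for i in range(3)) \
                        and mask[n] and not labels[n]:
                    labels[n] = count
                    queue.append(n)
    return labels, count

# Two separated blobs become two instances.
vol = np.zeros((5, 5, 5), dtype=bool)
vol[0:2, 0:2, 0:2] = True
vol[3:5, 3:5, 3:5] = True
labels, n_instances = label_components_3d(vol)
print(n_instances)   # 2
```

    Each labeled instance would then be passed to the material classification stage.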

    Signal processing with Fourier analysis, novel algorithms and applications

    Fourier analysis is the study of the way general functions may be represented or approximated by sums of simpler trigonometric functions, also known as sinusoidal modeling. Fourier's original idea had a profound impact on mathematical analysis, physics and engineering because it diagonalizes time-invariant convolution operators. In the past, signal processing stayed almost exclusively within electrical engineering, where only experts could cancel noise, compress and reconstruct signals. Nowadays it is almost ubiquitous, as everyone deals with modern digital signals. Medical imaging, wireless communications and power systems of the future will face more demanding data processing conditions and a wider range of application requirements than the systems of today. Such systems will require more powerful, efficient and flexible signal processing algorithms that are well designed to handle such needs. No matter how advanced our hardware technology becomes, we will still need intelligent and efficient algorithms to address the growing demands in signal processing. In this thesis, we investigate novel techniques to solve a suite of four fundamental problems in signal processing that have a wide range of applications. The relevant equations, the literature of signal processing applications, the analysis and the final numerical algorithms/methods to solve them using Fourier analysis are discussed for different applications in electrical engineering and computer science.
    The first four chapters cover the following topics of central importance in the field of signal processing:
    • Fast Phasor Estimation using Adaptive Signal Processing (Chapter 2)
    • Frequency Estimation from Nonuniform Samples (Chapter 3)
    • 2D Polar and 3D Spherical Polar Nonuniform Discrete Fourier Transform (Chapter 4)
    • Robust 3D Registration using Spherical Polar Discrete Fourier Transform and Spherical Harmonics (Chapter 5)
    Even though these four methods may seem completely disparate, the underlying motivation remains the same: more efficient processing by exploiting Fourier-domain signal structure. The main contribution of this thesis is the innovation in the analysis, synthesis and discretization of certain well-known problems such as phasor estimation, frequency estimation, computation of a particular non-uniform Fourier transform, and signal registration in the transformed domain. We propose and evaluate application-relevant algorithms such as a frequency estimation algorithm using non-uniform sampling and the polar and spherical polar Fourier transforms. The techniques proposed are also useful in computer vision and medical imaging. From a practical perspective, the proposed algorithms are shown to improve on existing solutions in the fields where they are applied and evaluated. The formulation and final proposition are shown to have a variety of benefits. Future work with potential in medical imaging, directional wavelets, volume rendering, video/3D object classification and high-dimensional registration is also discussed in the final chapter. Finally, in the spirit of reproducible research, we release the implementation of these algorithms to the public on GitHub.
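    The frequency-estimation-from-nonuniform-samples problem (Chapter 3) can be sketched as a least-squares fit of a single tone over a candidate frequency grid. This grid search is a generic approach in the same spirit, not necessarily the thesis's algorithm, and the signal parameters below are illustrative assumptions.

```python
import numpy as np

def estimate_frequency(t, x, f_grid):
    """Estimate a single tone's frequency from nonuniformly spaced samples.

    For each candidate frequency, fit a*cos + b*sin by least squares and
    keep the frequency whose fit leaves the smallest residual.
    """
    best_f, best_res = f_grid[0], np.inf
    for f in f_grid:
        A = np.column_stack([np.cos(2 * np.pi * f * t),
                             np.sin(2 * np.pi * f * t)])
        coef, *_ = np.linalg.lstsq(A, x, rcond=None)
        res = np.linalg.norm(x - A @ coef)
        if res < best_res:
            best_f, best_res = f, res
    return best_f

# A 3.3 Hz tone sampled at 200 random (nonuniform) times over 10 seconds.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 10.0, 200))
x = np.cos(2 * np.pi * 3.3 * t + 0.7)
f_hat = estimate_frequency(t, x, np.arange(0.5, 5.0, 0.05))
print(f_hat)   # ~3.3
```

    Because the fit never assumes uniform spacing, no resampling or interpolation of the measurements is needed.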

    Feature Fusion for Fingerprint Liveness Detection

    For decades, fingerprints have been the most widely used biometric trait in identity recognition systems, thanks to their natural uniqueness, even in rare cases such as identical twins. Recently, we witnessed a growth in the use of fingerprint-based recognition systems in a large variety of devices and applications. This, as a consequence, increased the benefits for offenders capable of attacking these systems. One of the main issues with current fingerprint authentication systems is that, even though they are quite accurate in terms of identity verification, they can be easily spoofed by presenting to the input sensor an artificial replica of the fingertip skin's ridge-valley patterns. Due to the criticality of this threat, it is crucial to develop countermeasure methods capable of facing and preventing these kinds of attacks. The most effective counter-spoofing methods are those trying to distinguish between a "live" and a "fake" fingerprint before it is actually submitted to the recognition system. According to the technology used, these methods are mainly divided into hardware- and software-based systems. Hardware-based methods rely on extra sensors to gain more pieces of information regarding the vitality of the fingerprint owner. On the contrary, software-based methods merely rely on analyzing the fingerprint images acquired by the scanner. Software-based methods can then be further divided into dynamic, aimed at analyzing sequences of images to capture those vital signs typical of a real fingerprint, and static, which process a single fingerprint impression. Among these different approaches, static software-based methods come with three main benefits. First, they are cheaper, since they do not require the deployment of any additional sensor to perform liveness detection. Second, they are faster, since the information they require is extracted from the same input image acquired for the identification task. 
    Third, they are potentially capable of tackling novel forms of attack through an update of the software. Interest in this type of counter-spoofing method is at the basis of this dissertation, which addresses fingerprint liveness detection from a particular perspective, stemming from the following consideration. Generally speaking, this problem has been tackled in the literature with many different approaches. Most of them are based on first identifying the most suitable image features for the problem in analysis and then developing some classification system based on them. In particular, most published methods rely on a single type of feature to perform this task. Each of these individual features can be more or less discriminative and often highlights some peculiar characteristics of the data in analysis, often complementary to those of other features. Thus, one possible idea to improve classification accuracy is to find effective ways to combine them, in order to mutually exploit their individual strengths and, at the same time, soften their weaknesses. However, such a "multi-view" approach has been relatively overlooked in the literature. Based on the latter observation, the first part of this work investigates feature fusion methods capable of improving the generalization and robustness of fingerprint liveness detection systems and enhancing their classification strength. Then, in the second part, it approaches feature fusion in a different way: the fingerprint image is first divided into smaller parts, evidence about the liveness of each of these patches is extracted, and, finally, all these pieces of information are combined to take the final classification decision. The different approaches have been thoroughly analyzed and assessed by comparing their results (on a large number of datasets and using the same experimental protocol) with those of other works in the literature. 
    The experimental results discussed in this dissertation show that the proposed approaches are capable of obtaining state-of-the-art results, thus demonstrating their effectiveness.
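    One simple feature-level fusion scheme of the kind discussed is to z-score each descriptor and concatenate, so no single feature's numeric scale dominates the fused vector. The sketch below is a generic illustration under that assumption, not the dissertation's specific fusion method, and the descriptor names are made up.

```python
import numpy as np

def fuse_features(feature_sets):
    """Feature-level fusion: z-score each descriptor matrix, then concatenate.

    Normalizing per descriptor keeps one feature's numeric scale from
    dominating the fused vector handed to the classifier.
    """
    normed = []
    for F in feature_sets:
        mu = F.mean(axis=0)
        sd = F.std(axis=0)
        sd[sd == 0] = 1.0          # guard against constant columns
        normed.append((F - mu) / sd)
    return np.hstack(normed)

# Two descriptors on very different scales fuse into comparable columns.
rng = np.random.default_rng(0)
texture = rng.standard_normal((10, 4)) * 100.0   # large-scale descriptor
shape = rng.standard_normal((10, 3)) * 0.01      # small-scale descriptor
fused = fuse_features([texture, shape])
print(fused.shape)   # (10, 7)
```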