Modulation Domain Image Processing
The classical Fourier transform is the cornerstone of traditional linear signal and image processing. The discrete Fourier transform (DFT) and the fast Fourier transform (FFT) in particular led to profound changes during the later decades of the last century in how we analyze and process 1D and multi-dimensional signals. The Fourier transform represents a signal as an infinite superposition of stationary sinusoids, each of which has constant amplitude and constant frequency. However, many important practical signals such as radar returns and seismic waves are inherently nonstationary. Hence, more complex techniques such as the windowed Fourier transform and the wavelet transform were invented to better capture nonstationary properties of these signals. In this dissertation, I studied an alternative nonstationary representation for images, the 2D AM-FM model. In contrast to the stationary nature of the classical Fourier representation, the AM-FM model represents an image as a finite sum of components with smoothly varying amplitudes and smoothly varying frequencies. The model has been applied successfully in image processing applications such as image segmentation, texture analysis, and target tracking. However, these applications are limited to analysis, meaning that the computed AM and FM functions are used as features for signal processing tasks such as classification and recognition. For synthesis applications, a few attempts have been made to synthesize the original image from the AM and FM components. However, these attempts were unstable and the synthesized results contained artifacts. The main reason is that a perfect reconstruction AM-FM image model was either unavailable or unstable. Here, I constructed the first functional perfect reconstruction AM-FM image transform that paves the way for AM-FM image synthesis applications. The transform enables intuitive nonlinear image filter designs in the modulation domain. I showed that these filters provide important advantages relative to traditional linear translation invariant filters.
This dissertation addresses image processing operations in the nonlinear nonstationary modulation domain. In the modulation domain, an image is modeled as a sum of nonstationary amplitude modulation (AM) functions and nonstationary frequency modulation (FM) functions. I developed a theoretical framework for high fidelity signal and image modeling in the modulation domain, constructed an invertible multi-dimensional AM-FM transform (xAMFM), and investigated practical signal processing applications of the transform. After developing the xAMFM, I investigated new image processing operations that apply directly to the transformed AM and FM functions in the modulation domain. In addition, I introduced two classes of modulation domain image filters. These filters produce perceptually motivated signal processing results that are difficult or impossible to obtain with traditional linear processing or spatial domain nonlinear approaches. Finally, I proposed three extensions of the AM-FM transform and applied them in image analysis applications.
The main original contributions of this dissertation include the following.
- I proposed a perfect reconstruction FM algorithm. I used a least-squares approach to recover the phase signal from its gradient. In order to allow perfect reconstruction of the phase function, I enforced an initial condition on the reconstructed phase. The perfect reconstruction FM algorithm plays a critical role in the overall AM-FM transform.
- I constructed a perfect reconstruction multi-dimensional filterbank by modifying the classical steerable pyramid. This modified filterbank ensures a true multi-scale multi-orientation signal decomposition. Such a decomposition is required for a perceptually meaningful AM-FM image representation.
- I rotated the partial Hilbert transform to alleviate rippling artifacts in the computed AM and FM functions. This adjustment results in artifact-free filtering results in the modulation domain.
- I proposed the modulation domain image filtering framework. I constructed two classes of modulation domain filters. I showed that the modulation domain filters outperform traditional linear shift invariant (LSI) filters qualitatively and quantitatively in applications such as selective orientation filtering, selective frequency filtering, and fundamental geometric image transformations.
- I provided extensions of the AM-FM transform for image decomposition problems. I illustrated that the AM-FM approach can successfully decompose an image into coherent components such as texture and structural components.
- I investigated the relationship between the two prominent AM-FM computational models, namely the partial Hilbert transform approach (pHT) and the monogenic signal. The established relationship helps unify these two AM-FM algorithms.
This dissertation lays a theoretical foundation for future nonlinear modulation domain image processing applications. For the first time, one can apply modulation domain filters to images to obtain predictable results. The design of modulation domain filters is intuitive and simple, yet these filters produce superior results compared to those of pixel domain LSI filters. Moreover, this dissertation opens up other research problems. For instance, classical image applications such as image segmentation and edge detection can be re-formulated in the modulation domain setting. Modulation domain based perceptual image and video quality assessment and image compression are important future application areas for the fundamental representation results developed in this dissertation.
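The first contribution above, recovering a phase function from its gradient by least squares with an enforced initial condition, can be illustrated with a minimal NumPy sketch. This is not the dissertation's algorithm: the function names, the forward-difference discretization, and the choice to pin the phase at pixel (0, 0) are all assumptions made for illustration, and the dense solver is only practical for small grids.

```python
import numpy as np

def forward_difference_operator(rows, cols):
    """Stack horizontal and vertical forward differences into one dense matrix."""
    n = rows * cols
    idx = np.arange(n).reshape(rows, cols)
    pairs = (
        (idx[:, :-1], idx[:, 1:]),   # phi[r, c+1] - phi[r, c]
        (idx[:-1, :], idx[1:, :]),   # phi[r+1, c] - phi[r, c]
    )
    blocks = []
    for src, dst in pairs:
        src, dst = src.ravel(), dst.ravel()
        d = np.zeros((src.size, n))
        d[np.arange(src.size), src] = -1.0
        d[np.arange(src.size), dst] = 1.0
        blocks.append(d)
    return np.vstack(blocks)

def reconstruct_phase(grad_x, grad_y, phase_at_origin):
    """Least-squares phase from its gradient, pinned at pixel (0, 0).

    grad_x has shape (rows, cols - 1), grad_y has shape (rows - 1, cols).
    The gradient fixes the phase only up to an additive constant, so the
    constant is chosen afterwards to satisfy the assumed initial condition.
    """
    rows, cols = grad_x.shape[0], grad_y.shape[1]
    d = forward_difference_operator(rows, cols)
    rhs = np.concatenate([grad_x.ravel(), grad_y.ravel()])
    phi, *_ = np.linalg.lstsq(d, rhs, rcond=None)
    phi = phi.reshape(rows, cols)
    return phi + (phase_at_origin - phi[0, 0])

# Usage: a smooth synthetic phase is recovered from its finite differences.
r, c = np.mgrid[0:6, 0:7]
true_phase = 0.3 * r + 0.05 * c**2
recovered = reconstruct_phase(np.diff(true_phase, axis=1),
                              np.diff(true_phase, axis=0),
                              true_phase[0, 0])
```

When the gradient field is consistent, every least-squares solution differs from the true phase only by a constant, which is why the single pinned value is enough to make the reconstruction exact in this sketch.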
Directional edge and texture representations for image processing
An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets are not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns, while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the situation in which the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) is also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, to the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature. 
Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.
Local Geometric Transformations in Image Analysis
The characterization of images by geometric features facilitates the precise analysis of the structures found in biological micrographs such as cells, proteins, or tissues. In this thesis, we study image representations that are adapted to local geometric transformations such as rotation, translation, and scaling, with a special emphasis on wavelet representations. In the first part of the thesis, our main interest is in the analysis of directional patterns and the estimation of their location and orientation. We explore steerable representations that correspond to the notion of rotation. In contrast to classical pattern matching techniques, they need neither an a priori discretization of the angle nor a matching of the filter to the image at each discretized direction. Instead, it is sufficient to apply the filtering only once. Then, the rotated filter for any arbitrary angle can be determined by a systematic and linear transformation of the initial filter. We derive the Cramér-Rao bounds for steerable filters. They allow us to select the best harmonics for the design of steerable detectors and to identify their optimal radial profile. We propose several ways to construct optimal representations and to build powerful and effective detector schemes; in particular, for junctions of coinciding branches with local orientations. The basic idea of local transformability and the general principles that we utilize to design steerable wavelets can be applied to other geometric transformations. Accordingly, in the second part, we extend our framework to other transformation groups, with a particular interest in scaling. To construct representations in tune with a notion of local scale, we identify the possible solutions for scalable functions and give specific criteria for their applicability to wavelet schemes. Finally, we propose discrete wavelet frames that approximate a continuous wavelet transform. 
Based on these results, we present novel wavelet-based image-analysis software that provides fast and automatic detection of circular patterns, combined with a precise estimation of their size.
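The steerability property described in this abstract (filter once, then obtain any rotated version by a linear combination) can be shown with the classical textbook case of a first-derivative-of-Gaussian filter. The sketch below is only that textbook case, not the steerable wavelets designed in the thesis; the function names and the kernel parameters `sigma` and `radius` are hypothetical.

```python
import numpy as np

def gaussian_derivative_kernels(sigma=1.5, radius=5):
    """Return the x- and y-derivative-of-Gaussian kernels (the steering basis)."""
    ax = np.arange(-radius, radius + 1, dtype=float)
    x, y = np.meshgrid(ax, ax)  # x varies along columns, y along rows
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return -x / sigma**2 * g, -y / sigma**2 * g

def directional_kernel(theta, sigma=1.5, radius=5):
    """Reference: derivative of the Gaussian along direction theta, built directly."""
    ax = np.arange(-radius, radius + 1, dtype=float)
    x, y = np.meshgrid(ax, ax)
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    u = x * np.cos(theta) + y * np.sin(theta)
    return -u / sigma**2 * g

def steer(basis_x, basis_y, theta):
    """Linear steering rule; works on kernels and on filter responses alike."""
    return np.cos(theta) * basis_x + np.sin(theta) * basis_y

# Usage: two basis responses on a patch give the response at any angle.
gx, gy = gaussian_derivative_kernels()
patch = np.random.default_rng(1).normal(size=gx.shape)
rx, ry = np.sum(patch * gx), np.sum(patch * gy)
response_at_40_degrees = steer(rx, ry, np.deg2rad(40.0))
```

Because the steering rule is linear, it can be applied to the two basis responses instead of the kernels, which is why no per-angle filtering pass is needed.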
New contributions in overcomplete image representations inspired from the functional architecture of the primary visual cortex = Nuevas contribuciones en representaciones sobrecompletas de imágenes inspiradas por la arquitectura funcional de la corteza visual primaria
The present thesis aims at investigating parallelisms between the functional architecture of primary visual areas and image processing methods. A first objective is to refine existing models of biological vision on the basis of information theory, and a second is to develop original solutions for image processing inspired by natural vision. The available data on visual systems comprise physiological and psychophysical studies, Gestalt psychology and statistics on natural images. The thesis is mostly centered on overcomplete representations (i.e. representations increasing the dimensionality of the data) for multiple reasons: first, because they make it possible to overcome existing drawbacks of critically sampled transforms; second, because biological vision models appear overcomplete; and third, because building efficient overcomplete representations raises challenging and current mathematical problems, in particular the problem of sparse approximation. The thesis first proposes a self-invertible log-Gabor wavelet transformation inspired by the receptive field and multiresolution arrangement of the simple cells in the primary visual cortex (V1). This transform shows promising abilities for noise elimination. Second, interactions observed between V1 cells, consisting of lateral inhibition and facilitation between aligned cells, are shown to be efficient for extracting edges of natural images. As a third point, the redundancy introduced by the overcompleteness is reduced by a dedicated sparse approximation algorithm which builds a sparse representation of the images based on their edge content. For an additional decorrelation of the image information and for improving the image compression performance, edges arranged along continuous contours are coded in a predictive manner through chains of coefficients. This then offers an efficient representation of contours. 
Fourth, a study on contour completion using the tensor voting framework based on Gestalt psychology is presented. There, the use of iterations and of curvature information improves the robustness and the perceptual quality of the existing method.
This doctoral thesis investigates parallels between the functional architecture of the primary visual areas and image processing. A first objective is to improve existing models of biological vision on the basis of information theory. A second is to develop new image processing algorithms inspired by natural vision. The available data on the visual system span physiological and psychophysical studies, Gestalt psychology, and natural image statistics. The thesis focuses mainly on overcomplete representations (i.e. representations that increase the dimensionality of the data) for the following reasons: first, because they overcome important drawbacks of orthogonal transforms; second, because models of biological vision often need to be overcomplete; and third, because building efficient overcomplete representations involves relevant and novel mathematical problems, in particular the problem of sparse approximation. The thesis first proposes a self-invertible log-Gabor wavelet transform inspired by the receptive field and multiresolution organization of the simple cells of the primary visual cortex (V1). This transform offers promising results for noise removal. Second, the interactions observed between V1 cells, consisting of lateral inhibition and facilitation between aligned cells, have proven effective for extracting the edges of natural images. Third, the redundancy introduced by the overcomplete transform is reduced by a dedicated sparse approximation algorithm that builds a sparse representation of images based on their edges. 
For additional decorrelation and to achieve higher compression rates, edges aligned along continuous contours are coded predictively through chains of coefficients, which yields an efficient representation of contours. Finally, a study on contour completion using the tensor voting methodology is presented. We propose the use of iterations and of curvature information to improve the robustness and perceptual quality of existing methods.
Pattern Recognition
Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing is done in real time or takes hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.
The SURE-LET approach to image denoising
Denoising is an essential step prior to any higher-level image-processing task such as segmentation or object tracking, because the undesirable corruption by noise is inherent to any physical acquisition device. When the measurements are performed by photosensors, one usually distinguishes between two main regimes: in the first scenario, the measured intensities are sufficiently high and the noise is assumed to be signal-independent. In the second scenario, only a few photons are detected, which leads to a strong signal-dependent degradation. When the noise is considered as signal-independent, it is often modeled as an additive independent (typically Gaussian) random variable, whereas, otherwise, the measurements are commonly assumed to follow independent Poisson laws, whose underlying intensities are the unknown noise-free measures. We first consider the reduction of additive white Gaussian noise (AWGN). Contrary to most existing denoising algorithms, our approach does not require an explicit prior statistical modeling of the unknown data. Our driving principle is the minimization of a purely data-adaptive unbiased estimate of the mean-squared error (MSE) between the processed and the noise-free data. In the AWGN case, such an MSE estimate was first proposed by Stein, and is known as "Stein's unbiased risk estimate" (SURE). We further develop the original SURE theory and propose a general methodology for fast and efficient multidimensional image denoising, which we call the SURE-LET approach. 
While SURE allows the quantitative monitoring of the denoising quality, the flexibility and the low computational complexity of our approach are ensured by a linear parameterization of the denoising process, expressed as a linear expansion of thresholds (LET). We propose several pointwise, multivariate, and multichannel thresholding functions applied to arbitrary (in particular, redundant) linear transformations of the input data, with a special focus on multiscale signal representations. We then transpose the SURE-LET approach to the estimation of Poisson intensities degraded by AWGN. The signal-dependent specificity of the Poisson statistics leads to the derivation of a new unbiased MSE estimate that we call "Poisson's unbiased risk estimate" (PURE) and requires more adaptive transform-domain thresholding rules. In a general PURE-LET framework, we first devise a fast interscale thresholding method restricted to the use of the (unnormalized) Haar wavelet transform. We then lift this restriction and show how the PURE-LET strategy can be used to design and optimize a wide class of nonlinear processing applied in an arbitrary (in particular, redundant) transform domain. We finally apply some of the proposed denoising algorithms to real multidimensional fluorescence microscopy images. This in vivo imaging modality often operates under low-illumination conditions and short exposure time; consequently, the random fluctuations of the measured fluorophore radiations are well described by a Poisson process degraded (or not) by AWGN. We experimentally validate this statistical measurement model, and we assess the performance of the PURE-LET algorithms in comparison with some state-of-the-art denoising methods. Our solution turns out to be very competitive both qualitatively and computationally, allowing for a fast and efficient denoising of the huge volumes of data that are nowadays routinely produced in biomedical imaging.
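The core idea of the AWGN part of this abstract, minimizing SURE over a denoiser that is linear in its parameters, reduces to solving a small linear system. The sketch below is a pointwise illustration under stated assumptions: the two basis functions (the identity and a Gaussian-damped identity with constant 12σ²) are one possible LET choice picked here for illustration, the noise level `sigma` is assumed known, and the function name `sure_let_denoise` is hypothetical. It is applied directly to a sparse 1D signal rather than to transform coefficients of an image.

```python
import numpy as np

def sure_let_denoise(y, sigma):
    """Pointwise SURE-LET sketch for y = x + N(0, sigma^2) noise, y a 1D array.

    Denoiser: F(y) = a[0] * y + a[1] * y * exp(-y^2 / (12 sigma^2)).
    SURE = mean((F(y) - y)^2) + 2 sigma^2 * div(F) / n - sigma^2 is quadratic
    in a, so its minimizer solves the 2x2 linear system M a = c.
    """
    n = y.size
    damp = np.exp(-y**2 / (12.0 * sigma**2))
    basis = np.stack([y, y * damp])                      # shape (2, n)
    # Divergence (sum of pointwise derivatives) of each basis function.
    div = np.array([float(n),
                    np.sum(damp * (1.0 - y**2 / (6.0 * sigma**2)))])
    m = basis @ basis.T
    c = basis @ y - sigma**2 * div
    a = np.linalg.solve(m, c)
    estimate = a @ basis
    sure = np.mean((estimate - y)**2) + 2.0 * sigma**2 * (a @ div) / n - sigma**2
    return estimate, a, sure

# Usage: a sparse synthetic signal observed in unit-variance Gaussian noise.
rng = np.random.default_rng(0)
n, sigma = 20000, 1.0
clean = np.where(rng.random(n) < 0.1, rng.normal(0.0, 5.0, n), 0.0)
noisy = clean + rng.normal(0.0, sigma, n)
denoised, weights, sure_value = sure_let_denoise(noisy, sigma)
true_mse = np.mean((denoised - clean)**2)   # needs the clean signal
```

The point of the design is that `sure_value` is computed from the noisy data alone, yet it tracks `true_mse`, so the weights can be optimized without access to the clean signal.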
Image Analysis via Applied Harmonic Analysis : Perceptual Image Quality Assessment, Visual Servoing, and Feature Detection
Certain systems of analyzing functions developed in the field of applied harmonic analysis are specifically designed to yield efficient representations of structures which are characteristic of common classes of two-dimensional signals, like images. In particular, functions in these systems are typically sensitive to features that define the geometry of a signal, like edges and curves in the case of images. These properties make them ideal candidates for a wide variety of tasks in image processing and image analysis. This thesis discusses three recently developed approaches to utilizing systems of wavelets, shearlets, and alpha-molecules in specific image analysis tasks. First, a perceptual image similarity measure is introduced that is solely based on the coefficients obtained from six discrete Haar wavelet filters but yields state-of-the-art correlations with human opinion scores on large benchmark databases. The second application concerns visual servoing, which is a technique for controlling the motion of a robot by using feedback from a visual sensor. In particular, it is investigated how the coefficients yielded by discrete wavelet and shearlet transforms can be used as the visual features that control the motion of a robot with six degrees of freedom. Finally, a novel framework for the detection and characterization of features such as edges, ridges, and blobs in two-dimensional images is presented and evaluated in extensive numerical experiments. Here, versatile and robust feature detectors are obtained by exploiting the special symmetry properties of directionally sensitive analyzing functions in systems created within the recently introduced alpha-molecule framework.
Overcomplete Image Representations for Texture Analysis
Advisors: Dr. Boris Escalante-Ramírez and Dr. Gabriel Cristóbal. Date and location of PhD thesis defense: 23rd October 2013, Universidad Nacional Autónoma de México. In recent years, computer vision has played an important role in many scientific and technological areas, mainly because modern society highlights vision over other senses. At the same time, application requirements and complexity have also increased, so that in many cases the optimal solution depends on the intrinsic characteristics of the problem; therefore, it is difficult to propose a universal image model. In parallel, advances in understanding the human visual system have made it possible to propose sophisticated models that incorporate simple phenomena which occur in early stages of the visual system. This dissertation aims to investigate characteristics of vision such as over-representation and orientation of receptive fields in order to propose bio-inspired image models for texture analysis.