96 research outputs found
Scale Invariant Interest Points with Shearlets
Shearlets are a relatively new directional multi-scale framework for signal
analysis, which has been shown to be effective at enhancing signal discontinuities
such as edges and corners at multiple scales. In this work we address the
problem of detecting and describing blob-like features in the shearlet
framework. We derive a measure which is very effective for blob detection and
closely related to the Laplacian of Gaussian. We demonstrate that the measure
satisfies the perfect scale invariance property in the continuous case. In the
discrete setting, we derive algorithms for blob detection and keypoint
description. Finally, we provide qualitative justifications of our findings as
well as a quantitative evaluation on benchmark data. We also report
experimental evidence that our method is well suited to compressed
and noisy images, thanks to the sparsity property of shearlets.
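As a rough illustration of the scale-selection idea this abstract refers to, the sketch below uses the classical scale-normalised Laplacian of Gaussian, not the shearlet-based measure derived in the paper. All function names are hypothetical, and the synthetic Gaussian blob is an assumption chosen so that the expected behaviour is known in closed form: for a blob of standard deviation s, the normalised response at its centre is -2σ²s²/(s²+σ²)², whose magnitude peaks at σ = s.

```python
import math

def gaussian_blob(size, s):
    """Square image (list of rows) with one isotropic Gaussian blob of std s at the centre."""
    c = size // 2
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * s * s))
             for x in range(size)] for y in range(size)]

def log_kernel_value(dx, dy, sigma):
    """Laplacian of a 2-D Gaussian with std sigma, evaluated at offset (dx, dy)."""
    r2 = dx * dx + dy * dy
    s2 = sigma * sigma
    g = math.exp(-r2 / (2.0 * s2)) / (2.0 * math.pi * s2)
    return g * (r2 - 2.0 * s2) / (s2 * s2)

def normalized_log_at(image, cx, cy, sigma):
    """Scale-normalised LoG response, sigma^2 * (LoG * image), at pixel (cx, cy)."""
    radius = int(math.ceil(4 * sigma))  # truncate the kernel at 4 sigma
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = cy + dy, cx + dx
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                total += log_kernel_value(dx, dy, sigma) * image[y][x]
    return sigma * sigma * total

def characteristic_scale(image, cx, cy, sigmas):
    """Candidate sigma with the largest absolute normalised response at (cx, cy)."""
    return max(sigmas, key=lambda s: abs(normalized_log_at(image, cx, cy, s)))

if __name__ == "__main__":
    img = gaussian_blob(65, 4.0)
    candidates = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
    print(characteristic_scale(img, 32, 32, candidates))
```

Multiplying by σ² is what makes responses comparable across scales; without that factor the response magnitude decays as σ grows and the maximum no longer tracks the blob size.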
Directional edge and texture representations for image processing
An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets are not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns, while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the case where the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) is also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, to the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature.
Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.
A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity
The richness of natural images makes the quest for optimal representations in
image processing and computer vision challenging. The latter observation has
not prevented the design of image representations, which trade off between
efficiency and complexity, while achieving accurate rendering of smooth regions
as well as reproducing faithful contours and textures. The most recent ones,
proposed in the past decade, share a hybrid heritage highlighting the
multiscale and oriented nature of edges and patterns in images. This paper
presents a panorama of the aforementioned literature on decompositions in
multiscale, multi-orientation bases or dictionaries. They typically exhibit
redundancy to improve sparsity in the transformed domain and sometimes its
invariance with respect to simple geometric deformations (translation,
rotation). Oriented multiscale dictionaries extend traditional wavelet
processing and may offer rotation invariance. Highly redundant dictionaries
require specific algorithms to simplify the search for an efficient (sparse)
representation. We also discuss the extension of multiscale geometric
decompositions to non-Euclidean domains such as the sphere or arbitrary meshed
surfaces. The etymology of panorama suggests an overview, based on a choice of
partially overlapping "pictures". We hope that this paper will contribute to
the appreciation and apprehension of a stream of current research directions in
image understanding. Comment: 65 pages, 33 figures, 303 references
Joint Group Invariant Functions on Data-Parameter Domain Induce Universal Neural Networks
The symmetry and geometry of input data are considered to be encoded in the
internal data representation inside the neural network, but the specific
encoding rule has been less investigated. In this study, we present a
systematic method to induce a generalized neural network and its right inverse
operator, called the ridgelet transform, from a joint group invariant function
on the data-parameter domain. Since the ridgelet transform is an inverse, (1)
it can describe the arrangement of parameters for the network to represent a
target function, which is understood as the encoding rule, and (2) it implies
the universality of the network. Based on the group representation theory, we
present a new simple proof of the universality by using Schur's lemma in a
unified manner covering a wide class of networks, for example, the original
ridgelet transform, formal deep networks, and the dual voice transform. Since
traditional universality theorems were demonstrated based on functional
analysis, this study sheds light on the group theoretic aspect of the
approximation theory, connecting geometric deep learning to abstract harmonic
analysis. Comment: NeurReps 202
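For orientation, the "right inverse" statement can be written out in the classical fully connected case. This is a sketch in assumed notation, not the paper's general group-theoretic formulation: the symbols $S$, $R$, $\gamma$, $\sigma$, $\rho$ and the scalar $c_{\sigma,\rho}$ are introduced here for illustration only.

```latex
% Continuous (integral) network with activation \sigma and parameter distribution \gamma(a,b):
S[\gamma](x) = \int_{\mathbb{R}^m \times \mathbb{R}} \gamma(a,b)\, \sigma(a \cdot x - b)\, \mathrm{d}a\, \mathrm{d}b

% Ridgelet transform of a target function f with respect to an auxiliary function \rho:
R[f;\rho](a,b) = \int_{\mathbb{R}^m} f(x)\, \overline{\rho(a \cdot x - b)}\, \mathrm{d}x

% Reconstruction: R is a right inverse of S up to a scalar that depends only on \sigma and \rho:
S\bigl[R[f;\rho]\bigr] = c_{\sigma,\rho}\, f, \qquad c_{\sigma,\rho} \neq 0
```

Read this way, $\gamma = R[f;\rho] / c_{\sigma,\rho}$ is an explicit parameter arrangement that represents $f$ (the encoding rule), and the existence of such a $\gamma$ for every admissible $f$ is the universality claim.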