139 research outputs found
A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity
The richness of natural images makes the quest for optimal representations in
image processing and computer vision challenging. The latter observation has
not prevented the design of image representations, which trade off between
efficiency and complexity, while achieving accurate rendering of smooth regions
as well as reproducing faithful contours and textures. The most recent ones,
proposed in the past decade, share a hybrid heritage highlighting the
multiscale and oriented nature of edges and patterns in images. This paper
presents a panorama of the aforementioned literature on decompositions in
multiscale, multi-orientation bases or dictionaries. They typically exhibit
redundancy to improve sparsity in the transformed domain and sometimes its
invariance with respect to simple geometric deformations (translation,
rotation). Oriented multiscale dictionaries extend traditional wavelet
processing and may offer rotation invariance. Highly redundant dictionaries
require specific algorithms to simplify the search for an efficient (sparse)
representation. We also discuss the extension of multiscale geometric
decompositions to non-Euclidean domains such as the sphere or arbitrary meshed
surfaces. The etymology of panorama suggests an overview, based on a choice of
partially overlapping "pictures". We hope that this paper will contribute to
the appreciation and apprehension of a stream of current research directions in
image understanding.
Comment: 65 pages, 33 figures, 303 references
On The Continuous Steering of the Scale of Tight Wavelet Frames
In analogy with steerable wavelets, we present a general construction of
adaptable tight wavelet frames, with an emphasis on scaling operations. In
particular, the derived wavelets can be "dilated" by a procedure comparable to
the operation of steering steerable wavelets. The fundamental aspects of the
construction are the same: an admissible collection of Fourier multipliers is
used to extend a tight wavelet frame, and the "scale" of the wavelets is
adapted by scaling the multipliers. As an application, the proposed wavelets
can be used to improve the frequency localization. Importantly, the localized
frequency bands specified by this construction can be scaled efficiently using
matrix multiplication.
Dynamic Steerable Blocks in Deep Residual Networks
Filters in convolutional networks are typically parameterized in a pixel
basis that does not take prior knowledge about the visual world into account.
We investigate the generalized notion of frames designed with image properties
in mind, as alternatives to this parametrization. We show that frame-based
ResNets and DenseNets can improve performance on CIFAR-10+ consistently, while
having additional pleasant properties like steerability. By exploiting these
transformation properties explicitly, we arrive at dynamic steerable blocks.
They are an extension of residual blocks that are able to seamlessly transform
filters under pre-defined transformations, conditioned on the input at training
and inference time. Dynamic steerable blocks learn the degree of invariance
from data and locally adapt filters, allowing them to apply a different
geometrical variant of the same filter to each location of the feature map.
When evaluated on the Berkeley Segmentation contour detection dataset, our
approach outperforms all competing approaches that do not utilize pre-training.
Our results highlight the benefits of image-based regularization to deep
networks.
Riesz pyramids for fast phase-based video magnification
We present a new compact image pyramid representation, the Riesz pyramid, that
can be used for real-time phase-based motion magnification. Our new
representation is less overcomplete than even the smallest two-orientation,
octave-bandwidth complex steerable pyramid, and can be implemented using
compact, efficient linear filters in the spatial domain. Motion-magnified
videos produced with this new representation are of comparable quality to those
produced with the complex steerable pyramid. When used with phase-based video
magnification, the Riesz pyramid phase-shifts image features along only their
dominant orientation rather than every orientation like the complex steerable
pyramid.
Quanta Computer (Firm); Shell Research; National Science Foundation (U.S.)
(CGV-1111415); Microsoft Research (PhD Fellowship); Massachusetts Institute of
Technology. Department of Mathematics; National Science Foundation (U.S.).
Graduate Research Fellowship (Grant 1122374)
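The idea of a single dominant orientation and phase per pixel, mentioned in the Riesz pyramid abstract above, can be illustrated with a minimal sketch. This is not the authors' implementation: it applies an ideal Riesz transform in the frequency domain to one image band, whereas the abstract describes compact spatial-domain filters inside a pyramid. The function names `riesz_transform` and `local_phase_orientation` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def riesz_transform(image):
    """Ideal 2D Riesz transform via the FFT (illustrative, single scale).

    Returns two real arrays (r1, r2): the responses of the transfer
    functions -i*fx/|f| and -i*fy/|f| applied to `image`.
    """
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequencies, shape (rows, 1)
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies, shape (1, cols)
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                   # avoid 0/0 at DC; the numerator is 0 there
    spectrum = np.fft.fft2(image)
    r1 = np.real(np.fft.ifft2(-1j * fx / radius * spectrum))
    r2 = np.real(np.fft.ifft2(-1j * fy / radius * spectrum))
    return r1, r2


def local_phase_orientation(image):
    """Local amplitude, phase, and dominant orientation of one image band."""
    r1, r2 = riesz_transform(image)
    quadrature = np.hypot(r1, r2)             # magnitude of the Riesz pair
    amplitude = np.sqrt(image ** 2 + quadrature ** 2)
    phase = np.arctan2(quadrature, image)     # phase along the dominant orientation
    orientation = np.arctan2(r2, r1)          # defined only up to a sign flip
    return amplitude, phase, orientation
```

For a horizontal cosine grating, `riesz_transform` returns the matching sine grating in `r1` and zeros in `r2`, so the local amplitude is constant. A motion-magnification pipeline would amplify temporal changes in `phase`, which is one value per pixel rather than one per orientation band.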