
    Fast space-variant elliptical filtering using box splines

    The efficient realization of linear space-variant (non-convolution) filters is a challenging computational problem in image processing. In this paper, we demonstrate that it is possible to filter an image with a Gaussian-like elliptic window of varying size, elongation and orientation using a fixed number of computations per pixel. The associated algorithm, which is based on a family of smooth compactly supported piecewise polynomials, the radially-uniform box splines, is realized using pre-integration and local finite-differences. The radially-uniform box splines are constructed through the repeated convolution of a fixed number of box distributions, which have been suitably scaled and distributed radially in a uniform fashion. The attractive features of these box splines are their asymptotic behavior, their simple covariance structure, and their quasi-separability. They converge to Gaussians as their order increases, and are used to approximate anisotropic Gaussians of varying covariance simply by controlling the scales of the constituent box distributions. Based on the second feature, we develop a technique for continuously controlling the size, elongation and orientation of these Gaussian-like functions. Finally, the quasi-separable structure, along with a certain scaling property of box distributions, is used to efficiently realize the associated space-variant elliptical filtering, which requires O(1) computations per pixel irrespective of the shape and size of the filter. Comment: 12 figures; IEEE Transactions on Image Processing, vol. 19, 201
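
    The pre-integration and finite-difference idea in the abstract above can be illustrated in 1D. The following is only a minimal sketch using a plain box window rather than the paper's radially-uniform box splines; the function name `space_variant_box_filter` and its parameters are hypothetical. It shows why the cost per sample does not depend on the window size: one running sum is computed once, and each output needs a single difference of two of its entries.

```python
from itertools import accumulate


def space_variant_box_filter(signal, radii):
    """Average signal[i-r .. i+r] with a per-sample radius r, in O(1) per sample.

    Illustrative 1D analogue only; not the box-spline algorithm of the paper.
    """
    n = len(signal)
    # Pre-integration: prefix[k] holds the sum of signal[0:k].
    prefix = [0.0] + list(accumulate(float(v) for v in signal))
    out = []
    for i, r in enumerate(radii):
        lo = max(0, i - r)        # clamp the window at the left border
        hi = min(n, i + r + 1)    # clamp the window at the right border
        # A local finite difference of the running sum gives the window sum.
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out


if __name__ == "__main__":
    # The radius changes from sample to sample, yet each output costs one subtraction.
    print(space_variant_box_filter([1, 2, 3, 4, 5], [0, 1, 2, 1, 0]))
```

    Clamping the window at the borders is one possible boundary convention chosen for this sketch; the abstract does not specify one.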

    Distance Measures for Reduced Ordering Based Vector Filters

    Reduced ordering based vector filters have proved successful in removing long-tailed noise from color images while preserving edges and fine image details. These filters commonly utilize variants of the Minkowski distance to order the color vectors with the aim of distinguishing between noisy and noise-free vectors. In this paper, we review various alternative distance measures and evaluate their performance on a large and diverse set of images using several effectiveness and efficiency criteria. The results demonstrate that there are in fact strong alternatives to the popular Minkowski metrics.
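
    As a concrete instance of reduced ordering, the sketch below implements a vector median filter on a single window: each color vector is ranked by its summed distance to all other vectors in the window, and the lowest-ranked vector is the output. This is a generic textbook formulation offered as an assumption, not the specific filters or distance measures evaluated in the paper; the names `minkowski` and `vector_median` are hypothetical. The distance is passed as a parameter so that alternatives to the Minkowski metric can be swapped in.

```python
def minkowski(u, v, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def vector_median(window, distance=minkowski):
    """Return the vector whose summed distance to all window vectors is smallest."""
    def aggregate(v):
        # Reduced ordering: each vector is mapped to one scalar used for ranking.
        return sum(distance(v, w) for w in window)
    return min(window, key=aggregate)


if __name__ == "__main__":
    # One impulsive outlier among similar gray-ish pixels (RGB tuples).
    window = [(10, 10, 10), (12, 11, 10), (11, 10, 12), (255, 0, 0), (10, 12, 11)]
    print(vector_median(window))
    # Same filter with the city-block (p = 1) distance.
    print(vector_median(window, distance=lambda u, v: minkowski(u, v, p=1.0)))
```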

    Multi-directional Geodesic Neural Networks via Equivariant Convolution

    We propose a novel approach for performing convolution of signals on curved surfaces and show its utility in a variety of geometric deep learning applications. Key to our construction is the notion of directional functions defined on the surface, which extend the classic real-valued signals and which can be naturally convolved with real-valued template functions. As a result, rather than trying to fix a canonical orientation or only keeping the maximal response across all alignments of a 2D template at every point of the surface, as done in previous works, we show how information across all rotations can be kept across different layers of the neural network. Our construction, which we call multi-directional geodesic convolution, or directional convolution for short, makes it possible, in particular, to propagate and relate directional information across layers and thus across different regions of the shape. We first define directional convolution in the continuous setting, prove its key properties and then show how it can be implemented in practice for shapes represented as triangle meshes. We evaluate directional convolution in a wide variety of learning scenarios ranging from classification of signals on surfaces to shape segmentation and shape matching, where we show a significant improvement over several baselines.

    Perceptually-Driven Video Coding with the Daala Video Codec

    The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, including which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media. Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201

    Fast adaptive elliptical filtering using box splines

    We demonstrate that it is possible to filter an image with an elliptic window of varying size, elongation and orientation with a fixed computational cost per pixel. Our method involves the application of a suitable global pre-integrator followed by a pointwise-adaptive localization mesh. We present the basic theory for the 1D case using a B-spline formalism and then appropriately extend it to 2D using radially-uniform box splines. The size and ellipticity of these radially-uniform box splines are adaptively controlled. Moreover, they converge to Gaussians as the order increases. Finally, we present a fast and practical directional filtering algorithm that has the capability of adapting to the local image features. Comment: 9 pages, 1 figure
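
    The statement that repeated box convolutions approach a Gaussian can be checked numerically in the 1D discrete case. The sketch below is an illustration under stated assumptions, not code from the paper: it convolves a normalized discrete box of a given width with itself, and relies on the facts that a discrete uniform kernel of width w has variance (w^2 - 1)/12 and that variances add under convolution. The helper names `convolve`, `box_spline_kernel` and `variance` are hypothetical.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def box_spline_kernel(width, order):
    """Convolve a normalized discrete box of the given width with itself `order` times."""
    box = [1.0 / width] * width
    kernel = box
    for _ in range(order - 1):
        kernel = convolve(kernel, box)
    return kernel


def variance(kernel):
    """Variance of a kernel interpreted as a probability mass function on 0..len-1."""
    mean = sum(i * w for i, w in enumerate(kernel))
    return sum((i - mean) ** 2 * w for i, w in enumerate(kernel))


if __name__ == "__main__":
    for order in (1, 2, 4):
        k = box_spline_kernel(3, order)
        # The kernel becomes smoother and more bell-shaped as the order grows.
        print(order, [round(w, 4) for w in k], round(variance(k), 4))
```

    Because the variance is simply `order * (width**2 - 1) / 12`, the spread of the Gaussian-like kernel can be set by choosing the box width, which is the 1D counterpart of controlling the filter through the scales of the constituent boxes.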

    Complex data processing: fast wavelet analysis on the sphere

    In the general context of complex data processing, this paper reviews a recent practical approach to the continuous wavelet formalism on the sphere. This formalism notably yields a correspondence principle which relates wavelets on the plane and on the sphere. Two fast algorithms are also presented for the analysis of signals on the sphere with steerable wavelets. Comment: 20 pages, 5 figures, JFAA style, paper invited to J. Fourier Anal. and Appli

    ShearLab 3D: Faithful Digital Shearlet Transforms based on Compactly Supported Shearlets

    Wavelets and their associated transforms are highly efficient when approximating and analyzing one-dimensional signals. However, multivariate signals such as images or videos typically exhibit curvilinear singularities, which wavelets provably fail to approximate sparsely and to analyze adequately, for instance when detecting their direction. Shearlets are a directional representation system extending the wavelet framework, which overcomes those deficiencies. Similar to wavelets, shearlets allow a faithful implementation and fast associated transforms. In this paper, we introduce a comprehensive, carefully documented software package called ShearLab 3D (www.ShearLab.org) and discuss its algorithmic details. This package provides MATLAB code for a novel faithful algorithmic realization of the 2D and 3D shearlet transforms (and their inverses) associated with compactly supported universal shearlet systems, with the option of using CUDA. We present extensive numerical experiments in 2D and 3D concerning denoising, inpainting, and feature extraction, comparing the performance of ShearLab 3D with similar transform-based algorithms such as curvelets, contourlets, or surfacelets. In the spirit of reproducible research, all scripts are accessible on www.ShearLab.org. Comment: There is another shearlet software package (http://www.mathematik.uni-kl.de/imagepro/members/haeuser/ffst/) by S. Häuser and G. Steidl. We will include this in a revision

    Two-Stream Convolutional Networks for Action Recognition in Videos

    We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to generalise the best performing hand-crafted features within a data-driven learning framework. Our contribution is three-fold. First, we propose a two-stream ConvNet architecture which incorporates spatial and temporal networks. Second, we demonstrate that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data. Finally, we show that multi-task learning, applied to two different action classification datasets, can be used to increase the amount of training data and improve the performance on both. Our architecture is trained and evaluated on the standard video actions benchmarks of UCF-101 and HMDB-51, where it is competitive with the state of the art. It also exceeds by a large margin previous attempts to use deep nets for video classification.
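
    The abstract above does not say how the predictions of the spatial and temporal networks are combined. The sketch below assumes the simplest option, a weighted average of per-class softmax scores (late fusion), and is not taken from the paper; the function names, the `temporal_weight` parameter and the example scores are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_streams(spatial_scores, temporal_scores, temporal_weight=0.5):
    """Late fusion (assumed): weighted average of the two streams' class probabilities.

    Returns the fused probabilities and the index of the predicted class.
    """
    p_spatial = softmax(spatial_scores)    # appearance stream (still frames)
    p_temporal = softmax(temporal_scores)  # motion stream (optical flow)
    w = temporal_weight
    fused = [(1.0 - w) * a + w * b for a, b in zip(p_spatial, p_temporal)]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


if __name__ == "__main__":
    # Made-up scores for three action classes.
    fused, predicted = fuse_two_streams([2.0, 1.0, 0.0], [0.0, 3.0, 0.0])
    print([round(p, 3) for p in fused], predicted)
```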