154 research outputs found

    Lattice-Based High-Dimensional Gaussian Filtering and the Permutohedral Lattice

    Get PDF
    High-dimensional Gaussian filtering is a popular technique in image processing, geometry processing and computer graphics for smoothing data while preserving important features. For instance, the bilateral filter, cross bilateral filter and non-local means filter fall under the broad umbrella of high-dimensional Gaussian filters. Recent algorithmic advances therein have demonstrated that by relying on a sampled representation of the underlying space, one can obtain speed-ups of orders of magnitude over the naïve approach. The simplest such sampled representation is a lattice, and it has been used successfully in the bilateral grid and the permutohedral lattice algorithms. In this paper, we analyze these lattice-based algorithms, developing a general theory of lattice-based high-dimensional Gaussian filtering. We consider the set of criteria for an optimal lattice for filtering, as it offers a good tradeoff of quality for computational efficiency, and evaluate the existing lattices under the criteria. In particular, we give a rigorous exposition of the properties of the permutohedral lattice and argue that it is the optimal lattice for Gaussian filtering. Lastly, we explore further uses of the permutohedral-lattice-based Gaussian filtering framework, showing that it can be easily adapted to perform mean shift filtering and yield improvement over the traditional approach based on a Cartesian grid.Stanford University (Reed-Hodgson Fellowship)Nokia Research Cente

    Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks

    Full text link
    Bilateral filters have wide spread use due to their edge-preserving properties. The common use case is to manually choose a parametric filter type, usually a Gaussian filter. In this paper, we will generalize the parametrization and in particular derive a gradient descent algorithm so the filter parameters can be learned from data. This derivation allows to learn high dimensional linear filters that operate in sparsely populated feature spaces. We build on the permutohedral lattice construction for efficient filtering. The ability to learn more general forms of high-dimensional filters can be used in several diverse applications. First, we demonstrate the use in applications where single filter applications are desired for runtime reasons. Further, we show how this algorithm can be used to learn the pairwise potentials in densely connected conditional random fields and apply these to different image segmentation tasks. Finally, we introduce layers of bilateral filters in CNNs and propose bilateral neural networks for the use of high-dimensional sparse data. This view provides new ways to encode model structure into network architectures. A diverse set of experiments empirically validates the usage of general forms of filters

    Efficient Linear Programming for Dense CRFs

    Get PDF
    The fully connected conditional random field (CRF) with Gaussian pairwise potentials has proven popular and effective for multi-class semantic segmentation. While the energy of a dense CRF can be minimized accurately using a linear programming (LP) relaxation, the state-of-the-art algorithm is too slow to be useful in practice. To alleviate this deficiency, we introduce an efficient LP minimization algorithm for dense CRFs. To this end, we develop a proximal minimization framework, where the dual of each proximal problem is optimized via block coordinate descent. We show that each block of variables can be efficiently optimized. Specifically, for one block, the problem decomposes into significantly smaller subproblems, each of which is defined over a single pixel. For the other block, the problem is optimized via conditional gradient descent. This has two advantages: 1) the conditional gradient can be computed in a time linear in the number of pixels and labels; and 2) the optimal step size can be computed analytically. Our experiments on standard datasets provide compelling evidence that our approach outperforms all existing baselines including the previous LP based approach for dense CRFs.Comment: 24 pages, 10 figures and 4 table

    Video Propagation Networks

    Full text link
    We propose a technique that propagates information forward through video data. The method is conceptually simple and can be applied to tasks that require the propagation of structured information, such as semantic labels, based on video content. We propose a 'Video Propagation Network' that processes video frames in an adaptive manner. The model is applied online: it propagates information forward without the need to access future frames. In particular we combine two components, a temporal bilateral network for dense and video adaptive filtering, followed by a spatial network to refine features and increased flexibility. We present experiments on video object segmentation and semantic video segmentation and show increased performance comparing to the best previous task-specific methods, while having favorable runtime. Additionally we demonstrate our approach on an example regression task of color propagation in a grayscale video.Comment: Appearing in Computer Vision and Pattern Recognition, 2017 (CVPR'17

    On a fast bilateral filtering formulation using functional rearrangements

    Full text link
    We introduce an exact reformulation of a broad class of neighborhood filters, among which the bilateral filters, in terms of two functional rearrangements: the decreasing and the relative rearrangements. Independently of the image spatial dimension (one-dimensional signal, image, volume of images, etc.), we reformulate these filters as integral operators defined in a one-dimensional space corresponding to the level sets measures. We prove the equivalence between the usual pixel-based version and the rearranged version of the filter. When restricted to the discrete setting, our reformulation of bilateral filters extends previous results for the so-called fast bilateral filtering. We, in addition, prove that the solution of the discrete setting, understood as constant-wise interpolators, converges to the solution of the continuous setting. Finally, we numerically illustrate computational aspects concerning quality approximation and execution time provided by the rearranged formulation.Comment: 29 pages, Journal of Mathematical Imaging and Vision, 2015. arXiv admin note: substantial text overlap with arXiv:1406.712

    Learning Task-Specific Generalized Convolutions in the Permutohedral Lattice

    Full text link
    Dense prediction tasks typically employ encoder-decoder architectures, but the prevalent convolutions in the decoder are not image-adaptive and can lead to boundary artifacts. Different generalized convolution operations have been introduced to counteract this. We go beyond these by leveraging guidance data to redefine their inherent notion of proximity. Our proposed network layer builds on the permutohedral lattice, which performs sparse convolutions in a high-dimensional space allowing for powerful non-local operations despite small filters. Multiple features with different characteristics span this permutohedral space. In contrast to prior work, we learn these features in a task-specific manner by generalizing the basic permutohedral operations to learnt feature representations. As the resulting objective is complex, a carefully designed framework and learning procedure are introduced, yielding rich feature embeddings in practice. We demonstrate the general applicability of our approach in different joint upsampling tasks. When adding our network layer to state-of-the-art networks for optical flow and semantic segmentation, boundary artifacts are removed and the accuracy is improved.Comment: To appear at GCPR 201

    Sample and Filter: Nonparametric Scene Parsing via Efficient Filtering

    Get PDF
    Scene parsing has attracted a lot of attention in computer vision. While parametric models have proven effective for this task, they cannot easily incorporate new training data. By contrast, nonparametric approaches, which bypass any learning phase and directly transfer the labels from the training data to the query images, can readily exploit new labeled samples as they become available. Unfortunately, because of the computational cost of their label transfer procedures, state-of-the-art nonparametric methods typically filter out most training images to only keep a few relevant ones to label the query. As such, these methods throw away many images that still contain valuable information and generally obtain an unbalanced set of labeled samples. In this paper, we introduce a nonparametric approach to scene parsing that follows a sample-and-filter strategy. More specifically, we propose to sample labeled superpixels according to an image similarity score, which allows us to obtain a balanced set of samples. We then formulate label transfer as an efficient filtering procedure, which lets us exploit more labeled samples than existing techniques. Our experiments evidence the benefits of our approach over state-of-the-art nonparametric methods on two benchmark datasets.Comment: Please refer to the CVPR-2016 version of this manuscrip
    • …
    corecore