
    Reflectance Adaptive Filtering Improves Intrinsic Image Estimation

    Separating an image into reflectance and shading layers poses a challenge for learning approaches because no large corpus of precise and realistic ground truth decompositions exists. The Intrinsic Images in the Wild (IIW) dataset provides a sparse set of relative human reflectance judgments, which serves as a standard benchmark for intrinsic images. A number of methods use IIW to learn statistical dependencies between the images and their reflectance layer. Although learning plays an important role for high performance, we show that a standard signal processing technique achieves performance on par with the current state of the art. We propose a loss function for CNN learning of dense reflectance predictions. Our results show that a simple pixel-wise decision, without any context or prior knowledge, is sufficient to provide a strong baseline on IIW. This sets a competitive baseline which only two other approaches surpass. We then develop a joint bilateral filtering method that implements strong prior knowledge about reflectance constancy. This filtering operation can be applied to any intrinsic image algorithm, and we improve several previous results, achieving a new state of the art on IIW. Our findings suggest that the effect of learning-based approaches may have been overestimated so far. Explicit prior knowledge is still at least as important to obtain high performance in intrinsic image decompositions. Comment: CVPR 201
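
To make the idea of joint bilateral filtering concrete, here is a minimal one-dimensional sketch in plain Python. It is not the authors' filter: the function name, the parameters `sigma_space`, `sigma_guide` and `radius`, and the choice of guide signal are all assumptions made for illustration. The sketch only shows the general mechanism, in which a noisy estimate is averaged over neighbours that are both nearby and similar in a separate guide signal.

```python
import math


def joint_bilateral_filter_1d(values, guide, sigma_space=2.0, sigma_guide=0.1, radius=4):
    """Smooth `values` using weights from position and from `guide`.

    A neighbour contributes strongly only if it is close in position AND
    similar in the guide signal, so smoothing stops at edges of the guide.
    All parameter names and defaults are illustrative assumptions.
    """
    if len(values) != len(guide):
        raise ValueError("values and guide must have the same length")
    n = len(values)
    filtered = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: Gaussian in the index distance.
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_space ** 2))
            # Range weight: Gaussian in the guide difference, not in `values`.
            w_guide = math.exp(-((guide[i] - guide[j]) ** 2) / (2.0 * sigma_guide ** 2))
            weight = w_space * w_guide
            weighted_sum += weight * values[j]
            weight_total += weight
        # weight_total is at least 1.0 because the j == i term has weight 1.
        filtered.append(weighted_sum / weight_total)
    return filtered
```

With a guide that contains a step edge, values on either side of the edge are smoothed separately, which is one way a constancy prior can be imposed on an existing estimate without blurring across boundaries.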

    Deep Directional Statistics: Pose Estimation with Uncertainty Quantification

    Modern deep learning systems successfully solve many perception tasks such as object pose estimation when the input image is of high quality. However, in challenging imaging conditions, such as low-resolution images or images corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable, we would like our models to quantify their uncertainty in order to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper, we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over object pose angle. Because a single von Mises distribution makes strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model using a finite and infinite number of mixture components. Our model allows for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state of the art.
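
As a rough illustration of likelihood-based training with a von Mises mixture, the sketch below computes the negative log-likelihood of one angle under a finite mixture, using only the Python standard library. The function names and the series evaluation of the Bessel function are assumptions for illustration and are not taken from the paper. In a real model the weights, means and concentrations would be network outputs.

```python
import math


def log_bessel_i0(kappa, terms=60):
    """log I0(kappa) from the power series sum_k ((kappa/2)^(2k) / (k!)^2).

    Adequate for moderate concentrations (roughly kappa < 50); this
    truncation is an illustrative choice, not a production routine.
    """
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (kappa / 2.0) ** 2 / (k * k)
        total += term
    return math.log(total)


def von_mises_log_pdf(theta, mu, kappa):
    """Log density of a von Mises distribution at angle theta (radians)."""
    return kappa * math.cos(theta - mu) - math.log(2.0 * math.pi) - log_bessel_i0(kappa)


def mixture_nll(theta, weights, mus, kappas):
    """Negative log-likelihood of theta under a finite von Mises mixture.

    `weights` must be positive and sum to 1. The sum over components is
    evaluated with the log-sum-exp trick for numerical stability.
    """
    log_terms = [
        math.log(w) + von_mises_log_pdf(theta, mu, kappa)
        for w, mu, kappa in zip(weights, mus, kappas)
    ]
    peak = max(log_terms)
    return -(peak + math.log(sum(math.exp(t - peak) for t in log_terms)))
```

Minimising `mixture_nll` over a training set is the generic maximum-likelihood objective for such a model. Unlike a single component, the mixture can place probability mass on several distinct angles at once.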

    Multi-View Priors for Learning Detectors from Sparse Viewpoint Data

    While the majority of today's object class models provide only 2D bounding boxes, far richer output hypotheses are desirable, including viewpoint, fine-grained category, and 3D geometry estimates. However, models trained to provide richer output require larger amounts of training data that ideally cover the relevant aspects well, such as viewpoint and fine-grained categories. In this paper, we address this issue from the perspective of transfer learning, and design an object class model that explicitly leverages correlations between visual features. Specifically, our model represents prior distributions over permissible multi-view detectors in a parametric way -- the priors are learned once from training data of a source object class, and can later be used to facilitate the learning of a detector for a target class. As we show in our experiments, this transfer is not only beneficial for detectors based on basic-level category representations, but also enables the robust learning of detectors that represent classes at finer levels of granularity, where training data is typically even scarcer and more unbalanced. As a result, we report largely improved performance in simultaneous 2D object localization and viewpoint estimation on a recent dataset of challenging street scenes. Comment: 13 pages, 7 figures, 4 tables, International Conference on Learning Representations 201

    Channel cross-correlations in transport through complex media

    Measuring transmission between four antennas in microwave cavities, we investigate directly the channel cross-correlations $C$ of the cross sections $\sigma^{ab}$ from the antenna at $\vec{r}_a$ to the antenna at $\vec{r}_b$. Specifically, we look at $C_\Sigma$ and $C_\Lambda$, where the only difference is that $C_\Lambda$ has none of the four channels in common, whereas $C_\Sigma$ has exactly one channel in common. We find experimentally that these two channel cross-correlations are anti-phased as a function of the channel coupling strength, as predicted by theory. This anti-correlation is essential to give the correct values for the universal conductance fluctuations. To obtain good agreement between experiment and predictions from random matrix theory, the effect of absorption had to be included. Comment: 6 pages, 5 figures

    3D Object Class Detection in the Wild

    Object class detection has long been synonymous with 2D bounding box localization, fueled by the success of powerful statistical learning techniques combined with robust image representations. Only recently has there been a growing interest in revisiting the promise of computer vision from the early days: to precisely delineate the contents of a visual scene, object by object, in 3D. In this paper, we draw from recent advances in object detection and 2D-3D object lifting in order to design an object class detector that is particularly tailored towards 3D object class detection. Our 3D object class detection method consists of several stages that gradually enrich the object detection output with object viewpoint, keypoints and 3D shape estimates. Following careful design, each stage consistently improves performance, and the method achieves state-of-the-art performance in simultaneous 2D bounding box and viewpoint estimation on the challenging Pascal3D+ dataset.

    Folk Music and Law in Early Modern Mecklenburg (Volksmusik und Recht im frühneuzeitlichen Mecklenburg)

    Between the 16th and 19th centuries, the state influenced musical life in the duchies of Mecklenburg by regulating cultural events through ordinances and by granting privileges for musical performances at festivities (Aufwartungen). There were attempts to reduce the excessive costs of public events, to curb immoral behavior, to preserve the sanctity of holy days, and to suppress certain traditional events such as carnival (Fastnacht) and customary begging processions (Heischegänge). In the 17th century, the autonomy of the open countryside disappeared as trained town musicians (Stadtmusikanten) were granted privileges in all Mecklenburg administrative districts (Ämter). These privileges enabled a musician to establish a monopoly within a given administrative area. The intrusion of town musicians into the rural sphere changed traditional music.

    Video Propagation Networks

    We propose a technique that propagates information forward through video data. The method is conceptually simple and can be applied to tasks that require the propagation of structured information, such as semantic labels, based on video content. We propose a 'Video Propagation Network' that processes video frames in an adaptive manner. The model is applied online: it propagates information forward without the need to access future frames. In particular, we combine two components: a temporal bilateral network for dense and video-adaptive filtering, followed by a spatial network to refine features and increase flexibility. We present experiments on video object segmentation and semantic video segmentation and show increased performance compared to the best previous task-specific methods, while having favorable runtime. Additionally, we demonstrate our approach on an example regression task of color propagation in a grayscale video. Comment: Appearing in Computer Vision and Pattern Recognition, 2017 (CVPR'17)

    Spectral properties of microwave graphs with local absorption

    The influence of absorption on the spectra of microwave graphs has been studied experimentally. The microwave networks were made up of coaxial cables and T junctions. First, absorption was introduced by attaching a 50 Ohm load to an additional vertex for graphs with and without time-reversal symmetry. The resulting level-spacing distributions were compared with a generalization of the Wigner surmise in the presence of open channels proposed recently by Poli et al. [Phys. Rev. Lett. 108, 174101 (2012)]. Good agreement was found using an effective coupling parameter. Second, absorption was introduced along one individual bond via a variable microwave attenuator, and the influence of absorption on the length spectrum was studied. The peak heights in the length spectra corresponding to orbits avoiding the absorber were found to be independent of the attenuation, whereas the heights of the peaks belonging to orbits passing the absorber once or twice showed the expected decrease with increasing attenuation. Comment: 7 pages, 7 figures

    Semantic Video CNNs through Representation Warping

    In this work, we propose a technique to convert CNN models for semantic segmentation of static images into CNNs for video data. We describe a warping method that can be used to augment existing architectures with very little extra computational cost. This module is called NetWarp and we demonstrate its use for a range of network architectures. The main design principle is to use optical flow of adjacent frames for warping internal network representations across time. A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training. Experiments validate that the proposed approach incurs little extra computational cost while improving performance when video streams are available. We achieve new state-of-the-art results on the CamVid and Cityscapes benchmark datasets and show consistent improvements over different baseline networks. Our code and models will be available at http://segmentation.is.tue.mpg.de. Comment: ICCV 201
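
The core operation described above, warping a representation from one frame to the next using optical flow, can be sketched for a single feature channel as follows. This is a generic illustration and not the NetWarp module itself: the function name, the backward-sampling convention for the flow, and the border clamping are assumptions, and the sketch omits how the warped features are combined with those of the current frame.

```python
import math


def warp_features(features, flow):
    """Warp a previous-frame feature map into the current frame.

    features: H x W nested list of floats (one channel, previous frame).
    flow:     H x W nested list of (dx, dy) pairs. The output at pixel
              (x, y) is sampled from `features` at (x + dx, y + dy); this
              backward-sampling convention is an assumption of the sketch.
    Sampling is bilinear; sample coordinates are clamped to the border.
    """
    height, width = len(features), len(features[0])
    warped = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            sx = min(max(x + dx, 0.0), width - 1.0)
            sy = min(max(y + dy, 0.0), height - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
            ax, ay = sx - x0, sy - y0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1.0 - ax) * features[y0][x0] + ax * features[y0][x1]
            bottom = (1.0 - ax) * features[y1][x0] + ax * features[y1][x1]
            warped[y][x] = (1.0 - ay) * top + ay * bottom
    return warped
```

Because bilinear sampling is differentiable with respect to the sampled features almost everywhere, an operation of this kind can in principle sit inside a network that is trained end to end.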