Learning to Extract Motion from Videos in Convolutional Neural Networks
This paper shows how to extract dense optical flow from videos with a
convolutional neural network (CNN). The proposed model constitutes a potential
building block for deeper architectures to allow using motion without resorting
to an external algorithm, e.g., for recognition in videos. We derive our network
architecture from signal processing principles to provide desired invariances
to image contrast, phase and texture. We constrain weights within the network
to enforce strict rotation invariance and substantially reduce the number of
parameters to learn. We demonstrate end-to-end training on only 8 sequences of
the Middlebury dataset, orders of magnitude less than competing CNN-based
motion estimation methods, and obtain comparable performance to classical
methods on the Middlebury benchmark. Importantly, our method outputs a
distributed representation of motion that allows representing multiple,
transparent motions, and dynamic textures. Our contributions on network design
and rotation invariance offer insights that extend beyond motion estimation.
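The weight constraints described above can be illustrated with a toy sketch. The code below is not the paper's actual architecture; `correlate2d` and the max-pooling scheme are illustrative assumptions. It ties one kernel's weights across its four 90° rotations, so the pooled response map rotates with the input instead of changing, which is one simple way to obtain rotation invariance from a single set of learned parameters:

```python
import numpy as np

def correlate2d(img, k):
    """Valid-mode 2D cross-correlation in pure NumPy (for illustration)."""
    kh, kw = k.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def rotation_invariant_response(img, kernel):
    """Apply the four 90-degree rotations of a single kernel and pool
    their responses with an elementwise max. Because the kernel set is
    closed under rotation, rotating the input rotates the pooled map."""
    responses = [correlate2d(img, np.rot90(kernel, r)) for r in range(4)]
    return np.max(np.stack(responses), axis=0)
```

Since the four rotated copies share one underlying kernel, the number of free parameters stays that of a single filter, mirroring the parameter reduction the abstract mentions.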
Limited Visibility and Uncertainty Aware Motion Planning for Automated Driving
Adverse weather conditions and occlusions in urban environments result in
impaired perception. The uncertainties are handled in different modules of an
automated vehicle, ranging from sensor level over situation prediction until
motion planning. This paper focuses on motion planning given an uncertain
environment model with occlusions. We present a method to remain collision free
for the worst-case evolution of the given scene. We define criteria that
measure the available margins to a collision while considering visibility and
interactions, and consequently integrate conditions that apply these criteria
into an optimization-based motion planner. We show the generality of our method
by validating it in several distinct urban scenarios
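The worst-case reasoning can be conveyed with a minimal sketch. This is not the paper's optimization-based planner; the function and its inputs are hypothetical, and it only captures the core criterion: against the worst-case evolution (a stationary obstacle hidden just beyond the visible range), the ego vehicle must be able to stop within what it can see:

```python
def collision_free_worst_case(ego_speed, visible_dist, a_brake):
    """Worst-case occlusion check: assume a stationary obstacle sits just
    past the last visible point. The ego vehicle remains collision free
    iff its stopping distance v^2 / (2a) fits inside the visible corridor.
    Quantities in SI units (m/s, m, m/s^2); all numbers are illustrative."""
    stopping_dist = ego_speed ** 2 / (2.0 * a_brake)
    return stopping_dist <= visible_dist
```

An optimization-based planner would turn such a criterion into a constraint (or a margin to maximize) rather than a hard boolean check.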
Controllable Attention for Structured Layered Video Decomposition
The objective of this paper is to be able to separate a video into its
natural layers, and to control which of the separated layers to attend to. For
example, to be able to separate reflections, transparency or object motion. We
make the following three contributions: (i) we introduce a new structured
neural network architecture that explicitly incorporates layers (as spatial
masks) into its design. This improves separation performance over previous
general purpose networks for this task; (ii) we demonstrate that we can augment
the architecture to leverage external cues such as audio for controllability
and to help disambiguation; and (iii) we experimentally demonstrate the
effectiveness of our approach and training procedure with controlled
experiments while also showing that the proposed model can be successfully
applied to real-world applications such as reflection removal and action
recognition in cluttered scenes.
Comment: In ICCV 201
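The layers-as-spatial-masks idea can be sketched in a few lines. The snippet below is an illustrative assumption about the forward compositing model, not the paper's network: each separated layer is recombined through a per-pixel mask, so attending to one layer amounts to selecting its mask:

```python
import numpy as np

def composite(layers, masks):
    """Recombine separated layers with per-pixel spatial masks.
    The masks are assumed normalized to sum to one at every pixel
    (a common convention; not necessarily the paper's exact model)."""
    out = np.zeros_like(layers[0], dtype=float)
    for layer, mask in zip(layers, masks):
        out = out + mask * layer
    return out
```

A decomposition network would run this model in reverse: predict the masks and layers whose composite reproduces the observed frame.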
Highly accurate optic flow computation with theoretically justified warping
In this paper, we suggest a variational model for optic flow computation based on non-linearised and higher-order constancy assumptions. Besides the common grey value constancy assumption, gradient constancy, as well as constancy of the Hessian and the Laplacian, are proposed. Since the model strictly refrains from linearising these assumptions, it is also capable of dealing with large displacements. For the minimisation of the rather complex energy functional, we present an efficient numerical scheme employing two nested fixed point iterations. Following a coarse-to-fine strategy, this scheme turns out to provide a theoretical foundation for so-called warping techniques, hitherto justified only on an experimental basis. Since our algorithm integrates various concepts, ranging from different constancy assumptions to numerical implementation issues, a detailed account of the effect of each of these concepts is included in the experimental section. The superior performance of the proposed method shows in significantly smaller estimation errors compared to previous techniques. Further experiments also confirm excellent robustness under noise and insensitivity to parameter variations.
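A plausible form of such an energy, written here for the grey value and gradient constancy terms only (the symbols $\gamma$, $\alpha$, $\Psi$, $\varepsilon$ are illustrative choices, not necessarily the paper's exact notation), is:

```latex
E(u,v) = \int_\Omega \Psi\!\left( \left| I(\mathbf{x}+\mathbf{w}) - I(\mathbf{x}) \right|^2
       + \gamma \left| \nabla I(\mathbf{x}+\mathbf{w}) - \nabla I(\mathbf{x}) \right|^2 \right) d\mathbf{x}
       + \alpha \int_\Omega \Psi\!\left( |\nabla u|^2 + |\nabla v|^2 \right) d\mathbf{x},
\qquad \Psi(s^2) = \sqrt{s^2 + \varepsilon^2},
```

where $\mathbf{w} = (u, v)^\top$ is the flow field. Keeping $I(\mathbf{x}+\mathbf{w})$ non-linearised (rather than Taylor-expanding it) is what allows large displacements and what the nested fixed point iterations are designed to handle.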
Verification of Smoke Detection in Video Sequences Based on Spatio-temporal Local Binary Patterns
The early detection of smoke in outdoor scenes using video sequences is one of the crucial tasks of modern surveillance systems. Real scenes may include objects that are similar to smoke, with dynamic behavior caused by low-resolution cameras, blurring, or weather conditions. Therefore, verification of smoke detection is a necessary stage in such systems. Verification confirms the true smoke regions once regions similar to smoke have been detected in a video sequence. The contributions are two-fold. First, many types of Local Binary Patterns (LBPs) in 2D and 3D variants were investigated in experiments according to the changing properties of smoke as a fire gains strength. Second, maps of brightness differences, edge maps, and Laplacian maps were studied in the Spatio-Temporal LBP (STLBP) specification. The descriptors are based on histograms, and classification into three classes (dense smoke, transparent smoke, and non-smoke) was implemented using the Kullback-Leibler divergence. The recognition accuracy reached 96–99% and 86–94% for dense smoke, depending on the type of LBP and on shooting artifacts, including noise.
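The two building blocks named in this abstract, an LBP code and a Kullback-Leibler comparison of histograms, can be sketched as follows. This is a generic 2D LBP and a textbook KL divergence, not the paper's STLBP descriptors; the bit ordering and smoothing constant are illustrative assumptions:

```python
import numpy as np

def lbp_code(patch):
    """Basic 8-neighbour LBP for a 3x3 patch: each neighbour whose value
    is >= the centre contributes one bit, read clockwise from top-left.
    A flat patch therefore yields the all-ones code 255."""
    center = patch[1, 1]
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(int(n >= center) << i for i, n in enumerate(neighbours))

def kl_divergence(p, q, eps=1e-9):
    """KL divergence between two histograms; eps avoids log(0) and is an
    arbitrary smoothing choice for this sketch."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```

In a verification stage, histograms of such codes from a candidate region would be compared against class reference histograms (dense smoke, transparent smoke, non-smoke), assigning the class with the smallest divergence.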
Optical Flow Estimation versus Motion Estimation
Optical flow estimation is often understood to be identical to dense image-based motion estimation. However, only under certain assumptions does optical flow coincide with the projection of the actual 3D motion onto the image plane. Most prominently, transparent and glossy scene surfaces or changes in illumination introduce a difference between the motion of objects in the world and the apparent motion. In this paper we summarize the types of problems occurring in this field and show examples for illustration.
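The gap between apparent motion and true motion can be demonstrated numerically. The 1D least-squares flow estimator below (a deliberately minimal sketch, not any paper's method) reports nonzero "motion" for a static scene whose illumination changes, because brightness constancy is violated:

```python
import numpy as np

def lucas_kanade_1d(frame0, frame1):
    """Single-scale 1D flow: solve I_x * u + I_t = 0 in least squares
    over the whole signal. Returns the estimated displacement u."""
    ix = np.gradient(frame0)          # spatial derivative
    it = frame1 - frame0              # temporal derivative
    return -float(ix @ it) / float(ix @ ix)

ramp = np.arange(10, dtype=float)
u_shift = lucas_kanade_1d(ramp, ramp - 1.0)   # for a linear ramp this
                                              # equals a 1-pixel shift
u_illum = lucas_kanade_1d(ramp, 1.1 * ramp)   # static scene, brightened
```

Here `u_shift` recovers the true displacement, while `u_illum` is nonzero even though nothing moved: apparent motion without 3D motion, exactly the discrepancy the abstract describes.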