Search CORE

31 research outputs found

Adaptive structure tensors and their applications

Author: A. R. Rao
B. Horn
B. Jähne
D. Tschumperlé
F. Andreu
F. Lauze
H. Haußecker
H.-H. Nagel
I. Thomas
J. Bigün
J. L. Barron
J. Weickert
J. Weickert
J. Weickert
K. Rohr
K. Rohr
L. I. Rudin
M. J. Black
M. Kass
M. Middendorf
M. Nagao
P. Mrázek
P. Perona
S. M. Smith
S. Zenzo Di
T. Brox
T. Brox
W. Förstner
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.1 - Mathematik
Publication date: 01/01/2005
Field of study

The structure tensor, also known as second moment matrix or Förstner interest operator, is a very popular tool in image processing. Its purpose is the estimation of orientation and the local analysis of structure in general. It is based on the integration of data from a local neighborhood. Normally, this neighborhood is defined by a Gaussian window function and the structure tensor is computed by the weighted sum within this window. Some recently proposed methods, however, adapt the computation of the structure tensor to the image data. There are several ways how to do that. This article wants to give an overview of the different approaches, whereas the focus lies on the methods based on robust statistics and nonlinear diffusion. Furthermore, the dataadaptive structure tensors are evaluated in some applications. Here the main focus lies on optic flow estimation, but also texture analysis and corner detection are considered

Crossref

INRIA a CCSD electronic archive server

Copenhagen University Research Information System

Universaar

Acronym

HAL-Ecole des Ponts ParisTech

HAL-Rennes 1

Pareto Meets Huber: Efficiently Avoiding Poor Minima in Robust Estimation

Author: Bourmaud Guillaume
Zach Christopher
Publication venue: HAL CCSD
Publication date: 01/01/2019
Field of study

International audienceRobust cost optimization is the task of fitting parameters to data points containing outliers. In particular, we focus on large-scale computer vision problems, such as bundle adjustment , where Non-Linear Least Square (NLLS) solvers are the current workhorse. In this context, NLLS-based state of the art algorithms have been designed either to quickly improve the target objective and find a local minimum close to the initial value of the parameters, or to have a strong ability to avoid poor local minima. In this paper, we propose a novel algorithm relying on multi-objective optimization which allows to match those two properties. We experimentally demonstrate that our algorithm has an ability to avoid poor local minima that is on par with the best performing algorithms with a faster decrease of the target objective

Crossref

Chalmers Research

Building and Interpreting Deep Similarity Models

Author: Büttner J.
Eberle O.
Kräutli F.
Montavon G.
Müller K.
Valleriani M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2022
Field of study

MPG.PuRe

Locating moving objects in car-driving sequences

Author
Publication venue: Springer
Publication date: 01/05/2014
Field of study

Springer - Publisher Connector

Towards Adversarial Robustness of Deep Vision Algorithms

Author: Yan Hanshu
Publication venue
Publication date: 21/11/2022
Field of study

Deep learning methods have achieved great success in solving computer vision tasks, and they have been widely utilized in artificially intelligent systems for image processing, analysis, and understanding. However, deep neural networks have been shown to be vulnerable to adversarial perturbations in input data. The security issues of deep neural networks have thus come to the fore. It is imperative to study the adversarial robustness of deep vision algorithms comprehensively. This talk focuses on the adversarial robustness of image classification models and image denoisers. We will discuss the robustness of deep vision algorithms from three perspectives: 1) robustness evaluation (we propose the ObsAtk to evaluate the robustness of denoisers), 2) robustness improvement (HAT, TisODE, and CIFS are developed to robustify vision models), and 3) the connection between adversarial robustness and generalization capability to new domains (we find that adversarially robust denoisers can deal with unseen types of real-world noise).Comment: PhD thesi

arXiv.org e-Print Archive

The generative learning and discriminative fitting of linear deformable models

Author: Saragih Jason Mora
Publication venue
Publication date: 17/08/2018
Field of study

The Australian National University

Recommended from our members

Foreground detection of video through the integration of novel multiple detection algorithims

Author: Nawaz Muhammad
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel UniversityThe main outcomes of this research are the design of a foreground detection algorithm, which is more accurate and less time consuming than existing algorithms. By the term accuracy we mean an exact mask (which satisfies the respective ground truth value) of the foreground object(s). Motion detection being the prior component of foreground detection process can be achieved via pixel based and block based methods, both of which have their own merits and disadvantages. Pixel based methods are efficient in terms of accuracy but a time consuming process, so cannot be recommended for real time applications. On the other hand block based motion estimation has relatively less accuracy but consumes less time and is thus ideal for real-time applications. In the first proposed algorithm, block based motion estimation technique is opted for timely execution. To overcome the issue of accuracy another morphological based technique was adopted called opening-and-closing by reconstruction, which is a pixel based operation so produces higher accuracy and requires lesser time in execution. Morphological operation opening-and-closing by reconstruction finds the maxima and minima inside the foreground object(s). Thus this novel simultaneous process compensates for the lower accuracy of block based motion estimation. To verify the efficiency of this algorithm a complex video consisting of multiple colours, and fast and slow motions at various places was selected. Based on 11 different performance measures the proposed algorithm achieved an average accuracy of more than 24.73% than four of the well-established algorithms. Background subtraction, being the most cited algorithm for foreground detection, encounters the major problem of proper threshold value at run time. For effective value of the threshold at run time in background subtraction algorithm, the primary component of the foreground detection process, motion is used, in this next proposed algorithm. For the said purpose the smooth histogram peaks and valley of the motion were analyzed, which reflects the high and slow motion areas of the moving object(s) in the given frame and generates the threshold value at run time by exploiting the values of peaks and valley. This proposed algorithm was tested using four recommended video sequences including indoor and outdoor shoots, and were compared with five high ranked algorithms. Based on the values of standard performance measures, the proposed algorithm achieved an average of more than 12.30% higher accuracy results

Brunel University Research Archive

Learning Inference Models for Computer Vision

Author: Jampani Varun
Publication venue: Universität Tübingen
Publication date: 01/01/2016
Field of study

Computer vision can be understood as the ability to perform 'inference' on image data. Breakthroughs in computer vision technology are often marked by advances in inference techniques, as even the model design is often dictated by the complexity of inference in them. This thesis proposes learning based inference schemes and demonstrates applications in computer vision. We propose techniques for inference in both generative and discriminative computer vision models. Despite their intuitive appeal, the use of generative models in vision is hampered by the difficulty of posterior inference, which is often too complex or too slow to be practical. We propose techniques for improving inference in two widely used techniques: Markov Chain Monte Carlo (MCMC) sampling and message-passing inference. Our inference strategy is to learn separate discriminative models that assist Bayesian inference in a generative model. Experiments on a range of generative vision models show that the proposed techniques accelerate the inference process and/or converge to better solutions. A main complication in the design of discriminative models is the inclusion of prior knowledge in a principled way. For better inference in discriminative models, we propose techniques that modify the original model itself, as inference is simple evaluation of the model. We concentrate on convolutional neural network (CNN) models and propose a generalization of standard spatial convolutions, which are the basic building blocks of CNN architectures, to bilateral convolutions. First, we generalize the existing use of bilateral filters and then propose new neural network architectures with learnable bilateral filters, which we call `Bilateral Neural Networks'. We show how the bilateral filtering modules can be used for modifying existing CNN architectures for better image segmentation and propose a neural network approach for temporal information propagation in videos. Experiments demonstrate the potential of the proposed bilateral networks on a wide range of vision tasks and datasets. In summary, we propose learning based techniques for better inference in several computer vision models ranging from inverse graphics to freely parameterized neural networks. In generative vision models, our inference techniques alleviate some of the crucial hurdles in Bayesian posterior inference, paving new ways for the use of model based machine learning in vision. In discriminative CNN models, the proposed filter generalizations aid in the design of new neural network architectures that can handle sparse high-dimensional data as well as provide a way for incorporating prior knowledge into CNNs

arXiv.org e-Print Archive

Publikationsserver der Universität Tübingen

MPG.PuRe

Model-based Optical Flow: Layers, Learning, and Geometry

Author: Wulff Jonas
Publication venue: Universität Tübingen
Publication date: 01/01/2017
Field of study

The estimation of motion in video sequences establishes temporal correspondences between pixels and surfaces and allows reasoning about a scene using multiple frames. Despite being a focus of research for over three decades, computing motion, or optical flow, remains challenging due to a number of difficulties, including the treatment of motion discontinuities and occluded regions, and the integration of information from more than two frames. One reason for these issues is that most optical flow algorithms only reason about the motion of pixels on the image plane, while not taking the image formation pipeline or the 3D structure of the world into account. One approach to address this uses layered models, which represent the occlusion structure of a scene and provide an approximation to the geometry. The goal of this dissertation is to show ways to inject additional knowledge about the scene into layered methods, making them more robust, faster, and more accurate. First, this thesis demonstrates the modeling power of layers using the example of motion blur in videos, which is caused by fast motion relative to the exposure time of the camera. Layers segment the scene into regions that move coherently while preserving their occlusion relationships. The motion of each layer therefore directly determines its motion blur. At the same time, the layered model captures complex blur overlap effects at motion discontinuities. Using layers, we can thus formulate a generative model for blurred video sequences, and use this model to simultaneously deblur a video and compute accurate optical flow for highly dynamic scenes containing motion blur. Next, we consider the representation of the motion within layers. Since, in a layered model, important motion discontinuities are captured by the segmentation into layers, the flow within each layer varies smoothly and can be approximated using a low dimensional subspace. We show how this subspace can be learned from training data using principal component analysis (PCA), and that flow estimation using this subspace is computationally efficient. The combination of the layered model and the low-dimensional subspace gives the best of both worlds, sharp motion discontinuities from the layers and computational efficiency from the subspace. Lastly, we show how layered methods can be dramatically improved using simple semantics. Instead of treating all layers equally, a semantic segmentation divides the scene into its static parts and moving objects. Static parts of the scene constitute a large majority of what is shown in typical video sequences; yet, in such regions optical flow is fully constrained by the depth structure of the scene and the camera motion. After segmenting out moving objects, we consider only static regions, and explicitly reason about the structure of the scene and the camera motion, yielding much better optical flow estimates. Furthermore, computing the structure of the scene allows to better combine information from multiple frames, resulting in high accuracies even in occluded regions. For moving regions, we compute the flow using a generic optical flow method, and combine it with the flow computed for the static regions to obtain a full optical flow field. By combining layered models of the scene with reasoning about the dynamic behavior of the real, three-dimensional world, the methods presented herein push the envelope of optical flow computation in terms of robustness, speed, and accuracy, giving state-of-the-art results on benchmarks and pointing to important future research directions for the estimation of motion in natural scenes

Publikationsserver der Universität Tübingen

MPG.PuRe