PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
We present a compact but effective CNN model for optical flow, called
PWC-Net. PWC-Net has been designed according to simple and well-established
principles: pyramidal processing, warping, and the use of a cost volume. Cast
in a learnable feature pyramid, PWC-Net uses the current optical flow
estimate to warp the CNN features of the second image. It then uses the warped
features and features of the first image to construct a cost volume, which is
processed by a CNN to estimate the optical flow. PWC-Net is 17 times smaller in
size and easier to train than the recent FlowNet2 model. Moreover, it
outperforms all published optical flow methods on the MPI Sintel final pass and
KITTI 2015 benchmarks, running at about 35 fps on Sintel resolution (1024x436)
images. Our models are available at https://github.com/NVlabs/PWC-Net. Comment: CVPR 2018 camera-ready version (with GitHub link to Caffe and PyTorch code)
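The warping and cost-volume steps described above can be sketched outside the network. Below is a minimal NumPy illustration, using nearest-neighbour warping in place of the bilinear sampling PWC-Net actually uses and a plain correlation volume; `warp` and `cost_volume` are illustrative names, not the released API:

```python
import numpy as np

def warp(feat, flow):
    """Backward-warp a feature map (H, W, C) by a flow field (H, W, 2).
    Nearest-neighbour sampling for brevity (the network uses bilinear)."""
    H, W, _ = feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    x_src = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    y_src = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    return feat[y_src, x_src]

def cost_volume(feat1, feat2_warped, max_disp=2):
    """Correlation cost volume over a (2*max_disp+1)^2 search window:
    one channel per candidate displacement of the warped second image."""
    H, W, C = feat1.shape
    d = max_disp
    pad = np.pad(feat2_warped, ((d, d), (d, d), (0, 0)))
    vols = []
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            shifted = pad[dy:dy + H, dx:dx + W]
            vols.append((feat1 * shifted).mean(axis=-1))
    return np.stack(vols, axis=-1)  # (H, W, (2d+1)^2)
```

In the real network this volume is computed per pyramid level and fed to a small CNN that refines the flow estimate, which is then upsampled to initialize the next level.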
Dynamic Estimation of Rigid Motion from Perspective Views via Recursive Identification of Exterior Differential Systems with Parameters on a Topological Manifold
We formulate the problem of estimating the motion of a rigid object viewed under perspective projection as the identification of a dynamic model in Exterior Differential form with parameters on a topological manifold.
We first describe a general method for recursive identification of nonlinear implicit systems using prediction error criteria. The parameters are allowed to move slowly on some topological (not necessarily smooth) manifold. The basic recursion is solved in two different ways: one is based on a simple extension of the traditional Kalman Filter to nonlinear and implicit measurement constraints, the other may be regarded as a generalized "Gauss-Newton" iteration, akin to traditional Recursive Prediction Error Method techniques in linear identification. A derivation of the "Implicit Extended Kalman Filter" (IEKF) is reported in the appendix.
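The IEKF measurement update mentioned above can be sketched generically. The interface below is hypothetical (`h` returns the implicit residual of the constraint h(x, y) = 0, and `Hx`, `Hy` are its Jacobians with respect to state and measurement); it is a standard implicit-constraint update, not the paper's exact derivation:

```python
import numpy as np

def iekf_update(x, P, y, R, h, Hx, Hy):
    """One measurement update of an Implicit EKF for a constraint
    h(x, y) = 0, given measurement y with noise covariance R."""
    e = h(x, y)                       # implicit residual
    C, D = Hx(x, y), Hy(x, y)         # linearization at (x, y)
    S = C @ P @ C.T + D @ R @ D.T     # residual covariance
    K = P @ C.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x - K @ e                 # state correction
    P_new = (np.eye(len(x)) - K @ C) @ P
    return x_new, P_new
```

When the constraint is explicit, h(x, y) = x - y, the update reduces to the ordinary Kalman filter correction, which is a useful sanity check.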
The identification framework is then applied to the visual motion problem: it is indeed possible to characterize it as the identification of an Exterior Differential System with parameters living on a C0 topological manifold, called the "essential manifold". We consider two alternative estimation paradigms. The first works in the local coordinates of the essential manifold: we estimate the state of a nonlinear implicit model on a linear space. The second performs a linear update on the (linear) embedding space followed by a projection onto the essential manifold. These schemes proved successful in performing the motion estimation task, as we show in experiments on real and noisy synthetic image sequences.
Image segmentation and feature extraction for recognizing strokes in tennis game videos
This paper addresses the problem of recognizing human actions from video. In particular, the case of recognizing events in tennis game videos is analyzed. Driven by our domain knowledge, a robust player-segmentation algorithm is developed for real video data. Further, we introduce a number of novel features to be extracted for our particular application. Different feature combinations are investigated in order to find the optimal one. Finally, recognition results for different classes of tennis strokes using the automatic learning capability of Hidden Markov Models (HMMs) are presented. The experimental results demonstrate that our method is close to computing statistics of tennis games automatically from ordinary TV broadcast videos.
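The HMM-based recognition step can be illustrated with a minimal likelihood classifier: one HMM per stroke class, with the class chosen by the scaled forward algorithm. The parameters and class names below are toy assumptions, not the paper's trained models:

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log-likelihood of a discrete
    observation sequence under an HMM (pi: initial state probs,
    A: transition matrix, B: per-state emission probs)."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

def classify(obs, models):
    """Pick the stroke class whose HMM gives the highest likelihood."""
    return max(models, key=lambda c: forward_loglik(obs, *models[c]))
```

In the paper's setting the observations would be the extracted player features (quantized or modelled with continuous emissions) rather than toy symbols.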
Generalized least squares-based parametric motion estimation and segmentation
Motion analysis is one of the most important fields in computer vision, since the real world is in continuous motion and moving scenes clearly carry far more information than static ones. This thesis focuses mainly on developing motion estimation algorithms for application to image registration and motion segmentation problems. One of the main objectives of this work is a highly accurate image registration technique that is tolerant to outliers and works even in the presence of large deformations such as translations, rotations, scale changes, and global or spatially non-uniform illumination changes. Another objective is to address motion estimation and motion segmentation in two-image sequences quasi-simultaneously and without a priori knowledge of the number of motion models present. The experiments reported in this work show that the algorithms proposed in this thesis achieve highly accurate results.

This thesis proposes several techniques related to the motion estimation problem. In particular, it deals with global motion estimation for image registration
and motion segmentation. In the first case, we will suppose that the majority of
the pixels of the image follow the same motion model, although the possibility
of a large number of outliers is also considered. In the motion segmentation
problem, the presence of more than one motion model will be considered. In
both cases, sequences of two consecutive grey level images will be used.
A new generalized least squares-based motion estimator will be proposed. The
proposed formulation of the motion estimation problem provides an additional
constraint that helps to match the pixels using image gradient information. That
is achieved thanks to the use of a weight for each observation, providing high
weight values to the observations considered as inliers, and low values to the ones
considered as outliers. To avoid falling into a local minimum, the proposed motion estimator uses a feature-based method (SIFT) to obtain good initial
motion parameters. Therefore, it can deal with large motions such as translations,
rotations, scale changes, viewpoint changes, etc.
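The weighting idea can be illustrated with a toy iteratively reweighted least-squares estimator for a pure translation model: each pixel contributes a brightness-constancy residual, and pixels with large residuals are downweighted as likely outliers. The Cauchy-style weight function and the translation-only model are simplifying assumptions, not the thesis's exact generalized least-squares formulation:

```python
import numpy as np

def irls_flow(Ix, Iy, It, iters=10, c=1.0):
    """Estimate a global translation (u, v) from brightness-constancy
    residuals Ix*u + Iy*v + It, reweighting each pixel so that likely
    outliers get low weight."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    w = np.ones(len(b))
    for _ in range(iters):
        sw = np.sqrt(w)                # weighted LS via scaled rows
        uv, *_ = np.linalg.lstsq(A * sw[:, None], sw * b, rcond=None)
        r = A @ uv - b                 # per-pixel residual
        w = 1.0 / (1.0 + (r / c) ** 2) # large residual -> low weight
    return uv
```

A feature-based initialization (as the thesis does with SIFT) would seed `uv` before the first iteration so that the linearized residuals are valid even under large motions.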
The accuracy of our approach has been tested on challenging real images
using both affine and projective motion models. Two motion estimation techniques, which use M-estimators to deal with outliers in an iteratively reweighted
least squares-based strategy, have been selected to compare the accuracy of our
approach. The results obtained show that the proposed motion estimator
achieves results as accurate as M-estimator-based techniques, and even better
in most cases.
The problem of accurately estimating the motion under non-uniform illumination changes will also be considered. A modification of the proposed global
motion estimator will be introduced to deal with this kind of illumination change.
In particular, a dynamic image model where the illumination factors are functions of location will be used, replacing the brightness constancy assumption and allowing for a more general and accurate image model. Experiments using
challenging images will be performed showing that the combination of both techniques is feasible and provides accurate estimates of the motion parameters even
in the presence of strong illumination changes between the images.
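One concrete instance of such a dynamic image model is a gain/offset pair that varies affinely with position, I2 ≈ a(x, y)·I1 + b(x, y). The sketch below fits this model by least squares on already-aligned images; this isolates the illumination part and is a simplification of the joint motion-and-illumination estimation described above:

```python
import numpy as np

def fit_illumination(I1, I2):
    """Fit a spatially varying gain a and offset b, each affine in
    position, so that I2 ~ a * I1 + b (least squares over all pixels)."""
    H, W = I1.shape
    ys, xs = np.mgrid[0:H, 0:W]
    ones = np.ones_like(I1, dtype=float)
    # design matrix columns: I1*[1, x, y] for the gain, [1, x, y] for the offset
    A = np.stack([I1, I1 * xs, I1 * ys, ones, xs, ys], -1).reshape(-1, 6)
    coef, *_ = np.linalg.lstsq(A, I2.ravel(), rcond=None)
    a = coef[0] + coef[1] * xs + coef[2] * ys
    b = coef[3] + coef[4] * xs + coef[5] * ys
    return a, b
```

Because the model stays linear in its six coefficients, it slots directly into the same (iteratively reweighted) least-squares machinery used for the motion parameters.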
The last part of the thesis deals with the motion estimation and segmentation problem. The proposed algorithm exploits temporal information, through the
proposed generalized least-squares motion estimation process, and spatial information, through an iterative region growing algorithm that classifies regions of
pixels into the different motion models present in the sequence. In addition, it
can extract the different moving regions of the scene while estimating their motion
quasi-simultaneously and without a priori information about the number of moving
objects in the scene. The performance of the algorithm will be tested on synthetic
and real images with multiple objects undergoing different types of motion.
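A toy version of the classification step might look as follows: each pixel is assigned to the candidate motion model with the smallest brightness-constancy residual, and the label map is then smoothed by neighbourhood majority voting. The real algorithm grows regions iteratively while re-estimating the models and discovering their number, which this sketch omits:

```python
import numpy as np

def segment_motions(Ix, Iy, It, models, iters=3):
    """Label each pixel with the candidate translation model (u, v)
    minimizing |Ix*u + Iy*v + It|, then smooth the labels with a few
    majority-vote passes over 4-neighbourhoods."""
    res = np.stack([np.abs(Ix * u + Iy * v + It) for u, v in models])
    labels = res.argmin(axis=0)
    for _ in range(iters):
        padded = np.pad(labels, 1, mode='edge')
        votes = np.stack([padded[1:-1, :-2], padded[1:-1, 2:],
                          padded[:-2, 1:-1], padded[2:, 1:-1], labels])
        # majority over the 4 neighbours plus the pixel itself
        labels = np.array([[np.bincount(votes[:, i, j]).argmax()
                            for j in range(labels.shape[1])]
                           for i in range(labels.shape[0])])
    return labels
```

The smoothing pass plays the role of the spatial-coherence prior: isolated misclassified pixels are absorbed into the surrounding motion region.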
Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
This work addresses the problem of semantic scene understanding under dense
fog. Although considerable progress has been made in semantic scene
understanding, it is mainly related to clear-weather scenes. Extending
recognition methods to adverse weather conditions such as fog is crucial for
outdoor applications. In this paper, we propose a novel method, named
Curriculum Model Adaptation (CMAda), which gradually adapts a semantic
segmentation model from light synthetic fog to dense real fog in multiple
steps, using both synthetic and real foggy data. In addition, we present three
other main stand-alone contributions: 1) a novel method to add synthetic fog to
real, clear-weather scenes using semantic input; 2) a new fog density
estimator; 3) the Foggy Zurich dataset comprising real foggy images,
with pixel-level semantic annotations for images with dense fog. Our
experiments show that 1) our fog simulation slightly outperforms a
state-of-the-art competing simulation with respect to the task of semantic
foggy scene understanding (SFSU); 2) CMAda improves the performance of
state-of-the-art models for SFSU significantly by leveraging unlabeled real
foggy data. The datasets and code are publicly available. Comment: final version, ECCV 2018
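The fog-simulation ingredient can be illustrated with the standard optical model, in which transmittance decays exponentially with scene depth. The function below is a generic sketch (the paper's pipeline additionally exploits semantic input and denoised depth), with `beta` playing the role of the fog-density knob that a CMAda-style curriculum would increase in steps:

```python
import numpy as np

def add_fog(image, depth, beta, L=1.0):
    """Standard optical fog model: I_fog = t * I + (1 - t) * L,
    with transmittance t = exp(-beta * depth). Larger beta (or larger
    depth) means denser fog; L is the atmospheric light."""
    t = np.exp(-beta * depth)[..., None]   # (H, W, 1), broadcasts over channels
    return t * image + (1.0 - t) * L
```

Setting `beta = 0` recovers the clear-weather image, while large `beta * depth` drives every pixel toward the atmospheric light, which is exactly the light-to-dense progression the curriculum exploits.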