PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
We present a compact but effective CNN model for optical flow, called
PWC-Net. PWC-Net has been designed according to simple and well-established
principles: pyramidal processing, warping, and the use of a cost volume. Cast
in a learnable feature pyramid, PWC-Net uses the current optical flow
estimate to warp the CNN features of the second image. It then uses the warped
features and features of the first image to construct a cost volume, which is
processed by a CNN to estimate the optical flow. PWC-Net is 17 times smaller in
size and easier to train than the recent FlowNet2 model. Moreover, it
outperforms all published optical flow methods on the MPI Sintel final pass and
KITTI 2015 benchmarks, running at about 35 fps on Sintel resolution (1024x436)
images. Our models are available at https://github.com/NVlabs/PWC-Net. Comment: CVPR 2018 camera-ready version (with GitHub link to Caffe and PyTorch code)
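The warping and cost-volume steps described above can be sketched outside the network. Below is a minimal NumPy illustration, using nearest-neighbour warping in place of the bilinear sampling PWC-Net actually uses and a plain correlation volume; `warp` and `cost_volume` are illustrative names, not the released API:

```python
import numpy as np

def warp(feat, flow):
    """Backward-warp a feature map (H, W, C) by a flow field (H, W, 2).
    Nearest-neighbour sampling for brevity (the network uses bilinear)."""
    H, W, _ = feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    x_src = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    y_src = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    return feat[y_src, x_src]

def cost_volume(feat1, feat2_warped, max_disp=2):
    """Correlation cost volume over a (2*max_disp+1)^2 search window:
    one channel per candidate displacement of the warped second image."""
    H, W, C = feat1.shape
    d = max_disp
    pad = np.pad(feat2_warped, ((d, d), (d, d), (0, 0)))
    vols = []
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            shifted = pad[dy:dy + H, dx:dx + W]
            vols.append((feat1 * shifted).mean(axis=-1))
    return np.stack(vols, axis=-1)  # (H, W, (2d+1)^2)
```

In the real network this volume is computed per pyramid level and fed to a small CNN that refines the flow estimate, which is then upsampled to initialize the next level.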
Dynamic Estimation of Rigid Motion from Perspective Views via Recursive Identification of Exterior Differential Systems with Parameters on a Topological Manifold
We formulate the problem of estimating the motion of a rigid object viewed under perspective projection as the identification of a dynamic model in Exterior Differential form with parameters on a topological manifold.
We first describe a general method for recursive identification of nonlinear implicit systems using prediction error criteria. The parameters are allowed to move slowly on some topological (not necessarily smooth) manifold. The basic recursion is solved in two different ways: one is based on a simple extension of the traditional Kalman Filter to nonlinear and implicit measurement constraints, the other may be regarded as a generalized "Gauss-Newton" iteration, akin to traditional Recursive Prediction Error Method techniques in linear identification. A derivation of the "Implicit Extended Kalman Filter" (IEKF) is reported in the appendix.
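The IEKF measurement update mentioned above can be sketched generically. The interface below is hypothetical (`h` returns the implicit residual of the constraint h(x, y) = 0, and `Hx`, `Hy` are its Jacobians with respect to state and measurement); it is a standard implicit-constraint update, not the paper's exact derivation:

```python
import numpy as np

def iekf_update(x, P, y, R, h, Hx, Hy):
    """One measurement update of an Implicit EKF for a constraint
    h(x, y) = 0, given measurement y with noise covariance R."""
    e = h(x, y)                       # implicit residual
    C, D = Hx(x, y), Hy(x, y)         # linearization at (x, y)
    S = C @ P @ C.T + D @ R @ D.T     # residual covariance
    K = P @ C.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x - K @ e                 # state correction
    P_new = (np.eye(len(x)) - K @ C) @ P
    return x_new, P_new
```

When the constraint is explicit, h(x, y) = x - y, the update reduces to the ordinary Kalman filter correction, which is a useful sanity check.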
The identification framework is then applied to the visual motion problem: it is indeed possible to characterize it as the identification of an Exterior Differential System with parameters living on a C0 topological manifold, called the "essential manifold". We consider two alternative estimation paradigms. The first works in the local coordinates of the essential manifold: we estimate the state of a nonlinear implicit model on a linear space. The second performs a linear update on the (linear) embedding space followed by a projection onto the essential manifold. These schemes proved successful in performing the motion estimation task, as we show in experiments on real and noisy synthetic image sequences.
Image segmentation and feature extraction for recognizing strokes in tennis game videos
This paper addresses the problem of recognizing human actions from video. In particular, the case of recognizing events in tennis game videos is analyzed. Driven by our domain knowledge, a robust player-segmentation algorithm is developed for real video data. Further, we introduce a number of novel features to be extracted for our particular application. Different feature combinations are investigated in order to find the optimal one. Finally, recognition results for different classes of tennis strokes using the automatic learning capability of Hidden Markov Models (HMMs) are presented. The experimental results demonstrate that our method is close to computing statistics of tennis games automatically from ordinary TV broadcast videos.
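The HMM-based recognition step can be illustrated with a minimal likelihood classifier: one HMM per stroke class, with the class chosen by the scaled forward algorithm. The parameters and class names below are toy assumptions, not the paper's trained models:

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log-likelihood of a discrete
    observation sequence under an HMM (pi: initial state probs,
    A: transition matrix, B: per-state emission probs)."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

def classify(obs, models):
    """Pick the stroke class whose HMM gives the highest likelihood."""
    return max(models, key=lambda c: forward_loglik(obs, *models[c]))
```

In the paper's setting the observations would be the extracted player features (quantized or modelled with continuous emissions) rather than toy symbols.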
Generalized least squares-based parametric motion estimation and segmentation
Motion analysis is one of the most important fields in computer vision, since the real world is in continuous motion and moving scenes clearly carry far more information than static ones. This thesis focuses mainly on developing motion estimation algorithms for application to image registration and motion segmentation problems. One of the main objectives of this work is a highly accurate image registration technique that is tolerant to outliers and works even in the presence of large deformations such as translations, rotations, scale changes, and global or spatially non-uniform illumination changes. Another objective is to address motion estimation and motion segmentation in two-image sequences quasi-simultaneously and without a priori knowledge of the number of motion models present. The experiments reported in this work show that the algorithms proposed in this thesis achieve highly accurate results.

This thesis proposes several techniques related to the motion estimation problem. In particular, it deals with global motion estimation for image registration
and motion segmentation. In the first case, we will suppose that the majority of
the pixels of the image follow the same motion model, although the possibility
of a large number of outliers is also considered. In the motion segmentation
problem, the presence of more than one motion model will be considered. In
both cases, sequences of two consecutive grey level images will be used.
A new generalized least squares-based motion estimator will be proposed. The
proposed formulation of the motion estimation problem provides an additional
constraint that helps to match the pixels using image gradient information. That
is achieved thanks to the use of a weight for each observation, providing high
weight values to the observations considered as inliers, and low values to the ones
considered as outliers. To avoid falling into a local minimum, the proposed motion estimator uses a feature-based method (SIFT) to obtain good initial
motion parameters. Therefore, it can deal with large motions such as translations,
rotations, scale changes, viewpoint changes, etc.
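The weighting idea can be illustrated with a toy iteratively reweighted least-squares estimator for a pure translation model: each pixel contributes a brightness-constancy residual, and pixels with large residuals are downweighted as likely outliers. The Cauchy-style weight function and the translation-only model are simplifying assumptions, not the thesis's exact generalized least-squares formulation:

```python
import numpy as np

def irls_flow(Ix, Iy, It, iters=10, c=1.0):
    """Estimate a global translation (u, v) from brightness-constancy
    residuals Ix*u + Iy*v + It, reweighting each pixel so that likely
    outliers get low weight."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    w = np.ones(len(b))
    for _ in range(iters):
        sw = np.sqrt(w)                # weighted LS via scaled rows
        uv, *_ = np.linalg.lstsq(A * sw[:, None], sw * b, rcond=None)
        r = A @ uv - b                 # per-pixel residual
        w = 1.0 / (1.0 + (r / c) ** 2) # large residual -> low weight
    return uv
```

A feature-based initialization (as the thesis does with SIFT) would seed `uv` before the first iteration so that the linearized residuals are valid even under large motions.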
The accuracy of our approach has been tested on challenging real images
using both affine and projective motion models. Two motion estimation techniques, which use M-estimators to deal with outliers in an iteratively reweighted
least squares-based strategy, have been selected to compare the accuracy of our
approach. The results obtained show that the proposed motion estimator
achieves results as accurate as M-estimator-based techniques, and even better
in most cases.
The problem of accurately estimating the motion under non-uniform illumination changes will also be considered. A modification of the proposed global
motion estimator will be introduced to deal with this kind of illumination change.
In particular, a dynamic image model where the illumination factors are functions of location will be used, replacing the brightness constancy assumption and allowing for a more general and accurate image model. Experiments using
challenging images will be performed showing that the combination of both techniques is feasible and provides accurate estimates of the motion parameters even
in the presence of strong illumination changes between the images.
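One concrete instance of such a dynamic image model is a gain/offset pair that varies affinely with position, I2 ≈ a(x, y)·I1 + b(x, y). The sketch below fits this model by least squares on already-aligned images; this isolates the illumination part and is a simplification of the joint motion-and-illumination estimation described above:

```python
import numpy as np

def fit_illumination(I1, I2):
    """Fit a spatially varying gain a and offset b, each affine in
    position, so that I2 ~ a * I1 + b (least squares over all pixels)."""
    H, W = I1.shape
    ys, xs = np.mgrid[0:H, 0:W]
    ones = np.ones_like(I1, dtype=float)
    # design matrix columns: I1*[1, x, y] for the gain, [1, x, y] for the offset
    A = np.stack([I1, I1 * xs, I1 * ys, ones, xs, ys], -1).reshape(-1, 6)
    coef, *_ = np.linalg.lstsq(A, I2.ravel(), rcond=None)
    a = coef[0] + coef[1] * xs + coef[2] * ys
    b = coef[3] + coef[4] * xs + coef[5] * ys
    return a, b
```

Because the model stays linear in its six coefficients, it slots directly into the same (iteratively reweighted) least-squares machinery used for the motion parameters.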
The last part of the thesis deals with the motion estimation and segmentation problem. The proposed algorithm exploits temporal information, through the
proposed generalized least-squares motion estimation process, and spatial information, through an iterative region growing algorithm that classifies regions of
pixels into the different motion models present in the sequence. In addition, it
can extract the different moving regions of the scene while estimating their motion
quasi-simultaneously and without a priori information about the number of moving
objects in the scene. The performance of the algorithm will be tested on synthetic
and real images with multiple objects undergoing different types of motion.
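A toy version of the classification step might look as follows: each pixel is assigned to the candidate motion model with the smallest brightness-constancy residual, and the label map is then smoothed by neighbourhood majority voting. The real algorithm grows regions iteratively while re-estimating the models and discovering their number, which this sketch omits:

```python
import numpy as np

def segment_motions(Ix, Iy, It, models, iters=3):
    """Label each pixel with the candidate translation model (u, v)
    minimizing |Ix*u + Iy*v + It|, then smooth the labels with a few
    majority-vote passes over 4-neighbourhoods."""
    res = np.stack([np.abs(Ix * u + Iy * v + It) for u, v in models])
    labels = res.argmin(axis=0)
    for _ in range(iters):
        padded = np.pad(labels, 1, mode='edge')
        votes = np.stack([padded[1:-1, :-2], padded[1:-1, 2:],
                          padded[:-2, 1:-1], padded[2:, 1:-1], labels])
        # majority over the 4 neighbours plus the pixel itself
        labels = np.array([[np.bincount(votes[:, i, j]).argmax()
                            for j in range(labels.shape[1])]
                           for i in range(labels.shape[0])])
    return labels
```

The smoothing pass plays the role of the spatial-coherence prior: isolated misclassified pixels are absorbed into the surrounding motion region.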
Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
This work addresses the problem of semantic scene understanding under dense
fog. Although considerable progress has been made in semantic scene
understanding, it is mainly related to clear-weather scenes. Extending
recognition methods to adverse weather conditions such as fog is crucial for
outdoor applications. In this paper, we propose a novel method, named
Curriculum Model Adaptation (CMAda), which gradually adapts a semantic
segmentation model from light synthetic fog to dense real fog in multiple
steps, using both synthetic and real foggy data. In addition, we present three
other main stand-alone contributions: 1) a novel method to add synthetic fog to
real, clear-weather scenes using semantic input; 2) a new fog density
estimator; 3) the Foggy Zurich dataset comprising real foggy images,
with pixel-level semantic annotations for images with dense fog. Our
experiments show that 1) our fog simulation slightly outperforms a
state-of-the-art competing simulation with respect to the task of semantic
foggy scene understanding (SFSU); 2) CMAda improves the performance of
state-of-the-art models for SFSU significantly by leveraging unlabeled real
foggy data. The datasets and code are publicly available. Comment: final version, ECCV 2018
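The fog-simulation ingredient can be illustrated with the standard optical model, in which transmittance decays exponentially with scene depth. The function below is a generic sketch (the paper's pipeline additionally exploits semantic input and denoised depth), with `beta` playing the role of the fog-density knob that a CMAda-style curriculum would increase in steps:

```python
import numpy as np

def add_fog(image, depth, beta, L=1.0):
    """Standard optical fog model: I_fog = t * I + (1 - t) * L,
    with transmittance t = exp(-beta * depth). Larger beta (or larger
    depth) means denser fog; L is the atmospheric light."""
    t = np.exp(-beta * depth)[..., None]   # (H, W, 1), broadcasts over channels
    return t * image + (1.0 - t) * L
```

Setting `beta = 0` recovers the clear-weather image, while large `beta * depth` drives every pixel toward the atmospheric light, which is exactly the light-to-dense progression the curriculum exploits.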