Full Flow: Optical Flow Estimation By Global Optimization over Regular Grids
We present a global optimization approach to optical flow estimation. The
approach optimizes a classical optical flow objective over the full space of
mappings between discrete grids. No descriptor matching is used. The highly
regular structure of the space of mappings enables optimizations that reduce
the computational complexity of the algorithm's inner loop from quadratic to
linear and support efficient matching of tens of thousands of nodes to tens of
thousands of displacements. We show that one-shot global optimization of a
classical Horn-Schunck-type objective over regular grids at a single resolution
is sufficient to initialize continuous interpolation and achieve
state-of-the-art performance on challenging modern benchmarks. Comment: To be presented at CVPR 201
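To make the kind of objective mentioned in this abstract concrete, here is a minimal sketch of a Horn-Schunck-type energy evaluated on a discrete grid: a per-pixel data term plus a pairwise smoothness term over 4-connected neighbours. The function name `flow_energy`, the absolute-difference penalties, the border clamping, and the weight `lam` are all assumptions made for illustration; the abstract does not specify the paper's actual penalties or optimization procedure.

```python
def flow_energy(img1, img2, flow, lam=1.0):
    """Energy of an integer flow field on a regular grid (illustrative only).

    img1, img2: 2-D lists of intensities with identical shape.
    flow: 2-D list of (dy, dx) integer displacements, one per pixel.
    lam: weight of the smoothness term (hypothetical parameter).
    """
    h, w = len(img1), len(img1[0])
    data = 0.0
    smooth = 0.0
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            # Clamp the target so out-of-bounds displacements stay defined.
            ty = min(max(y + dy, 0), h - 1)
            tx = min(max(x + dx, 0), w - 1)
            # Data term: brightness difference between matched pixels.
            data += abs(img1[y][x] - img2[ty][tx])
            # Smoothness term: L1 difference to the right and down neighbours.
            if x + 1 < w:
                ndy, ndx = flow[y][x + 1]
                smooth += abs(dy - ndy) + abs(dx - ndx)
            if y + 1 < h:
                ndy, ndx = flow[y + 1][x]
                smooth += abs(dy - ndy) + abs(dx - ndx)
    return data + lam * smooth
```

A global method in this spirit would search over all per-pixel displacements for the field that minimizes this energy; the sketch only evaluates the energy for a given field.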
Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems
Optimization methods are at the core of many problems in signal/image
processing, computer vision, and machine learning. For a long time, it has been
recognized that looking at the dual of an optimization problem may drastically
simplify its solution. Deriving efficient strategies which jointly bring into
play the primal and the dual problems is, however, a more recent idea which has
generated many important new contributions in recent years. These novel
developments are grounded on recent advances in convex analysis, discrete
optimization, parallel processing, and non-smooth optimization with emphasis on
sparsity issues. In this paper, we aim at presenting the principles of
primal-dual approaches, while giving an overview of numerical methods which
have been proposed in different contexts. We show the benefits which can be
drawn from primal-dual algorithms both for solving large-scale convex
optimization problems and discrete ones, and we provide various application
examples to illustrate their usefulness.
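As one illustration of the primal-dual idea described in this abstract, the sketch below applies a first-order primal-dual iteration (in the style of Chambolle and Pock) to 1-D total-variation denoising, min_x 0.5*||x - b||^2 + lam*||Dx||_1, where D takes forward differences. The problem choice, the function name `tv_denoise_1d`, and the step sizes are assumptions for illustration; the overview paper covers a much broader family of methods.

```python
def tv_denoise_1d(b, lam=1.0, iters=500):
    """Primal-dual iteration for 1-D TV denoising (illustrative sketch)."""
    n = len(b)
    # Step sizes must satisfy tau * sigma * ||D||^2 < 1; here ||D||^2 <= 4.
    tau = sigma = 0.49
    x = list(b)            # primal variable
    x_bar = list(b)        # extrapolated primal variable
    y = [0.0] * (n - 1)    # dual variable, one entry per difference
    for _ in range(iters):
        # Dual step: ascent along D x_bar, then projection onto [-lam, lam].
        for i in range(n - 1):
            v = y[i] + sigma * (x_bar[i + 1] - x_bar[i])
            y[i] = max(-lam, min(lam, v))
        # Primal step: descent along D^T y, then prox of the quadratic term.
        x_new = []
        for i in range(n):
            left = y[i - 1] if i > 0 else 0.0
            right = y[i] if i < n - 1 else 0.0
            dty = left - right  # i-th entry of D^T y
            x_new.append((x[i] - tau * dty + tau * b[i]) / (1.0 + tau))
        # Extrapolation with theta = 1.
        x_bar = [2.0 * xn - xo for xn, xo in zip(x_new, x)]
        x = x_new
    return x
```

The dual update only needs a projection and the primal update only needs a closed-form prox, which is the practical appeal of handling both problems jointly: the non-smooth term never has to be minimized directly.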
DeepMatching: Hierarchical Deformable Dense Matching
We introduce a novel matching algorithm, called DeepMatching, to compute
dense correspondences between images. DeepMatching relies on a hierarchical,
multi-layer, correlational architecture designed for matching images and was
inspired by deep convolutional approaches. The proposed matching algorithm can
handle non-rigid deformations and repetitive textures and efficiently
determines dense correspondences in the presence of significant changes between
images. We evaluate the performance of DeepMatching, in comparison with
state-of-the-art matching algorithms, on the Mikolajczyk (Mikolajczyk et al.
2005), the MPI-Sintel (Butler et al. 2012) and the Kitti (Geiger et al. 2013)
datasets. DeepMatching outperforms the state-of-the-art algorithms and shows
excellent results in particular for repetitive textures. We also propose a
method for estimating optical flow, called DeepFlow, by integrating
DeepMatching in the large displacement optical flow (LDOF) approach of Brox and
Malik (2011). Compared to existing matching algorithms, additional robustness
to large displacements and complex motion is obtained thanks to our matching
approach. DeepFlow obtains competitive performance on public benchmarks for
optical flow estimation.
Line-constrained camera location estimation in multi-image stereomatching
Stereomatching is an effective way of acquiring dense depth information from a scene when active measurements are not possible. So-called lightfield methods take a snapshot from many camera locations along a defined trajectory (usually uniformly linear or on a regular grid; we assume a linear trajectory) and use this information to compute accurate depth estimates. However, they require the location of each snapshot to be known: the disparity of an object between two images depends on both the distance from the camera to the object and the distance between the two camera positions. Existing solutions use sparse feature matching for camera location estimation. In this paper, we propose a novel method that uses dense correspondences to do the same, leveraging an existing depth estimation framework to also yield the camera locations along the line. We illustrate the effectiveness of the proposed technique for camera location estimation both visually, through the rectification of epipolar plane images, and quantitatively, through its effect on the resulting depth estimation. Our proposed approach is a valid alternative to sparse techniques, while still running in a reasonable time on a graphics card due to its highly parallelizable nature.
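The dependence of disparity on depth and camera spacing noted in this abstract can be sketched with the standard rectified pinhole relation, disparity = focal length x baseline / depth. This is a textbook simplification, not the paper's estimation method; the function names and the assumption of a known focal length in pixels are illustrative.

```python
def disparity_px(focal_px, baseline, depth):
    """Disparity in pixels for a rectified pinhole pair (illustrative)."""
    return focal_px * baseline / depth


def baseline_from_disparity(focal_px, disparity, depth):
    """Invert the same relation: camera spacing from a known depth."""
    return disparity * depth / focal_px
```

For a fixed depth, disparity scales linearly with the camera spacing, which is why unknown positions along the line must be recovered before disparities can be turned into depth.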
VDIP-TGV: Blind Image Deconvolution via Variational Deep Image Prior Empowered by Total Generalized Variation
Recovering clear images from blurry ones with an unknown blur kernel is a
challenging problem. Deep image prior (DIP) proposes to use the deep network as
a regularizer for a single image rather than as a supervised model, which
achieves encouraging results in the nonblind deblurring problem. However, since
the relationship between images and the network architectures is unclear, it is
hard to find a suitable architecture to provide sufficient constraints on the
estimated blur kernels and clean images. Also, DIP uses the sparse maximum a
posteriori (MAP), which is insufficient to enforce the selection of the
recovery image. Recently, variational deep image prior (VDIP) was proposed to
impose constraints on both blur kernels and recovery images and take the
standard deviation of the image into account during the optimization process by
the variational principle. However, we empirically find that VDIP struggles
with processing image details and tends to generate suboptimal results when the
blur kernel is large. Therefore, we combine total generalized variation (TGV)
regularization with VDIP in this paper to overcome these shortcomings of VDIP.
TGV is a flexible regularization that utilizes the characteristics of partial
derivatives of varying orders to regularize images at different scales,
reducing oil painting artifacts while maintaining sharp edges. The proposed
VDIP-TGV effectively recovers image edges and details by supplementing extra
gradient information through TGV. Additionally, this model is solved by the
alternating direction method of multipliers (ADMM), which effectively combines
traditional algorithms and deep learning methods. Experiments show that our
proposed VDIP-TGV surpasses various state-of-the-art models quantitatively and
qualitatively. Comment: 13 pages, 5 figures
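For reference, the widely used second-order form of TGV can be written as follows. The abstract does not state which order or discretization the paper adopts, so this is given only as the standard definition, with weights \alpha_0, \alpha_1 > 0 and an auxiliary vector field w:

```latex
\mathrm{TGV}_{\alpha}^{2}(u)
  = \min_{w} \; \alpha_{1} \int_{\Omega} \lvert \nabla u - w \rvert \, dx
  + \alpha_{0} \int_{\Omega} \lvert \mathcal{E}(w) \rvert \, dx,
\qquad
\mathcal{E}(w) = \tfrac{1}{2}\left( \nabla w + \nabla w^{\top} \right)
```

The first term penalizes the part of the gradient not explained by w, and the second penalizes variation of w itself. This balance of first- and second-order derivatives lets the regularizer favour piecewise-smooth rather than piecewise-constant images.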
Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis
Factor analysis aims to determine latent factors, or traits, which summarize
a given data set. Inter-battery factor analysis extends this notion to multiple
views of the data. In this paper we show how a nonlinear, nonparametric version
of these models can be recovered through the Gaussian process latent variable
model. This gives us a flexible formalism for multi-view learning where the
latent variables can be used both for exploratory purposes and for learning
representations that enable efficient inference for ambiguous estimation tasks.
Learning is performed in a Bayesian manner through the formulation of a
variational compression scheme which gives a rigorous lower bound on the log
likelihood. Our Bayesian framework provides strong regularization during
training, allowing the structure of the latent space to be determined
efficiently and automatically. We demonstrate this by producing the first (to
our knowledge) published results of learning from dozens of views, even when
data is scarce. We further show experimental results on several different types
of multi-view data sets and for different kinds of tasks, including exploratory
data analysis, generation, ambiguity modelling through latent priors and
classification. Comment: 49 pages including appendix