An (MI)LP-based Primal Heuristic for 3-Architecture Connected Facility Location in Urban Access Network Design
We investigate the 3-architecture Connected Facility Location Problem arising
in the design of urban telecommunication access networks. We propose an
original optimization model for the problem that includes additional variables
and constraints to take into account wireless signal coverage. Since the
problem can prove challenging even for modern state-of-the-art optimization
solvers, we propose to solve it by an original primal heuristic which combines
a probabilistic fixing procedure, guided by peculiar Linear Programming
relaxations, with an exact MIP heuristic, based on a very large neighborhood
search. Computational experiments on a set of realistic instances show that our
heuristic can find solutions associated with much lower optimality gaps than a
state-of-the-art solver.

Comment: This is the authors' final version of the paper published in:
Squillero G., Burelli P. (eds), EvoApplications 2016: Applications of
Evolutionary Computation, LNCS 9597, pp. 283-298, 2016. DOI:
10.1007/978-3-319-31204-0_19. The final publication is available at Springer
via http://dx.doi.org/10.1007/978-3-319-31204-0_19
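The probabilistic fixing step described above can be sketched as follows. This is a minimal illustration, not the paper's method: the actual heuristic uses problem-specific LP relaxations and fixing rules, whereas here `lp_values`, `threshold`, and the fixing scheme are hypothetical stand-ins that fix a binary variable outright when its relaxed value is near 0 or 1, and otherwise fix it to 1 with probability equal to that value.

```python
import random

def probabilistic_fix(lp_values, threshold=0.9, rng=None):
    """LP-guided probabilistic fixing (illustrative sketch).

    lp_values: dict mapping variable name -> relaxed value in [0, 1].
    Variables close to 0 or 1 are fixed deterministically; the rest
    are fixed to 1 with probability equal to their relaxed value,
    leaving the remainder free for the exact MIP heuristic.
    """
    rng = rng or random.Random(0)
    fixed = {}
    for var, val in lp_values.items():
        if val >= threshold:
            fixed[var] = 1          # confidently open facility
        elif val <= 1 - threshold:
            fixed[var] = 0          # confidently closed facility
        elif rng.random() < val:
            fixed[var] = 1          # probabilistic fixing
        # otherwise: leave the variable free
    return fixed
```

The variables left unfixed would then be optimized exactly over the resulting (much smaller) neighborhood, matching the very-large-neighborhood-search idea in the abstract.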
Multi-Scale 3D Scene Flow from Binocular Stereo Sequences
Scene flow methods estimate the three-dimensional motion field for points in the world, using multi-camera video data. Such methods combine multi-view reconstruction with motion estimation. This paper describes an alternative formulation for dense scene flow estimation that provides reliable results using only two cameras by fusing stereo and optical flow estimation into a single coherent framework. Internally, the proposed algorithm generates probability distributions for optical flow and disparity. Taking into account the uncertainty in the intermediate stages allows for more reliable estimation of the 3D scene flow than previous methods allow. To handle the aperture problems inherent in the estimation of optical flow and disparity, a multi-scale method along with a novel region-based technique is used within a regularized solution. This combined approach both preserves discontinuities and prevents over-regularization, two problems commonly associated with the basic multi-scale approaches. Experiments with synthetic and real test data demonstrate the strength of the proposed approach.

National Science Foundation (CNS-0202067, IIS-0208876); Office of Naval Research (N00014-03-1-0108)
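The idea of carrying uncertainty from the intermediate stages into the final estimate can be illustrated with a toy example. Assuming (only for this sketch) that each intermediate estimate is summarized as a Gaussian, precision-weighted fusion combines a flow-based and a disparity-based estimate so that the less certain one contributes less; the paper itself works with full probability distributions rather than this two-moment simplification.

```python
def fuse_gaussian(mu1, var1, mu2, var2):
    """Precision-weighted fusion of two independent Gaussian estimates.

    Returns the mean and variance of the product of the two densities,
    i.e. the uncertainty-aware combination of the two measurements.
    """
    w1, w2 = 1.0 / var1, 1.0 / var2   # precisions (inverse variances)
    var = 1.0 / (w1 + w2)             # fused variance is always smaller
    mu = var * (w1 * mu1 + w2 * mu2)  # mean weighted toward the more certain input
    return mu, var
```

For example, fusing two equally uncertain estimates at 0 and 2 yields a mean of 1 with half the variance; if one input had much lower variance, the fused mean would sit close to it.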
Generalized Rank Pooling for Activity Recognition
Most popular deep models for action recognition split video sequences into
short sub-sequences consisting of a few frames; frame-based features are then
pooled for recognizing the activity. Usually, this pooling step discards the
temporal order of the frames, which could otherwise be used for better
recognition. Towards this end, we propose a novel pooling method, generalized
rank pooling (GRP), that takes as input features from the intermediate layers
of a CNN that is trained on tiny sub-sequences, and produces as output the
parameters of a subspace which (i) provides a low-rank approximation to the
features and (ii) preserves their temporal order. We propose to use these
parameters as a compact representation for the video sequence, which is then
used in a classification setup. We formulate an objective for computing this
subspace as a Riemannian optimization problem on the Grassmann manifold, and
propose an efficient conjugate gradient scheme for solving it. Experiments on
several activity recognition datasets show that our scheme leads to
state-of-the-art performance.

Comment: Accepted at IEEE International Conference on Computer Vision and
Pattern Recognition (CVPR), 201
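The low-rank-subspace part of the pooling step can be sketched as below. This captures only property (i) from the abstract: fitting an orthonormal basis that best approximates the frame features. The paper's actual GRP objective additionally enforces temporal order (property (ii)) and is solved by conjugate gradient on the Grassmann manifold; here a plain truncated SVD stands in for that optimization.

```python
import numpy as np

def subspace_pooling(features, k):
    """Pool a feature sequence into a k-dimensional subspace (sketch).

    features: (T, d) array of per-frame CNN features.
    Returns a (d, k) orthonormal basis whose span gives the best
    rank-k approximation of the sequence. The resulting point on the
    Grassmann manifold serves as the compact video representation.
    """
    # Top-k right singular vectors span the best-fitting subspace.
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    return vt[:k].T
```

The returned basis could then be fed to a classifier that respects the subspace (Grassmannian) geometry, as the abstract's classification setup suggests.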
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
We study the problem of synthesizing a number of likely future frames from a
single input image. In contrast to traditional methods, which have tackled this
problem in a deterministic or non-parametric way, we propose a novel approach
that models future frames in a probabilistic manner. Our probabilistic model
makes it possible for us to sample and synthesize many possible future frames
from a single input image. Future frame synthesis is challenging, as it
involves low- and high-level image and motion understanding. We propose a novel
network structure, namely a Cross Convolutional Network to aid in synthesizing
future frames; this network structure encodes image and motion information as
feature maps and convolutional kernels, respectively. In experiments, our model
performs well on synthetic data, such as 2D shapes and animated game sprites,
as well as on real-world videos. We also show that our model can be applied to
tasks such as visual analogy-making, and present an analysis of the learned
network representations.

Comment: The first two authors contributed equally to this work.
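The core "cross convolution" operation, convolving image-derived feature maps with motion-derived kernels, can be sketched as below. This is a minimal per-channel version with hypothetical shapes; the network in the paper predicts the kernels from a latent motion code, and like most deep-learning "convolutions" the operation here is actually cross-correlation (the kernel is not flipped).

```python
import numpy as np

def cross_convolve(feature_maps, kernels):
    """Convolve each image feature map with its own predicted kernel.

    feature_maps: (C, H, W) array encoding image content.
    kernels:      (C, kh, kw) array encoding motion, one kernel per channel.
    Returns the 'valid' cross-correlation per channel,
    shape (C, H - kh + 1, W - kw + 1).
    """
    C, H, W = feature_maps.shape
    _, kh, kw = kernels.shape
    out = np.zeros((C, H - kh + 1, W - kw + 1))
    for c in range(C):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[c, i, j] = np.sum(
                    feature_maps[c, i:i + kh, j:j + kw] * kernels[c]
                )
    return out
```

Sampling different motion codes would yield different kernels and hence different plausible future frames from the same input image, which is the probabilistic aspect the abstract emphasizes.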
Fast Predictive Image Registration
We present a method to predict image deformations based on patch-wise image
appearance. Specifically, we design a patch-based deep encoder-decoder network
which learns the pixel/voxel-wise mapping between image appearance and
registration parameters. Our approach can predict general deformation
parameterizations, however, we focus on the large deformation diffeomorphic
metric mapping (LDDMM) registration model. By predicting the LDDMM
momentum-parameterization we retain the desirable theoretical properties of
LDDMM, while reducing computation time by orders of magnitude: combined with
patch pruning, we achieve a 1500x/66x speed up compared to GPU-based
optimization for 2D/3D image registration. Our approach has better prediction
accuracy than predicting deformation or velocity fields and results in
diffeomorphic transformations. Additionally, we create a Bayesian probabilistic
version of our network, which allows evaluation of deformation field
uncertainty through Monte Carlo sampling using dropout at test time. We show
that deformation uncertainty highlights areas of ambiguous deformations. We
test our method on the OASIS brain image dataset in 2D and 3D.
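The test-time dropout procedure for uncertainty estimation is a standard Monte Carlo dropout scheme and can be sketched generically. The `predict` interface below is hypothetical: it stands in for the Bayesian network with dropout left active at test time, so repeated calls give different samples whose spread estimates deformation uncertainty.

```python
import numpy as np

def mc_dropout_uncertainty(predict, x, n_samples=50, rng=None):
    """Monte Carlo dropout at test time (generic sketch).

    predict: callable (x, rng) -> prediction array; assumed to apply
             dropout internally on every call, so outputs vary.
    Returns the per-output sample mean (the prediction) and sample
    variance (the uncertainty estimate, e.g. per deformation momentum).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    samples = np.stack([predict(x, rng) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)
```

High-variance regions of the output would correspond to the "areas of ambiguous deformations" the abstract describes.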
- ā¦