Approximate Nearest Neighbor Fields in Video
We introduce RIANN (Ring Intersection Approximate Nearest Neighbor search),
an algorithm for matching patches of a video to a set of reference patches in
real-time. For each query, RIANN finds potential matches by intersecting rings
around key points in appearance space. Its search complexity is inversely
correlated to the amount of temporal change, making it a good fit for videos,
where typically most patches change slowly with time. Experiments show that
RIANN is up to two orders of magnitude faster than previous ANN methods, and is
the only solution that operates in real-time. We further demonstrate how RIANN
can be used for real-time video processing and provide examples for a range of
real-time video applications, including colorization, denoising, and several
artistic effects. Comment: A CVPR 2015 oral paper
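The ring-intersection idea in this abstract can be illustrated with a small sketch. It assumes Euclidean distance, a hand-picked set of key points, and a ring half-width `eps`; the function names are hypothetical and this is not the RIANN implementation, which would precompute the reference-to-key-point distances rather than recompute them per query.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ring_candidates(query, refs, key_points, eps):
    """Indices of refs that lie in every ring |d(ref, k) - d(query, k)| <= eps."""
    candidates = set(range(len(refs)))
    for k in key_points:
        dq = dist(query, k)
        # Keep only references whose distance to this key point is close to the query's.
        candidates = {i for i in candidates if abs(dist(refs[i], k) - dq) <= eps}
    return candidates

def ring_nearest(query, refs, key_points, eps):
    """Best match among the ring-intersection candidates, or None if there are none."""
    cands = ring_candidates(query, refs, key_points, eps)
    if not cands:
        return None
    return min(cands, key=lambda i: dist(query, refs[i]))

# Toy data (assumed for illustration): four 2-D "patches" and two key points.
refs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
key_points = [(0.0, 0.0), (10.0, 0.0)]
query = (0.9, 0.1)
cands = ring_candidates(query, refs, key_points, eps=0.5)
best = ring_nearest(query, refs, key_points, eps=0.5)
```

By the triangle inequality, any reference within `eps` of the query falls inside every ring, so the intersection cannot discard a match that close; it can only admit extra candidates, which the final exact comparison filters out.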
Online and Offline Character Recognition Using Alignment to Prototypes
Nearest neighbor classifiers are simple to implement, yet they can model complex non-parametric distributions, and provide state-of-the-art recognition accuracy in OCR databases. At the same time, they may be too slow for practical character recognition, especially when they rely on similarity measures that require computationally expensive pairwise alignments between characters. This paper proposes an efficient method for computing an approximate similarity score between two characters based on their exact alignment to a small number of prototypes. The proposed method is applied to both online and offline character recognition, where similarity is based on widely used and computationally expensive alignment methods, i.e., Dynamic Time Warping and the Hungarian method respectively. In both cases significant recognition speedup is obtained at the expense of only a minor increase in recognition error. Office of Naval Research (N00014-03-1-0108); National Science Foundation (IIS-0308213, EIA-0202067
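One simple way to picture the prototype idea is to describe each sequence by its exact Dynamic Time Warping costs to a few prototypes, then compare those short vectors instead of aligning every pair. The sketch below does that for 1-D sequences; the prototypes, the L1 comparison, and all names are assumptions for illustration, and the paper's actual construction may differ.

```python
def dtw(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping cost between 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def embed(seq, prototypes):
    """Vector of exact DTW costs from seq to each prototype."""
    return [dtw(seq, p) for p in prototypes]

def approx_distance(ea, eb):
    """Cheap L1 comparison of two embeddings (an assumed choice of measure)."""
    return sum(abs(x - y) for x, y in zip(ea, eb))

# Hypothetical prototypes and database; embeddings are computed once, offline.
prototypes = [[0, 0, 0], [0, 1, 2], [2, 1, 0]]
database = {"rise": [0, 1, 1, 2], "fall": [2, 2, 1, 0]}
db_embed = {name: embed(seq, prototypes) for name, seq in database.items()}

# At query time only len(prototypes) alignments are needed, not one per database item.
q_embed = embed([0, 0, 1, 2], prototypes)
nearest = min(db_embed, key=lambda name: approx_distance(q_embed, db_embed[name]))
```

The trade-off matches the one the abstract reports: the number of expensive alignments per query drops to the number of prototypes, at the cost of an approximate rather than exact similarity score.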
EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow
We propose a novel approach for optical flow estimation, targeted at large
displacements with significant occlusions. It consists of two steps: i) dense
matching by edge-preserving interpolation from a sparse set of matches; ii)
variational energy minimization initialized with the dense matches. The
sparse-to-dense interpolation relies on an appropriate choice of the distance,
namely an edge-aware geodesic distance. This distance is tailored to handle
occlusions and motion boundaries -- two common and difficult issues for optical
flow computation. We also propose an approximation scheme for the geodesic
distance to allow fast computation without loss of performance. Subsequent to
the dense interpolation step, standard one-level variational energy
minimization is carried out on the dense matches to obtain the final flow
estimation. The proposed approach, called Edge-Preserving Interpolation of
Correspondences (EpicFlow) is fast and robust to large displacements. It
significantly outperforms the state of the art on MPI-Sintel and performs on
par on KITTI and Middlebury.
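To see why an edge-aware geodesic distance differs from a plain pixel distance, the sketch below runs Dijkstra's algorithm on a small grid where entering a pixel costs its edge strength. The 4-connected grid, the cost values, and the function name are assumptions for illustration; this is exact Dijkstra on a toy example, not the approximation scheme the abstract mentions.

```python
import heapq

def geodesic_distances(cost, source):
    """Dijkstra over a 4-connected grid; stepping into pixel (r, c) costs cost[r][c]."""
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    sr, sc = source
    dist[sr][sc] = 0.0
    heap = [(0.0, sr, sc)]
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist

# Toy edge map (assumed): column 2 is a strong image edge, e.g. a motion boundary.
edge_cost = [
    [1.0, 1.0, 9.0, 1.0],
    [1.0, 1.0, 9.0, 1.0],
    [1.0, 1.0, 9.0, 1.0],
]
d = geodesic_distances(edge_cost, (1, 0))
same_side = d[1][1]    # one step away, no edge crossed
across_edge = d[1][3]  # three steps away, but the path must cross the edge
```

A pixel across the edge ends up far from the source in geodesic terms even though it is close in the image, so a sparse match on one side of a motion boundary contributes little to the interpolated flow on the other side.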
On the optimality of shape and data representation in the spectral domain
A proof of the optimality of the eigenfunctions of the Laplace-Beltrami
operator (LBO) in representing smooth functions on surfaces is provided and
adapted to the field of applied shape and data analysis. It is based on the
Courant-Fischer min-max principle adapted to our case. The theorem we present
supports the new trend in geometry processing of treating geometric structures
by using their projection onto the leading eigenfunctions of the decomposition
of the LBO. This result can be used to construct numerically
efficient algorithms to process shapes in their spectrum. We review a couple of
applications as possible practical usage cases of the proposed optimality
criteria. We refer to a scale invariant metric, which is also invariant to
bending of the manifold. This novel pseudo-metric allows constructing an LBO by
which a scale invariant eigenspace on the surface is defined. We demonstrate
the efficiency of an intermediate metric, defined as an interpolation between
the scale invariant and the regular one, in representing geometric structures
while capturing both coarse and fine details. Next, we review a numerical
acceleration technique for classical scaling, a member of a family of
flattening methods known as multidimensional scaling (MDS). There, the
optimality is exploited to efficiently approximate all geodesic distances
between pairs of points on a given surface, and thereby match and compare
almost isometric surfaces. Finally, we revisit the classical principal
component analysis (PCA) definition by coupling its variational form with a
Dirichlet energy on the data manifold. By pairing the PCA with the LBO we can
handle cases that go beyond the scope defined by the observation set that is
handled by regular PCA.
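The kind of statement involved can be written out in a standard form. The notation below is an assumption (the abstract fixes none): S is a closed surface, \lambda_1 \le \lambda_2 \le \dots are the LBO eigenvalues with orthonormal eigenfunctions \phi_i, and r_n is the residual after projecting f onto the first n eigenfunctions. The paper's exact theorem and constants may differ from this sketch.

```latex
% Assumed setting: closed surface S, -\Delta_S \phi_i = \lambda_i \phi_i,
% 0 = \lambda_1 \le \lambda_2 \le \dots, and \{\phi_i\} orthonormal in L^2(S).
\begin{align}
  r_n &= f - \sum_{i=1}^{n} \langle f, \phi_i \rangle \phi_i
       = \sum_{i>n} \langle f, \phi_i \rangle \phi_i, \\
  \|\nabla f\|_2^2 &= \sum_{i \ge 1} \lambda_i \langle f, \phi_i \rangle^2
       \;\ge\; \lambda_{n+1} \sum_{i>n} \langle f, \phi_i \rangle^2
       = \lambda_{n+1} \, \|r_n\|_2^2, \\
  \text{hence}\quad \|r_n\|_2^2 &\le \frac{\|\nabla f\|_2^2}{\lambda_{n+1}}.
\end{align}
```

The bound says that smooth functions, those with small Dirichlet energy, are captured well by the leading eigenfunctions; the optimality claim in the abstract is that no other orthonormal basis gives a better guarantee of this type.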
Scalable Image Retrieval by Sparse Product Quantization
A fast Approximate Nearest Neighbor (ANN) search technique for high-dimensional
feature indexing and retrieval is the crux of large-scale image retrieval. A
recent promising technique is Product Quantization, which attempts to index
high-dimensional image features by decomposing the feature space into a
Cartesian product of low dimensional subspaces and quantizing each of them
separately. Despite the promising results reported, its quantization approach
follows the typical hard assignment of traditional quantization methods, which
may result in large quantization errors and thus inferior search performance.
Unlike the existing approaches, in this paper we propose a novel approach
called Sparse Product Quantization (SPQ) that encodes the high-dimensional
feature vectors into sparse representations. We optimize the sparse
representations of the feature vectors by minimizing their quantization errors,
making the resulting representation essentially close to the original data
in practice. Experiments show that the proposed SPQ technique not only
compresses data but is also an effective encoding technique. We obtain
state-of-the-art results for ANN search on four public image datasets and the
promising results of content-based image retrieval further validate the
efficacy of our proposed method. Comment: 12 pages
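For reference, the hard-assignment Product Quantization baseline that this abstract contrasts itself with can be sketched as follows. The codebooks are hand-written here as an assumption (in practice they would be learned, typically with k-means per subspace), the function names are hypothetical, and the sketch does not implement SPQ's sparse encoding.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec, m):
    """Split vec into m equal-length subvectors (len(vec) must be divisible by m)."""
    d = len(vec) // m
    return [vec[i * d:(i + 1) * d] for i in range(m)]

def pq_encode(vec, codebooks):
    """Hard assignment: index of the nearest codeword in each subspace."""
    subs = split(vec, len(codebooks))
    return [min(range(len(cb)), key=lambda k: sq_dist(s, cb[k]))
            for s, cb in zip(subs, codebooks)]

def pq_decode(code, codebooks):
    """Reconstruct a vector by concatenating the selected codewords."""
    out = []
    for k, cb in zip(code, codebooks):
        out.extend(cb[k])
    return out

# Toy setup (assumed): a 4-D vector, 2 subspaces, 2 codewords per subspace.
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 1.0), (1.0, 0.0)],
]
x = [0.9, 1.2, 0.1, 0.8]
code = pq_encode(x, codebooks)    # one small integer per subspace
x_hat = pq_decode(code, codebooks)
error = sq_dist(x, x_hat)         # quantization error of the hard assignment
```

Each subvector is replaced by exactly one codeword, so the reconstruction error is whatever distance remains to that single codeword. That residual is the quantization error the abstract attributes to hard assignment.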
Local Descriptors Optimized for Average Precision
Extraction of local feature descriptors is a vital stage in the solution
pipelines for numerous computer vision tasks. Learning-based approaches improve
performance in certain tasks, but still cannot replace handcrafted features in
general. In this paper, we improve the learning of local feature descriptors by
optimizing the performance of descriptor matching, which is a common stage that
follows descriptor extraction in local feature based pipelines, and can be
formulated as nearest neighbor retrieval. Specifically, we directly optimize a
ranking-based retrieval performance metric, Average Precision, using deep
neural networks. This general-purpose solution can also be viewed as a listwise
learning to rank approach, which is advantageous compared to recent local
ranking approaches. On standard benchmarks, descriptors learned with our
formulation achieve state-of-the-art results in patch verification, patch
retrieval, and image matching. Comment: 13 pages, 8 figures. IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 201
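The metric named in this abstract, Average Precision over a ranked list of candidate matches, can be computed as in the sketch below. The function names and toy descriptors are assumptions for illustration. The sketch only evaluates the metric for a given ranking; the abstract does not describe how the metric is made trainable, and this code does not attempt that.

```python
def average_precision(relevance):
    """AP of a ranked list; relevance[i] is truthy if the i-th ranked item is a true match."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0

def ap_for_query(query, candidates):
    """Rank (descriptor, is_match) pairs by squared distance to query, then score the ranking."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    ranked = sorted(candidates, key=lambda item: sq_dist(query, item[0]))
    return average_precision([is_match for _, is_match in ranked])

# Toy descriptors (assumed): two true matches, two non-matches.
query = (0.0, 0.0)
candidates = [
    ((0.1, 0.0), True),
    ((0.2, 0.0), False),
    ((0.3, 0.0), True),
    ((0.9, 0.9), False),
]
ap = ap_for_query(query, candidates)  # ranking is match, non-match, match, non-match
```

AP depends on the whole ordering of the list rather than on individual pairs or triplets, which is why the abstract describes the approach as listwise.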