Convolutional neural network architecture for geometric matching
We address the problem of determining correspondences between two images in
agreement with a geometric model such as an affine or thin-plate spline
transformation, and estimating its parameters. The contributions of this work
are three-fold. First, we propose a convolutional neural network architecture
for geometric matching. The architecture is based on three main components that
mimic the standard steps of feature extraction, matching and simultaneous
inlier detection and model parameter estimation, while being trainable
end-to-end. Second, we demonstrate that the network parameters can be trained
from synthetically generated imagery without the need for manual annotation and
that our matching layer significantly increases generalization capabilities to
never-before-seen images. Finally, we show that the same model can perform both
instance-level and category-level matching giving state-of-the-art results on
the challenging Proposal Flow dataset. Comment: In 2017 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR 2017)
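The abstract does not say how the matching step is computed. As a minimal sketch, assuming it can be modeled as a dense correlation between L2-normalized local descriptors (an assumption, not taken from the abstract), the hypothetical helper `correlation_map` below scores every descriptor of one image against every descriptor of the other:

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale a descriptor to (approximately) unit Euclidean length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / (n + eps) for c in v]

def correlation_map(feats_a, feats_b):
    """Return C with C[i][j] = <a_i, b_j> for L2-normalized descriptors.

    feats_a, feats_b: lists of equal-length descriptor vectors, one per
    spatial location. Names and layout are illustrative assumptions.
    """
    a = [l2_normalize(f) for f in feats_a]
    b = [l2_normalize(f) for f in feats_b]
    return [[sum(x * y for x, y in zip(fa, fb)) for fb in b] for fa in a]
```

In a trainable pipeline, a map like this would be passed to a regressor that predicts the parameters of the geometric model; that stage is omitted here.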
Shape Analysis Using Spectral Geometry
Shape analysis is a fundamental research topic in computer graphics and computer vision. Advanced acquisition devices such as laser scanners, depth cameras, and CT/MRI scanners produce ever more 3D data, and this growth demands advanced analysis tools for shape matching, retrieval, deformation, and related tasks. However, 3D shapes are subject to Euclidean transformations such as translation, scaling, and rotation, and digital mesh representations are irregularly sampled. A shape can also deform non-linearly, and the sampling may vary. To address these challenges, we investigate Laplace-Beltrami shape spectra from a differential geometry perspective, focusing on intrinsic properties. In this dissertation, shapes are represented as differentiable 2-manifolds.
First, we discuss in detail salient geometric feature points in the Laplace-Beltrami spectral domain rather than in traditional spatial domains. The local shape descriptor of a feature point is the Laplace-Beltrami spectrum of the spatial region associated with the point, which is stable and distinctive. The salient spectral geometric features are invariant to spatial Euclidean transforms, isometric deformations, and mesh triangulations. Both global and partial matching can be achieved with these salient feature points. Next, we introduce a novel method to analyze a set of poses, i.e., near-isometric deformations, of unregistered 3D models. The poses are transformed from the 3D spatial domain to a geometry spectral domain in which all near-isometric deformations, mesh triangulations, and Euclidean transformations are filtered away. Semantic parts of the model are then determined from the computed geometric properties of all mapped vertices in the geometry spectral domain, and a semantic skeleton can be built automatically from the detected joints. Finally, we prove that the shape spectrum is a continuous function of a scale function on the conformal factor of the manifold. The derivatives of the eigenvalues are expressed analytically in terms of those of the scale function. This property applies to both the continuous domain and discrete triangle meshes. On triangle meshes, a spectrum alignment algorithm is developed. Given two closed triangle meshes, the eigenvalues can be aligned from one to the other, and the eigenfunction distributions are aligned as well. This extends shape spectra across non-isometric deformations, supporting registration-free analysis of general motion data.
Two-View Matching with View Synthesis Revisited
Wide-baseline matching focussing on problems with extreme viewpoint change is
considered. We introduce the use of view synthesis with affine-covariant
detectors to solve such problems and show that matching with the Hessian-Affine
or MSER detectors outperforms the state-of-the-art ASIFT.
To minimise the loss of speed caused by view synthesis, we propose the
Matching On Demand with view Synthesis algorithm (MODS) that uses progressively
more synthesized images and more (time-consuming) detectors until reliable
estimation of geometry is possible. We show experimentally that the MODS
algorithm solves problems beyond the state-of-the-art and yet is comparable in
speed to standard wide-baseline matchers on simpler problems.
Minor contributions include an improved method for tentative correspondence
selection, applicable both with and without view synthesis and a view synthesis
setup greatly improving MSER robustness to blur and scale change that increases
its running time by 10% only. Comment: 25 pages, 14 figures
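The on-demand idea described above (use progressively more synthesized views and detectors until the geometry can be estimated reliably) can be sketched as a simple control loop. This is an illustrative skeleton only: the function `match_on_demand`, the `tiers` and `verify` callables, and the inlier-count stopping rule are hypothetical names and assumptions, not the authors' implementation, and accumulating correspondences across tiers is likewise an assumption.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Match = Tuple[int, int]  # (feature index in image 1, feature index in image 2)

def match_on_demand(
    tiers: Iterable[Callable[[], Sequence[Match]]],
    verify: Callable[[Sequence[Match]], Sequence[Match]],
    min_inliers: int,
) -> Tuple[Optional[Sequence[Match]], int]:
    """Run increasingly expensive matching tiers until verification succeeds.

    tiers: callables ordered from cheapest to most expensive; each returns
        tentative correspondences (e.g. from more synthesized views).
    verify: geometric verification; returns the inlier subset.
    Returns (inliers or None, number of tiers actually executed).
    """
    matches: List[Match] = []
    used = 0
    for tier in tiers:
        used += 1
        matches.extend(tier())        # assumption: correspondences accumulate
        inliers = verify(matches)
        if len(inliers) >= min_inliers:
            return inliers, used      # easy pairs stop early and stay fast
    return None, used
```

Because the loop exits at the first tier that yields enough inliers, simple image pairs pay only for the cheap tiers, which is the speed behavior the abstract describes.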
Multimedia Forensics
This book is open access. Media forensics has never been more relevant to societal life. Not only does media content represent an ever-increasing share of the data traveling on the net and the preferred means of communication for most users, it has also become an integral part of the most innovative applications in the digital information ecosystem that serves various sectors of society, from entertainment to journalism to politics. Undoubtedly, advances in deep learning and computational imaging contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge in establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape, powered by innovative imaging technologies and sophisticated tools based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensics capabilities that relate to media attribution, integrity and authenticity verification, and counter forensics. Its content is developed to provide practitioners, researchers, photo and video enthusiasts, and students with a holistic view of the field.
L1-norm global geometric consistency for partial-duplicate image retrieval
In all feature-point-based partial-duplicate image retrieval systems, false matching is a common issue. To tackle the problem, geometric contexts are widely applied to filter out inconsistent matches. This paper presents a novel method called L1-norm global geometric consistency. We first form the squared distance matrices of all the matched feature points, which remain invariant under translation and rotation between partial-duplicate images. Then we find the scale difference by solving a one-variable L1-norm error minimization problem, where the large sparse errors correspond to the locations of inconsistent matches. By adopting the Golden Section Search method, the minimization problem can be solved efficiently. Extensive experimental results show that our method reaches higher precision than state-of-the-art geometric verification methods in detecting inconsistent matches. Its speed is also highly competitive, even when compared to methods based on local geometric consistency. © 2014 IEEE.
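The steps named in this abstract can be illustrated with a small sketch. It assumes the matched points are related by a similarity transform, so that the squared distance matrices satisfy D_B ≈ t · D_A with t the squared scale, and it minimizes the entrywise L1 error over t by golden section search. The exact objective and outlier rule in the paper may differ; all function names here are hypothetical.

```python
import math

def sq_dist_matrix(pts):
    """Pairwise squared Euclidean distances between 2D points."""
    return [[(ax - bx) ** 2 + (ay - by) ** 2 for (bx, by) in pts]
            for (ax, ay) in pts]

def l1_cost(t, d_a, d_b):
    """Entrywise L1 error sum |D_B - t * D_A|; convex in the scalar t."""
    return sum(abs(b - t * a) for ra, rb in zip(d_a, d_b) for a, b in zip(ra, rb))

def golden_section_min(f, lo, hi, tol=1e-9):
    """Minimize a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                      # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                            # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

def estimate_scale_sq(pts_a, pts_b):
    """Return (t, residuals): squared scale and per-match L1 residuals.

    Assumes pts_a[i] is matched to pts_b[i] and that pts_a contains at
    least two distinct points.
    """
    d_a, d_b = sq_dist_matrix(pts_a), sq_dist_matrix(pts_b)
    ratios = [b / a for ra, rb in zip(d_a, d_b) for a, b in zip(ra, rb) if a > 0]
    t = golden_section_min(lambda s: l1_cost(s, d_a, d_b), 0.0, max(ratios))
    residuals = [sum(abs(b - t * a) for a, b in zip(ra, rb))
                 for ra, rb in zip(d_a, d_b)]
    return t, residuals
```

Matches whose residual is much larger than the rest are candidates for rejection. The L1 objective is used because a few grossly wrong matches shift its minimizer far less than they would shift a least-squares estimate.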
Parsing Objects at a Finer Granularity: A Survey
Fine-grained visual parsing, including fine-grained part segmentation and
fine-grained object recognition, has attracted considerable critical attention
due to its importance in many real-world applications, e.g., agriculture,
remote sensing, and space technologies. Predominant research efforts tackle
these fine-grained sub-tasks following different paradigms, while the inherent
relations between these tasks are neglected. Moreover, given most of the
research remains fragmented, we conduct an in-depth study of the advanced work
from a new perspective of learning the part relationship. In this perspective,
we first consolidate recent research and benchmark syntheses with new
taxonomies. Based on this consolidation, we revisit the universal challenges in
fine-grained part segmentation and recognition tasks and propose new solutions
by part relationship learning for these important challenges. Furthermore, we
outline several promising lines of future research in fine-grained visual
parsing. Comment: Survey for fine-grained part segmentation and object recognition;
Accepted by Machine Intelligence Research (MIR)
Structured Sparsity: Discrete and Convex approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed nowadays do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated structured sparsity models, which describe the
interdependency between the nonzero components of a signal, increasing the
interpretability of the results and leading to better recovery
performance. In order to better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications. Comment: 30 pages, 18 figures
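To make the contrast between a discrete group-sparse model and its convex relaxation concrete, here is a minimal sketch under two assumptions that are not taken from the chapter: the groups do not overlap, and the relaxation is the group-lasso penalty lam * sum_g ||x_g||_2. The function names are hypothetical.

```python
import math

def group_norms(x, groups):
    """Euclidean norm of x restricted to each group of indices."""
    return [math.sqrt(sum(x[i] ** 2 for i in g)) for g in groups]

def project_k_groups(x, groups, k):
    """Discrete model: keep the k groups with the largest energy, zero the rest."""
    norms = group_norms(x, groups)
    keep = sorted(range(len(groups)), key=lambda j: norms[j], reverse=True)[:k]
    out = [0.0] * len(x)
    for j in keep:
        for i in groups[j]:
            out[i] = x[i]
    return out

def group_soft_threshold(x, groups, lam):
    """Convex relaxation: proximal operator of lam * sum_g ||x_g||_2.

    Each group is shrunk toward zero as a block; groups whose norm is at
    most lam are set exactly to zero.
    """
    out = [0.0] * len(x)
    for g, n in zip(groups, group_norms(x, groups)):
        scale = max(0.0, 1.0 - lam / n) if n > 0 else 0.0
        for i in g:
            out[i] = scale * x[i]
    return out
```

The discrete projection keeps the selected groups unchanged but requires choosing k, while the proximal step selects groups through the threshold lam at the cost of shrinking the surviving coefficients.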