2,163 research outputs found

    Convolutional neural network architecture for geometric matching

    We address the problem of determining correspondences between two images in agreement with a geometric model, such as an affine or thin-plate spline transformation, and estimating its parameters. The contributions of this work are three-fold. First, we propose a convolutional neural network architecture for geometric matching. The architecture is based on three main components that mimic the standard steps of feature extraction, matching, and simultaneous inlier detection and model parameter estimation, while being trainable end-to-end. Second, we demonstrate that the network parameters can be trained from synthetically generated imagery without the need for manual annotation, and that our matching layer significantly increases generalization to never-before-seen images. Finally, we show that the same model can perform both instance-level and category-level matching, giving state-of-the-art results on the challenging Proposal Flow dataset. Comment: In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)
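
    The abstract does not spell out how the matching step works. As a rough illustration only, the sketch below shows one common way to build such a step: a dense all-pairs correlation between L2-normalized feature descriptors. The function names (`l2_normalize`, `correlation_map`) and the toy descriptors are hypothetical and are not taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a descriptor to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v] if n > 0 else list(v)

def correlation_map(feats_a, feats_b):
    """All-pairs cosine similarity: entry [i][j] compares descriptor i of
    image A with descriptor j of image B. This is an assumed stand-in for a
    matching layer, not the authors' implementation."""
    a = [l2_normalize(v) for v in feats_a]
    b = [l2_normalize(v) for v in feats_b]
    return [[sum(p * q for p, q in zip(va, vb)) for vb in b] for va in a]

# Toy descriptors: three locations per image, three feature channels each.
feats_a = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
feats_b = [[0.0, 0.1, 5.0], [4.0, 0.2, 0.0], [0.1, 1.0, 0.0]]

corr = correlation_map(feats_a, feats_b)
# For each location in A, the index of the most similar location in B.
best = [max(range(len(row)), key=row.__getitem__) for row in corr]
```

    In an end-to-end network, a later stage would consume the whole correlation map rather than only the per-row maxima, so that it can weigh ambiguous matches when estimating the transformation.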

    Shape Analysis Using Spectral Geometry

    Shape analysis is a fundamental research topic in computer graphics and computer vision. More and more 3D data is being produced by advanced acquisition devices such as laser scanners, depth cameras, and CT/MRI scanners. This growing volume of data demands advanced analysis tools, including shape matching, retrieval, and deformation. However, 3D shapes are subject to Euclidean transformations such as translation, scaling, and rotation, and digital mesh representations are irregularly sampled. The shape can also deform non-linearly, and the sampling may vary. To address these challenging problems, we investigate Laplace-Beltrami shape spectra from the perspective of differential geometry, focusing on intrinsic properties. In this dissertation, shapes are represented as 2-manifolds, which are differentiable. First, we discuss in detail salient geometric feature points in the Laplace-Beltrami spectral domain rather than in the traditional spatial domain. The local shape descriptor of a feature point is the Laplace-Beltrami spectrum of the spatial region associated with the point; these descriptors are stable and distinctive. The salient spectral geometric features are invariant to spatial Euclidean transforms, isometric deformations, and mesh triangulations. Both global and partial matching can be achieved with these salient feature points. Next, we introduce a novel method to analyze a set of poses, i.e., near-isometric deformations, of 3D models that are unregistered. The different poses are transformed from the 3D spatial domain to a geometry spectral domain in which all near-isometric deformations, mesh triangulations, and Euclidean transformations are filtered away. Semantic parts of the model are then determined from the computed geometric properties of all the mapped vertices in the geometry spectral domain, and a semantic skeleton can be built automatically from the detected joints.
    Finally, we prove that the shape spectrum is a continuous function of a scale function on the conformal factor of the manifold. The derivatives of the eigenvalues are expressed analytically in terms of those of the scale function. The property applies to both the continuous domain and discrete triangle meshes. On triangle meshes, a spectrum alignment algorithm is developed. Given two closed triangle meshes, the eigenvalues can be aligned from one to the other, and the eigenfunction distributions are aligned as well. This extends the shape spectra across non-isometric deformations, supporting a registration-free analysis of general motion data.

    Two-View Matching with View Synthesis Revisited

    Wide-baseline matching focussing on problems with extreme viewpoint change is considered. We introduce the use of view synthesis with affine-covariant detectors to solve such problems and show that matching with the Hessian-Affine or MSER detectors outperforms the state-of-the-art ASIFT. To minimise the loss of speed caused by view synthesis, we propose the Matching On Demand with view Synthesis algorithm (MODS), which uses progressively more synthesized images and more (time-consuming) detectors until reliable estimation of geometry is possible. We show experimentally that the MODS algorithm solves problems beyond the state-of-the-art and yet is comparable in speed to standard wide-baseline matchers on simpler problems. Minor contributions include an improved method for tentative correspondence selection, applicable both with and without view synthesis, and a view synthesis setup that greatly improves MSER robustness to blur and scale change while increasing its running time by only 10%. Comment: 25 pages, 14 figures

    Multimedia Forensics

    This book is open access. Media forensics has never been more relevant to societal life. Not only does media content represent an ever-increasing share of the data traveling on the net and the preferred means of communication for most users, it has also become an integral part of most innovative applications in the digital information ecosystem that serves various sectors of society, from entertainment to journalism to politics. Undoubtedly, the advances in deep learning and computational imaging contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge in establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape, powered by innovative imaging technologies and sophisticated tools based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensics capabilities that relate to media attribution, integrity and authenticity verification, and counter forensics. Its content is developed to provide practitioners, researchers, photo and video enthusiasts, and students a holistic view of the field.

    L1-norm global geometric consistency for partial-duplicate image retrieval

    In all feature-point-based partial-duplicate image retrieval systems, false matching is a common issue. To tackle the problem, geometric contexts are widely applied to filter out inconsistent matches. This paper presents a novel method called L1-norm global geometric consistency. We first form the squared distance matrices of all the matched feature points, which remain invariant under translation and rotation between partial-duplicate images. Then we find the scale difference by solving a one-variable L1-norm error minimization problem, where the large sparse errors correspond to the locations of inconsistent matches. By adopting the Golden Section Search method, the minimization problem can be solved efficiently. Extensive experimental results show that our method reaches higher precision than state-of-the-art geometric verification methods in detecting inconsistent matches. Its speed is also highly competitive, even when compared to methods based on local geometric consistency. © 2014 IEEE.
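
    The pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the error is taken to be the entrywise L1 difference between one squared distance matrix and a scaled copy of the other, and the helper names, the toy points, and the rule of flagging the match with the largest row error are all hypothetical.

```python
import math

def sq_dist_matrix(pts):
    """Pairwise squared distances; unchanged by translation and rotation."""
    return [[(ax - bx) ** 2 + (ay - by) ** 2 for (bx, by) in pts]
            for (ax, ay) in pts]

def l1_error(t, d_a, d_b):
    """Entrywise L1 error between d_b and t * d_a (t is the squared scale)."""
    return sum(abs(b - t * a) for ra, rb in zip(d_a, d_b) for a, b in zip(ra, rb))

def golden_section_min(f, lo, hi, tol=1e-8):
    """Golden section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

# Toy data: image B is image A rotated by 90 degrees, scaled by 2, and shifted,
# except for the last match, which is deliberately wrong.
pts_a = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 1), (2, 2)]
pts_b = [(10, 5), (10, 13), (2, 5), (2, 13), (8, 9), (30, 40)]

d_a, d_b = sq_dist_matrix(pts_a), sq_dist_matrix(pts_b)
# The minimizer lies between 0 and the largest entrywise ratio.
upper = max(b / a for ra, rb in zip(d_a, d_b) for a, b in zip(ra, rb) if a > 0)
t_best = golden_section_min(lambda t: l1_error(t, d_a, d_b), 0.0, upper)
scale = math.sqrt(t_best)

# Row-wise residuals: a wrong match disagrees with every other point.
row_err = [sum(abs(b - t_best * a) for a, b in zip(ra, rb))
           for ra, rb in zip(d_a, d_b)]
suspect = max(range(len(row_err)), key=row_err.__getitem__)
```

    Because the L1 objective is convex in the single unknown, golden section search is guaranteed to converge, and the correct matches leave the recovered scale unaffected by the one wrong match in this toy case.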

    Parsing Objects at a Finer Granularity: A Survey

    Fine-grained visual parsing, including fine-grained part segmentation and fine-grained object recognition, has attracted considerable attention due to its importance in many real-world applications, e.g., agriculture, remote sensing, and space technologies. Predominant research efforts tackle these fine-grained sub-tasks following different paradigms, while the inherent relations between the tasks are neglected. Given that most of this research remains fragmented, we conduct an in-depth study of the advanced work from the new perspective of learning the part relationship. From this perspective, we first consolidate recent research and benchmark syntheses with new taxonomies. Based on this consolidation, we revisit the universal challenges in fine-grained part segmentation and recognition tasks and propose new solutions to these challenges based on part relationship learning. Furthermore, we identify several promising lines of future research in fine-grained visual parsing. Comment: Survey for fine-grained part segmentation and object recognition; Accepted by Machine Intelligence Research (MIR)

    Structured Sparsity: Discrete and Convex approaches

    Compressive sensing (CS) exploits sparsity to recover sparse or compressible signals from dimensionality-reducing, non-adaptive sensing mechanisms. Sparsity is also used to enhance interpretability in machine learning and statistics applications: while the ambient dimension is vast in modern data analysis problems, the relevant information therein typically resides in a much lower-dimensional space. However, many solutions proposed nowadays do not leverage the true underlying structure. Recent results in CS extend the simple sparsity idea to more sophisticated structured sparsity models, which describe the interdependency between the nonzero components of a signal, increasing the interpretability of the results and leading to better recovery performance. To better understand the impact of structured sparsity, in this chapter we analyze the connections between the discrete models and their convex relaxations, highlighting their relative advantages. We start with the general group sparse model and then elaborate on two important special cases: the dispersive and the hierarchical models. For each, we present the models in their discrete nature, discuss how to solve the ensuing discrete problems, and then describe convex relaxations. We also consider more general structures as defined by set functions and present their convex proxies. Further, we discuss efficient optimization solutions for structured sparsity problems and illustrate structured sparsity in action via three applications. Comment: 30 pages, 18 figures
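
    As a concrete instance of a convex relaxation for group sparsity, the sketch below applies the proximal operator of the group-lasso penalty (a sum of Euclidean norms over non-overlapping groups), often called block soft thresholding. Whether the chapter uses exactly this relaxation is an assumption; the function name, the groups, and the threshold `lam` are illustrative.

```python
import math

def group_soft_threshold(x, groups, lam):
    """Proximal operator of lam * sum_g ||x_g||_2 for non-overlapping groups.

    Each group is shrunk toward zero as a block: groups whose Euclidean norm
    is at most lam are zeroed entirely, the rest are scaled by (1 - lam/norm).
    Indices not covered by any group are set to zero in this sketch.
    """
    out = [0.0] * len(x)
    for g in groups:
        norm = math.sqrt(sum(x[i] ** 2 for i in g))
        if norm > lam:
            shrink = 1.0 - lam / norm
            for i in g:
                out[i] = shrink * x[i]
    return out

# Three groups of two coefficients each; the middle group is weak.
x = [3.0, 4.0, 0.1, -0.2, 0.0, 6.0]
groups = [[0, 1], [2, 3], [4, 5]]
y = group_soft_threshold(x, groups, lam=1.0)
```

    Unlike entrywise soft thresholding, the weak middle group is removed as a unit while the strong groups keep all of their coefficients, which is how the convex penalty encodes group structure.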