Deep point-to-subspace metric learning for sketch-based 3D shape retrieval
One key issue in managing a large-scale 3D shape dataset is to identify an effective way to retrieve a shape of interest. The sketch-based query, which offers flexibility in representing the user's intention, has received growing interest in recent years due to the popularization of touchscreen technology. Essentially, a sketch depicts an abstraction of a shape in a certain view, while the shape contains the full 3D information. Matching between them is a cross-modality retrieval problem, and the state-of-the-art solution is to project the sketch and the 3D shape into a common space in which the cross-modality similarity can be calculated from the feature similarity or distance. However, for a given query, only some of the viewpoints of the 3D shape are representative. Thus, blindly projecting a 3D shape into a feature vector without considering the query will inevitably introduce query-unrepresentative information. To handle this issue, in this work we propose a Deep Point-to-Subspace Metric Learning (DPSML) framework to project a sketch into a feature vector and a 3D shape into a subspace spanned by a few selected basis feature vectors. The similarity between them is defined as the distance between the query feature vector and its closest point in the subspace, obtained by solving an optimization problem on the fly. Note that the closest point is query-adaptive and can reflect the viewpoint information that is representative of the given query. To efficiently learn such a deep model, we formulate it as a classification problem with a special classifier design. To reduce the redundancy of 3D shapes, we also introduce a Representative-View Selection (RVS) module to select the most representative views of a 3D shape.
By conducting extensive experiments on various datasets, we show that the proposed method can achieve superior performance over its competitive baseline methods and attain the state-of-the-art performance.
Yinjie Lei, Ziqin Zhou, Pingping Zhang, Yulan Guo, Zijun Ma, Lingqiao Li
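As a rough illustration of the point-to-subspace distance described in the abstract above, the following sketch treats the "closest point" as an ordinary least-squares projection of the query feature onto the span of the basis vectors. This is an assumption: the abstract only says an optimization problem is solved on the fly, and the actual DPSML objective may differ (for example, it may be regularized). The function name and the toy vectors are hypothetical.

```python
import numpy as np

def point_to_subspace_distance(query, basis):
    """Distance from a query vector to the span of the basis columns.

    query: shape (d,)   sketch feature vector (hypothetical input)
    basis: shape (d, k) columns are basis feature vectors of one 3D shape
    Assumes plain least squares; the paper's optimization may differ.
    """
    # Coefficients c minimizing ||query - basis @ c||_2
    coeffs, *_ = np.linalg.lstsq(basis, query, rcond=None)
    closest = basis @ coeffs  # query-adaptive closest point in the subspace
    return float(np.linalg.norm(query - closest)), closest

# Toy example: the subspace is the x-y plane in 3D.
basis = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
query = np.array([2.0, -1.0, 3.0])
dist, closest = point_to_subspace_distance(query, basis)
print(dist, closest)
```

Because the closest point is recomputed for every query, two different sketches compared against the same shape generally select different points in its subspace, which is the query-adaptive behaviour the abstract refers to.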
View subspaces for indexing and retrieval of 3D models
View-based indexing schemes for 3D object retrieval are gaining popularity
since they provide good retrieval results. These schemes are coherent with the
theory that humans recognize objects based on their 2D appearances. The
view-based techniques also allow users to search with various queries such as
binary images, range images and even 2D sketches. The previous view-based
techniques use classical 2D shape descriptors such as Fourier invariants,
Zernike moments, Scale Invariant Feature Transform-based local features and 2D
Digital Fourier Transform coefficients. These methods describe each object
independently of the others. In this work, we explore data-driven subspace models,
such as Principal Component Analysis, Independent Component Analysis and
Nonnegative Matrix Factorization to describe the shape information of the
views. We treat the depth images obtained from various points of the view
sphere as 2D intensity images and train a subspace to extract the inherent
structure of the views within a database. We also show the benefit of
categorizing shapes according to their eigenvalue spread. Both the shape
categorization and data-driven feature set conjectures are tested on the PSB
database and compared with competing view-based 3D shape retrieval
algorithms.
Comment: Three-Dimensional Image Processing (3DIP) and Applications
(Proceedings Volume) Proceedings of SPIE Volume: 7526 Editor(s): Atilla M.
Baskurt ISBN: 9780819479198 Date: 2 February 201
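To make the data-driven subspace idea in the abstract above concrete, here is a minimal sketch of the PCA variant only: depth views are flattened into intensity vectors, a subspace is fitted via SVD, and views are ranked by distance in the projected feature space. The image size, number of components, random stand-in data, and function names are all assumptions for illustration; the ICA and NMF variants and the eigenvalue-spread categorization are not shown.

```python
import numpy as np

def fit_pca_subspace(views, n_components):
    """Fit a PCA subspace to a stack of depth images of shape (N, H, W)."""
    X = views.reshape(len(views), -1).astype(float)  # one row per view
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(views, mean, components):
    """Project depth images onto the fitted subspace."""
    X = views.reshape(len(views), -1).astype(float)
    return (X - mean) @ components.T

# Random stand-in data: 20 views of 8x8 pixels (not real depth images).
rng = np.random.default_rng(0)
depth_views = rng.random((20, 8, 8))

mean, components = fit_pca_subspace(depth_views, n_components=5)
features = project(depth_views, mean, components)

# Rank all database views by Euclidean distance to the first view's feature.
query_feature = features[0]
ranking = np.argsort(np.linalg.norm(features - query_feature, axis=1))
```

Unlike hand-crafted descriptors computed per object, the basis here depends on the whole database, which is the sense in which the description is data-driven.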
DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image
3D reconstruction from a single image is a key problem in multiple
applications ranging from robotic manipulation to augmented reality. Prior
methods have tackled this problem through generative models which predict 3D
reconstructions as voxels or point clouds. However, these methods can be
computationally expensive and miss fine details. We introduce a new
differentiable layer for 3D data deformation and use it in DeformNet to learn a
model for 3D reconstruction-through-deformation. DeformNet takes an image
input, retrieves the nearest shape template from a database, and deforms the
template to match the query image. We evaluate our approach on the ShapeNet
dataset and show that (a) the Free-Form Deformation layer is a powerful new
building block for Deep Learning models that manipulate 3D data; (b) DeformNet
uses this FFD layer combined with shape retrieval for smooth and
detail-preserving 3D reconstruction of qualitatively plausible point clouds
with respect to a single query image; and (c) compared to other state-of-the-art 3D
reconstruction methods, DeformNet quantitatively matches or outperforms their
benchmarks by significant margins. For more information, visit:
https://deformnet-site.github.io/DeformNet-website/
Comment: 11 pages, 9 figures, NIP
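The abstract above does not spell out how the Free-Form Deformation layer is computed. The sketch below shows the classical FFD formulation with a trivariate Bernstein basis over a control lattice, which is an assumption about what such a layer might evaluate, not DeformNet's actual implementation. The function names and lattice resolution are hypothetical. The output is linear in the control points, which is what makes this kind of operation straightforward to differentiate.

```python
import numpy as np
from math import comb

def bernstein(n, t):
    """Bernstein basis values B_i^n(t) for i = 0..n; returns shape (N, n + 1)."""
    i = np.arange(n + 1)
    coef = np.array([comb(n, k) for k in i], dtype=float)
    t = t[:, None]
    return coef * t**i * (1.0 - t) ** (n - i)

def ffd(points, control):
    """Classical free-form deformation.

    points:  (N, 3) coordinates inside the unit cube [0, 1]^3
    control: (l + 1, m + 1, n + 1, 3) control-point positions
    """
    l, m, n = (s - 1 for s in control.shape[:3])
    bs = bernstein(l, points[:, 0])
    bt = bernstein(m, points[:, 1])
    bu = bernstein(n, points[:, 2])
    # Weighted sum of control points, one tensor-product weight per point.
    return np.einsum("pi,pj,pk,ijkc->pc", bs, bt, bu, control)

def regular_lattice(l, m, n):
    """Undeformed control lattice spanning the unit cube."""
    axes = [np.linspace(0.0, 1.0, d + 1) for d in (l, m, n)]
    return np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)

rng = np.random.default_rng(0)
template = rng.random((100, 3))        # stand-in for a template point cloud
lattice = regular_lattice(3, 3, 3)

unchanged = ffd(template, lattice)     # regular lattice leaves points in place
offset = np.array([0.1, 0.0, 0.0])
shifted = ffd(template, lattice + offset)  # moving every control point translates the cloud
```

In a learned setting, a network would predict displacements of the control points and the deformed template would be compared against the target shape; that part is outside the scope of this sketch.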
Towards an All-Purpose Content-Based Multimedia Information Retrieval System
The growth of multimedia collections - in terms of size, heterogeneity, and
variety of media types - necessitates systems that are able to conjointly deal
with several forms of media, especially when it comes to searching for
particular objects. However, existing retrieval systems are organized in silos
and treat different media types separately. As a consequence, retrieval across
media types is either not supported at all or subject to major limitations. In
this paper, we present vitrivr, a content-based multimedia information
retrieval stack. As opposed to the keyword search approach implemented by most
media management systems, vitrivr makes direct use of the object's content to
facilitate different types of similarity search, such as Query-by-Example or
Query-by-Sketch, for and, most importantly, across different media types -
namely, images, audio, videos, and 3D models. Furthermore, we introduce a new
web-based user interface that enables easy-to-use, multimodal retrieval from
and browsing in mixed media collections. The effectiveness of vitrivr is shown
on the basis of a user study that involves different query and media types. To
the best of our knowledge, the full vitrivr stack is unique in that it is the
first multimedia retrieval system that seamlessly integrates support for four
different types of media. As such, it paves the way towards an all-purpose,
content-based multimedia information retrieval system.