Digital Library Services for Three-Dimensional Models
With the growth in computing, storage and networking infrastructure, it is becoming increasingly feasible for multimedia professionals—such as graphic designers in commercial, manufacturing, scientific and entertainment areas—to work with 3D digital models of the objects in their domain. Unfortunately, most of these models exist in isolated repositories and are not accessible to the geographically distributed professionals who need them.
Building an efficient digital library system presents a number of challenges. In particular, the following issues need to be addressed: (1) What is the best way of representing 3D models in a digital library so that searches can be done faster? (2) How can we compress and deliver 3D models to reduce storage and bandwidth requirements? (3) How can we represent the user's view of similarity between two objects? (4) What search types can enhance the usability of the digital library, how can we implement them, and what are the trade-offs?
In this research, we have developed a digital library architecture for 3D models that addresses the above issues as well as other technical issues. We have developed a prototype for our 3D digital library (3DLIB) that supports compressed storage, along with retrieval of 3D models. The prototype also supports search and discovery services that are targeted for 3-D models. The key to 3DLIB is a representation of a 3D model that is based on “surface signatures”. This representation captures the shape information of any free-form surface and encodes it into a set of 2D images. We have developed a shape similarity search technique that uses the signature images to compare 3D models. One advantage of the proposed technique is that it works in the compressed domain, thus it eliminates the need for uncompressing in content-based search. Moreover, we have developed an efficient discovery service consisting of a multi-level hierarchical browsing service that enables users to navigate large sets of 3D models. To implement this targeted browsing (find an object that is similar to a given object in a large collection through browsing) we abstract a large set of 3D models to a small set of representative models (key models). The abstraction is based on shape similarity and uses specially tailored clustering techniques. The browsing service applies clustering recursively to limit the number of key models that a user views at any time.
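The targeted-browsing idea above—recursively abstracting a large collection into a few "key models" by shape-similarity clustering—can be sketched as follows. This is an illustrative stand-in, not 3DLIB's actual algorithm: the 2D feature vectors, the Euclidean distance, and the parameters `k` and `leaf_size` are all invented for the example, whereas the paper clusters on signature-image similarity with specially tailored techniques.

```python
# Hedged sketch: recursive medoid-based clustering to pick "key models"
# for hierarchical browsing. Feature vectors and distances are toy stand-ins.

def dist(a, b):
    # Euclidean distance between two feature vectors (illustrative metric)
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def k_medoids(items, k, iters=10):
    """Tiny k-medoids: returns (medoid indices, cluster assignment)."""
    medoids = list(range(k))  # deterministic init: first k items
    for _ in range(iters):
        # assign each item to its nearest medoid
        assign = [min(range(k), key=lambda m: dist(items[i], items[medoids[m]]))
                  for i in range(len(items))]
        # move each medoid to the member minimizing total intra-cluster distance
        for m in range(k):
            members = [i for i, a in enumerate(assign) if a == m]
            if members:
                medoids[m] = min(members, key=lambda c: sum(
                    dist(items[c], items[i]) for i in members))
    return medoids, assign

def browse_tree(items, ids, k=2, leaf_size=2):
    """Recursively abstract a collection into key models (medoids),
    so a user only ever views a handful of representatives per level."""
    if len(ids) <= leaf_size:
        return {"keys": ids, "children": []}
    medoids, assign = k_medoids([items[i] for i in ids], k)
    children = []
    for m in range(k):
        sub = [ids[i] for i, a in enumerate(assign) if a == m]
        children.append(browse_tree(items, sub, k, leaf_size))
    return {"keys": [ids[m] for m in medoids], "children": children}

# Six toy "models": two well-separated shape clusters
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2), (5.0, 5.2)]
tree = browse_tree(features, list(range(len(features))))
```

At the root the user sees only the key models (one per cluster); drilling into a key model shows the next level of representatives, which matches the recursive abstraction described above.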
We have evaluated the performance of our digital library services using the Princeton Shape Benchmark (PSB); the results show significantly better precision and recall than other approaches.
Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning
In this work, we tackle the problem of single image-based 3D shape retrieval (IBSR), where we seek to find the most matched shape of a given single 2D image from a shape repository. Most of the existing works learn to embed 2D images and 3D shapes into a common feature space and perform metric learning using a triplet loss. Inspired by the great success of recent contrastive learning works on self-supervised representation learning, we propose a novel IBSR pipeline leveraging contrastive learning. We note that adopting such cross-modal contrastive learning between 2D images and 3D shapes for IBSR tasks is non-trivial and challenging: contrastive learning requires very strong data augmentation in constructed positive pairs to learn the feature invariance, whereas traditional metric learning works do not have this requirement. Moreover, object shape and appearance are entangled in 2D query images, making the learning task more difficult than contrasting single-modal data. To mitigate these challenges, we propose to use multi-view grayscale rendered images from the 3D shapes as a shape representation. We then introduce a strong data augmentation technique based on color transfer, which can significantly but naturally change the appearance of the query image, effectively satisfying the needs of contrastive learning. Finally, we propose to incorporate a novel category-level contrastive loss that helps distinguish similar objects from different categories, in addition to the classic instance-level contrastive loss. Our experiments demonstrate that our approach achieves the best performance on all three popular IBSR benchmarks, including Pix3D, Stanford Cars, and CompCars, outperforming the previous state of the art by 4% to 15% in retrieval accuracy.
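The two losses described above—an instance-level cross-modal contrastive term (image i's positive is shape i) plus a category-level term (every shape sharing image i's label is a positive)—can be sketched in plain Python. This is a minimal InfoNCE-style illustration, not the paper's exact formulation: the embeddings, temperature `tau`, and function names are assumptions.

```python
import math

def l2_normalize(v):
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def info_nce(img, shp, tau=0.1):
    """Instance-level cross-modal loss: image i's only positive is shape i."""
    img = [l2_normalize(v) for v in img]
    shp = [l2_normalize(v) for v in shp]
    loss = 0.0
    for i in range(len(img)):
        # cosine similarities of image i against every shape, scaled by tau
        logits = [sum(a * b for a, b in zip(img[i], s)) / tau for s in shp]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]  # -log softmax at the positive
    return loss / len(img)

def category_nce(img, shp, labels, tau=0.1):
    """Category-level loss: all shapes sharing image i's label are positives."""
    img = [l2_normalize(v) for v in img]
    shp = [l2_normalize(v) for v in shp]
    loss = 0.0
    for i in range(len(img)):
        logits = [sum(a * b for a, b in zip(img[i], s)) / tau for s in shp]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        pos = [j for j, y in enumerate(labels) if y == labels[i]]
        loss += sum(log_z - logits[j] for j in pos) / len(pos)
    return loss / len(img)

# Aligned image/shape embeddings give a low instance loss; a swapped
# (mismatched) pairing gives a high one.
good = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
bad = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

A training objective along the paper's lines would combine the two, e.g. `info_nce(...) + lam * category_nce(...)` for some assumed weight `lam`.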
3D ShapeNets: A Deep Representation for Volumetric Shapes
3D shape is a crucial but heavily underutilized cue in today's computer
vision systems, mostly due to the lack of a good generic shape representation.
With the recent availability of inexpensive 2.5D depth sensors (e.g. Microsoft
Kinect), it is becoming increasingly important to have a powerful 3D shape
representation in the loop. Apart from category recognition, recovering full 3D
shapes from view-based 2.5D depth maps is also a critical part of visual
understanding. To this end, we propose to represent a geometric 3D shape as a
probability distribution of binary variables on a 3D voxel grid, using a
Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the
distribution of complex 3D shapes across different object categories and
arbitrary poses from raw CAD data, and discovers hierarchical compositional
part representations automatically. It naturally supports joint object
recognition and shape completion from 2.5D depth maps, and it enables active
object recognition through view planning. To train our 3D deep learning model,
we construct ModelNet -- a large-scale 3D CAD model dataset. Extensive
experiments show that our 3D deep representation enables significant
performance improvement over the-state-of-the-arts in a variety of tasks.Comment: to be appeared in CVPR 201
Learning a Disentangled Embedding for Monocular 3D Shape Retrieval and Pose Estimation
We propose a novel approach to jointly perform 3D shape retrieval and pose
estimation from monocular images. In order to make the method robust to
real-world image variations, e.g. complex textures and backgrounds, we learn an
embedding space from 3D data that only includes the relevant information,
namely the shape and pose. Our approach explicitly disentangles a shape vector
and a pose vector, which alleviates both pose bias for 3D shape retrieval and
categorical bias for pose estimation. We then train a CNN to map the images to
this embedding space, and then retrieve the closest 3D shape from the database
and estimate the 6D pose of the object. Our method achieves a median error of
10.3 for pose estimation and a top-1 accuracy of 0.592 for category-agnostic 3D
object retrieval on the Pascal3D+ dataset, outperforming the previous
state-of-the-art methods on both tasks.
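The retrieval step described above—mapping an image into the embedding space, then returning the nearest database shape—can be sketched as a cosine nearest-neighbour lookup. The CNN is stood in for by a precomputed query vector, and the database entries and their ids are invented for the example; only the nearest-neighbour logic reflects the abstract.

```python
import math

def cosine(a, b):
    # cosine similarity between two embedding vectors
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)) or 1.0
    return num / den

def retrieve(query_shape_vec, database):
    """Return the id of the database shape closest to the predicted shape vector.
    `database` maps shape ids to their precomputed shape embeddings."""
    return max(database, key=lambda sid: cosine(query_shape_vec, database[sid]))

# Hypothetical database of shape embeddings; "chair"/"car" are made-up ids.
database = {"chair": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0]}
nearest = retrieve([0.9, 0.1, 0.0], database)
```

Because the embedding disentangles shape from pose, the same lookup can use only the shape vector for retrieval while a separate pose vector feeds the 6D pose estimate.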
Prototyping Information Visualization in 3D City Models: a Model-based Approach
When creating 3D city models, selecting relevant visualization techniques is
a particularly difficult user interface design task. A first obstacle is that
current geodata-oriented tools, e.g. ArcGIS, have limited 3D capabilities and
limited sets of visualization techniques. Another important obstacle is the
lack of unified description of information visualization techniques for 3D city
models. While many techniques have been devised for different types of data or
information (wind flows, air quality fields, historic or legal texts, etc.),
they are generally described in individual articles and not really formalized. In this
paper we address the problem of visualizing information in (rich) 3D city
models by presenting a model-based approach for the rapid prototyping of
visualization techniques. We propose to represent visualization techniques as
the composition of graph transformations. We show that these transformations
can be specified with SPARQL construction operations over RDF graphs. These
specifications can then be used in a prototype generator to produce 3D scenes
that contain the 3D city model augmented with data represented using the
desired technique.
Comment: Proc. of 3DGeoInfo 2014 Conference, Dubai, November 201
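The composition-of-graph-transformations idea can be illustrated with a toy Python stand-in for a SPARQL CONSTRUCT pipeline: each step pattern-matches triples in an RDF-like graph and adds derived triples, and steps chain to build up a scene description. The predicates, values, and derivation rules below are entirely invented; the paper's actual specifications are SPARQL construction operations over real RDF graphs.

```python
# Hedged sketch: a visualization technique as a composition of graph
# transformations, each mimicking one SPARQL CONSTRUCT step over a set of
# (subject, predicate, object) triples. All predicates/values are hypothetical.

def construct(triples, pattern_pred, make_triple):
    """For every (s, pattern_pred, o) match, add make_triple(s, o) to the graph."""
    return triples | {make_triple(s, o) for s, p, o in triples if p == pattern_pred}

# Toy city-model graph: two buildings with air-quality measurements
city = {
    ("building1", "airQuality", 80),
    ("building2", "airQuality", 20),
}

# Step 1: derive a colour from the air-quality value
step1 = construct(city, "airQuality",
                  lambda s, o: (s, "colour", "red" if o > 50 else "green"))

# Step 2: derive a 3D-scene directive from the colour
scene = construct(step1, "colour",
                  lambda s, o: (s, "renderAs", "extrusion:" + o))
```

Swapping in a different step 1 (say, mapping wind flow to arrow glyphs) reuses the rest of the pipeline unchanged, which is the rapid-prototyping benefit the abstract claims.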