123,946 research outputs found

    Digital Library Services for Three-Dimensional Models

    Get PDF
    With the growth in computing, storage, and networking infrastructure, it is becoming increasingly feasible for multimedia professionals, such as graphic designers in commercial, manufacturing, scientific, and entertainment areas, to work with 3D digital models of the objects in their domain. Unfortunately, most of these models exist in individual repositories and are not accessible to geographically distributed professionals who need them. Building an efficient digital library system presents a number of challenges. In particular, the following issues need to be addressed: (1) What is the best way to represent 3D models in a digital library so that searches can be performed quickly? (2) How can the 3D models be compressed and delivered to reduce storage and bandwidth requirements? (3) How can we represent the user's view of similarity between two objects? (4) What search types can be used to enhance the usability of the digital library, how can we implement them, and what are the trade-offs? In this research, we have developed a digital library architecture for 3D models that addresses these issues as well as other technical issues. We have developed a prototype for our 3D digital library (3DLIB) that supports compressed storage along with retrieval of 3D models. The prototype also supports search and discovery services targeted at 3D models. The key to 3DLIB is a representation of a 3D model based on “surface signatures”. This representation captures the shape information of any free-form surface and encodes it into a set of 2D images. We have developed a shape similarity search technique that uses the signature images to compare 3D models. One advantage of the proposed technique is that it works in the compressed domain, eliminating the need for decompression in content-based search. Moreover, we have developed an efficient discovery service consisting of a multi-level hierarchical browsing service that enables users to navigate large sets of 3D models. To implement this targeted browsing (finding an object similar to a given object in a large collection through browsing), we abstract a large set of 3D models to a small set of representative models (key models). The abstraction is based on shape similarity and uses specially tailored clustering techniques; the browsing service applies clustering recursively to limit the number of key models a user views at any time, as in the sketch below. We have evaluated the performance of our digital library services on the Princeton Shape Benchmark (PSB), where they show significantly better precision and recall than other approaches.
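    The abstract does not spell out the clustering procedure, so the following is a minimal sketch of the recursive key-model abstraction: it assumes each model has already been reduced to a fixed-length descriptor derived from its signature images, and it substitutes scikit-learn's k-means for the paper's specially tailored clustering.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_browsing_tree(features, ids, k=5, min_size=10):
    """Recursively cluster models and pick one key model per cluster.

    features: (n, d) NumPy array of per-model signature descriptors
              (assumed precomputed from the signature images).
    ids:      list of n model identifiers.
    Returns a list of {"key": id, "children": [...]} nodes.
    """
    if len(ids) <= min_size:
        # Small enough to show directly: every model is its own leaf.
        return [{"key": i, "children": []} for i in ids]
    km = KMeans(n_clusters=min(k, len(ids)), n_init=10).fit(features)
    nodes = []
    for c in range(km.n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        # The key model is the member closest to the cluster centroid.
        dist = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        key = ids[members[np.argmin(dist)]]
        children = build_browsing_tree(features[members],
                                       [ids[m] for m in members],
                                       k=k, min_size=min_size)
        nodes.append({"key": key, "children": children})
    return nodes
```

    A user browsing the collection would first see only the top-level key models, then descend into whichever cluster looks closest to the target object.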

    Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning

    Get PDF
    In this work, we tackle the problem of single image-based 3D shape retrieval (IBSR), where we seek to find the shape in a repository that best matches a given single 2D image. Most existing works learn to embed 2D images and 3D shapes into a common feature space and perform metric learning using a triplet loss. Inspired by the recent success of contrastive learning in self-supervised representation learning, we propose a novel IBSR pipeline that leverages contrastive learning. We note that adopting such cross-modal contrastive learning between 2D images and 3D shapes for IBSR tasks is non-trivial and challenging: contrastive learning requires very strong data augmentation of the constructed positive pairs to learn feature invariance, whereas traditional metric learning works have no such requirement. Moreover, object shape and appearance are entangled in 2D query images, making the learning task more difficult than contrasting single-modal data. To mitigate these challenges, we propose to use multi-view grayscale rendered images of the 3D shapes as the shape representation. We then introduce a strong data augmentation technique based on color transfer, which can significantly yet naturally change the appearance of the query image, effectively satisfying the needs of contrastive learning. Finally, we incorporate a novel category-level contrastive loss, which helps distinguish similar objects from different categories, in addition to the classic instance-level contrastive loss. Our experiments demonstrate that our approach achieves the best performance on all three popular IBSR benchmarks, including Pix3D, Stanford Cars, and CompCars, outperforming the previous state of the art by 4% to 15% in retrieval accuracy.
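    The paper's exact loss formulation is not given in the abstract; the PyTorch sketch below illustrates the general idea of pairing an instance-level cross-modal contrastive term with a category-level term. The batch layout, the L2-normalized embeddings, and the equal weighting of the two terms are all assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(img_emb, shape_emb, labels, tau=0.07):
    """Instance-level plus category-level contrastive loss (sketch).

    img_emb, shape_emb: (B, d) L2-normalized embeddings of paired query
                        images and multi-view renderings of 3D shapes.
    labels:             (B,) category label of each pair.
    """
    logits = img_emb @ shape_emb.t() / tau                 # (B, B) similarities
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    # Instance level: each image should match its own paired shape.
    inst = F.cross_entropy(logits, targets)
    # Category level: shapes from the same category also count as positives.
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()  # (B, B)
    log_prob = F.log_softmax(logits, dim=1)
    cat = -(log_prob * pos).sum(dim=1) / pos.sum(dim=1)
    return inst + cat.mean()
```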

    3D ShapeNets: A Deep Representation for Volumetric Shapes

    Full text link
    3D shape is a crucial but heavily underutilized cue in today's computer vision systems, mostly due to the lack of a good generic shape representation. With the recent availability of inexpensive 2.5D depth sensors (e.g. Microsoft Kinect), it is becoming increasingly important to have a powerful 3D shape representation in the loop. Apart from category recognition, recovering full 3D shapes from view-based 2.5D depth maps is also a critical part of visual understanding. To this end, we propose to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representations automatically. It naturally supports joint object recognition and shape completion from 2.5D depth maps, and it enables active object recognition through view planning. To train our 3D deep learning model, we construct ModelNet, a large-scale 3D CAD model dataset. Extensive experiments show that our 3D deep representation enables significant performance improvements over the state of the art in a variety of tasks. (Comment: to appear in CVPR 2015.)
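    The Convolutional Deep Belief Network itself is too involved to reproduce here, but the input representation is easy to illustrate. The sketch below converts surface samples of a mesh into a binary occupancy grid of the kind a volumetric model such as 3D ShapeNets consumes; the 30-voxel resolution matches the paper, while the normalization and padding details are assumptions.

```python
import numpy as np

def voxelize(points, resolution=30, pad=2):
    """Turn surface samples of a shape into a binary voxel occupancy grid.

    points: (n, 3) array of points sampled from the mesh surface,
            in arbitrary model coordinates.
    Returns a (resolution, resolution, resolution) uint8 grid where a
    voxel is 1 if any sample falls inside it.
    """
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    # Scale the shape uniformly into the padded interior of the grid.
    lo, hi = points.min(axis=0), points.max(axis=0)
    scale = (resolution - 2 * pad - 1) / (hi - lo).max()
    idx = ((points - lo) * scale + pad).astype(int)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid
```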

    Learning a Disentangled Embedding for Monocular 3D Shape Retrieval and Pose Estimation

    Full text link
    We propose a novel approach to jointly perform 3D shape retrieval and pose estimation from monocular images. In order to make the method robust to real-world image variations, e.g. complex textures and backgrounds, we learn an embedding space from 3D data that includes only the relevant information, namely the shape and pose. Our approach explicitly disentangles a shape vector and a pose vector, which alleviates both the pose bias in 3D shape retrieval and the categorical bias in pose estimation. We then train a CNN to map images into this embedding space, retrieve the closest 3D shape from the database, and estimate the 6D pose of the object. Our method achieves a median error of 10.3 for pose estimation and a top-1 accuracy of 0.592 for category-agnostic 3D object retrieval on the Pascal3D+ dataset, outperforming previous state-of-the-art methods on both tasks.
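    The abstract implies a simple retrieval step once the embedding is learned. As a heavily hedged illustration, the sketch below splits a disentangled embedding into its shape and pose parts and retrieves by nearest neighbor on the shape part alone; the split sizes, the cosine metric, and all names are assumptions, not the paper's specification.

```python
import numpy as np

def retrieve_shape(embedding, db_shape_vecs, db_ids, d_shape):
    """Retrieve the nearest 3D shape using only the shape part of a
    disentangled image embedding.

    embedding:     (d_shape + d_pose,) vector produced by the image CNN.
    db_shape_vecs: (N, d_shape) shape vectors of the 3D model database.
    db_ids:        list of N model identifiers.
    """
    shape_vec, pose_vec = embedding[:d_shape], embedding[d_shape:]
    # Cosine similarity between the query shape vector and the database.
    sims = db_shape_vecs @ shape_vec / (
        np.linalg.norm(db_shape_vecs, axis=1) * np.linalg.norm(shape_vec))
    best = int(np.argmax(sims))
    # pose_vec would feed a separate pose regressor for the 6D estimate.
    return db_ids[best], pose_vec
```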

    Prototyping Information Visualization in 3D City Models: a Model-based Approach

    Full text link
    When creating 3D city models, selecting relevant visualization techniques is a particularly difficult user interface design task. A first obstacle is that current geodata-oriented tools, e.g. ArcGIS, have limited 3D capabilities and limited sets of visualization techniques. Another important obstacle is the lack of a unified description of information visualization techniques for 3D city models. Although many techniques have been devised for different types of data or information (wind flows, air quality fields, historic or legal texts, etc.), they are generally described in individual articles and not really formalized. In this paper we address the problem of visualizing information in (rich) 3D city models by presenting a model-based approach for the rapid prototyping of visualization techniques. We propose to represent visualization techniques as the composition of graph transformations. We show that these transformations can be specified with SPARQL construction operations over RDF graphs. These specifications can then be used in a prototype generator to produce 3D scenes that contain the 3D city model augmented with data represented using the desired technique. (Comment: Proc. of the 3DGeoInfo 2014 Conference, Dubai, November 2014.)
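    The paper specifies transformations as SPARQL construction operations over RDF graphs; the Python sketch below runs a CONSTRUCT query with rdflib to derive a visualization graph from a city-model graph. The vocabulary (city:, viz:, the airQualityIndex property) and the input file name are invented for illustration and do not come from the paper.

```python
from rdflib import Graph

# Hypothetical vocabulary: the paper's actual RDF schema differs.
CONSTRUCT_QUERY = """
PREFIX city: <http://example.org/citygml#>
PREFIX viz:  <http://example.org/viz#>
CONSTRUCT {
    ?building viz:hasGlyph viz:ColorScale ;
              viz:colorValue ?aq .
}
WHERE {
    ?building a city:Building ;
              city:airQualityIndex ?aq .
}
"""

g = Graph()
g.parse("city_model.ttl", format="turtle")   # RDF view of the 3D city model
derived = g.query(CONSTRUCT_QUERY).graph     # buildings mapped to color glyphs
print(derived.serialize(format="turtle"))
```

    A prototype generator would then read the derived graph and instantiate the corresponding 3D scene elements.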