10 research outputs found

    Interactive 3D Modeling with a Generative Adversarial Network

    Full text link
    This paper proposes the idea of using a generative adversarial network (GAN) to assist a novice user in designing real-world shapes with a simple interface. The user edits a voxel grid with a painting interface (like Minecraft). Yet, at any time, the user can execute a SNAP command, which projects the current voxel grid onto a latent shape manifold with a learned projection operator and then generates a similar, but more realistic, shape using a learned generator network. The user can then edit the resulting shape and snap again until satisfied with the result. The main advantage of this approach is that the projection and generation operators help novice users create 3D models characteristic of a background distribution of object shapes, without having to specify all the details. The core new research idea is to use a GAN to support this application. 3D GANs have previously been used for shape generation, interpolation, and completion, but never for interactive modeling. The new challenge for this application is to learn a projection operator that takes an arbitrary 3D voxel model and produces a latent vector on the shape manifold from which a similar and realistic shape can be generated. We develop algorithms for this and other steps of the SNAP processing pipeline and integrate them into a simple modeling tool. Experiments with these algorithms and the tool suggest that GANs provide a promising approach to computer-assisted interactive modeling. Comment: Published at International Conference on 3D Vision 2017 (http://irc.cs.sdu.edu.cn/3dv/index.html).
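
    As a rough illustration of the SNAP step described above, the sketch below pairs a learned projection network with a GAN generator over a 64x64x64 voxel grid. All architectural details (layer shapes, latent dimension, grid resolution) are our own illustrative assumptions, not the paper's actual networks.

    import torch
    import torch.nn as nn

    # Minimal sketch of the SNAP operation: the projector maps an edited
    # voxel grid onto the latent shape manifold; the generator regenerates
    # a realistic shape from that latent vector. Layer choices are assumed.

    class Projector(nn.Module):
        def __init__(self, latent_dim=200):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv3d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv3d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Flatten(),
                nn.Linear(128 * 8 * 8 * 8, latent_dim),
            )

        def forward(self, voxels):          # voxels: (B, 1, 64, 64, 64)
            return self.net(voxels)         # latent: (B, latent_dim)

    class Generator(nn.Module):
        def __init__(self, latent_dim=200):
            super().__init__()
            self.fc = nn.Linear(latent_dim, 128 * 8 * 8 * 8)
            self.net = nn.Sequential(
                nn.ConvTranspose3d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose3d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
            )

        def forward(self, z):
            x = self.fc(z).view(-1, 128, 8, 8, 8)
            return self.net(x)              # occupancy probabilities in [0, 1]

    def snap(voxels, projector, generator, threshold=0.5):
        """Project the user's edited grid to the manifold, regenerate, binarize."""
        with torch.no_grad():
            return (generator(projector(voxels)) > threshold).float()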

    Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets

    Full text link
    When creating 3D content, highly specialized skills are generally needed to design and generate models of objects and other assets by hand. We address this problem through high-quality 3D asset retrieval from multi-modal inputs, including 2D sketches, images and text. We use CLIP as it provides a bridge to higher-level latent features. We use these features to perform a multi-modality fusion to address the lack of artistic control that affects common data-driven approaches. Our approach allows for multi-modal conditional feature-driven retrieval through a 3D asset database, by utilizing a combination of input latent embeddings. We explore the effects of different combinations of feature embeddings across different input types and weighting methods.
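
    As a minimal sketch of what the fusion and retrieval could look like, the snippet below combines pre-extracted CLIP feature vectors from several modalities with user-chosen weights, then ranks a 3D asset database by cosine similarity. The per-modality weights and the database featurization (e.g. CLIP features averaged over rendered views of each asset) are assumptions on our part, not the paper's exact pipeline.

    import numpy as np

    def normalize(v):
        """L2-normalize so cosine similarity reduces to a dot product."""
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    def fuse_queries(embeddings, weights):
        """Weighted fusion of CLIP embeddings from different modalities
        (e.g. one each for a sketch, an image and a text prompt)."""
        w = np.asarray(weights, dtype=np.float32)
        w = w / w.sum()
        fused = sum(wi * normalize(e) for wi, e in zip(w, embeddings))
        return normalize(fused)

    def retrieve(query, asset_embeddings, k=5):
        """Indices of the k most similar assets; asset_embeddings is an
        (N, d) matrix of precomputed CLIP features for the 3D database."""
        sims = normalize(asset_embeddings) @ query
        return np.argsort(-sims)[:k]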

    Quasi Spin Images

    Get PDF
    The increasing adoption of 3D capturing equipment, now also found in mobile devices, means that 3D content is increasingly prevalent. Common operations on such data, including 3D object recognition and retrieval, are based on the measurement of similarity between 3D objects. A common way to measure object similarity is through local shape descriptors, which aim to do part-to-part matching by describing portions of an object's shape. The Spin Image is one of the local descriptors most suitable for use in scenes with high degrees of clutter and occlusion, but its practical use has been hampered by high computational demands. The rise in processing power of the GPU represents an opportunity to significantly improve the generation and comparison performance of descriptors, such as the Spin Image, thereby increasing the practical applicability of methods making use of it. In this paper we introduce a GPU-based Quasi Spin Image (QSI) algorithm, a variation of the original Spin Image, and show that a speedup of an order of magnitude relative to a reference CPU implementation can be achieved in terms of the image generation rate. In addition, the QSI is noise-free, can be computed consistently, and a preliminary evaluation shows it correlates well with the original Spin Image.
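
    For reference, the snippet below is a compact CPU sketch of the classic Spin Image accumulation that the QSI varies: each neighbour of an oriented point is mapped to cylindrical (alpha, beta) coordinates and binned into a 2D histogram. Bin count and support size are arbitrary here; this illustrates the original descriptor, not the GPU QSI algorithm itself.

    import numpy as np

    def spin_image(p, n, points, bins=8, support=1.0):
        """Classic Spin Image for an oriented point (p, n).

        p:      (3,) basis point on the surface
        n:      (3,) unit surface normal at p
        points: (N, 3) surface points near p
        """
        d = points - p
        beta = d @ n                      # signed distance along the normal
        alpha = np.sqrt(np.maximum((d * d).sum(axis=1) - beta**2, 0.0))
        # Keep only points inside the cylindrical support region.
        mask = (alpha <= support) & (np.abs(beta) <= support)
        # The 2D histogram over (alpha, beta) is the spin image.
        img, _, _ = np.histogram2d(
            alpha[mask], beta[mask],
            bins=bins, range=[[0.0, support], [-support, support]],
        )
        return img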

    Advances in 3D Generation: A Survey

    Full text link
    Generating 3D models lies at the core of computer graphics and has been the focus of decades of research. With the emergence of advanced neural representations and generative models, the field of 3D content generation is developing rapidly, enabling the creation of increasingly high-quality and diverse 3D models. The rapid growth of this field makes it difficult to stay abreast of all recent developments. In this survey, we aim to introduce the fundamental methodologies of 3D generation methods and establish a structured roadmap, encompassing 3D representation, generation methods, datasets, and corresponding applications. Specifically, we introduce the 3D representations that serve as the backbone for 3D generation. Furthermore, we provide a comprehensive overview of the rapidly growing literature on generation methods, categorized by algorithmic paradigm, including feedforward generation, optimization-based generation, procedural generation, and generative novel view synthesis. Lastly, we discuss available datasets, applications, and open challenges. We hope this survey will help readers explore this exciting topic and foster further advancements in the field of 3D content generation. Comment: 33 pages, 12 figures.

    Parametric Procedural Models for 3D Object Retrieval, Classification and Parameterization

    Get PDF
    The number of available 3D objects has grown over the last decades, and we can expect it to grow much further in the future. 3D objects are also becoming more and more accessible to non-expert users. The growing amount of available 3D data is welcome for everyone working with this type of data, as the creation and acquisition of many 3D objects is still costly. However, the vast majority of available 3D objects are present only as pure polygon meshes. We arguably cannot assume that meta-data and additional semantics will be delivered together with 3D objects stemming from non-experts or from automatic 3D scans of real objects. For this reason, content-based retrieval and classification techniques for 3D objects have been developed. Many systems address the completely unsupervised case. However, previous work has shown that the performance of these tasks can be increased considerably by using any type of prior knowledge.

    In this thesis I use procedural models as prior knowledge. Procedural models describe the construction process of a 3D object instead of explicitly describing the components of the surface. These models can include parameters in the construction process to generate variations of the resulting 3D object. Procedural representations are present in many domains, as these implicit representations are vastly superior to explicit representations in terms of content generation, flexibility and reusability. Therefore, using a procedural representation always has the potential of outclassing other approaches in many aspects. The usage of procedural models in 3D object retrieval and classification is not highly researched, as this powerful representation can be arbitrarily complex to create and handle. In the 3D object domain, procedural models are mostly used for highly regularized structures like buildings and trees. However, procedural models can greatly improve 3D object retrieval and classification, as this representation offers a persistent and reusable full description of a type of object. This description can be used for queries and class definitions without any additional data. Furthermore, the initial classification can be improved further by using a procedural model: a procedural model allows an unknown object to be completely parameterized, and characteristics of different class members to be identified. The only drawback is that the manual design and creation of specialized procedural models is itself very costly.

    In this thesis I concentrate on the generalization and automation of procedural models for application in 3D object retrieval and 3D object classification. For the generalization and automation of procedural models I propose different levels of interaction to fulfill a user's possible needs for control and automation. This thesis presents new approaches for different levels of automation: the automatic generation of procedural models from a single exemplary 3D object; the semi-automatic creation of a procedural model with a sketch-based modeling tool; and the manual definition of a procedural model with a restricted variation space. The second important step is the insertion of parameters into the procedural model, to define the variations of the resulting 3D object. For this step I also propose several possibilities for the optimal level of control and automation: an automatic parameter detection technique; a semi-automatic deformation-based insertion; and an interface for manually inserting parameters by choosing one of the offered insertion principles. It is also possible to manually insert parameters into the procedures if the user needs full control at the lowest level.

    To enable the usage of procedural models directly in 3D object retrieval and classification, I propose descriptor-based and deep-learning-based approaches. Descriptors measure the difference between 3D objects. By using descriptors as the comparison algorithm, we can define the distance between procedural models and other objects and order these by similarity: the procedural models are sampled and compared to retrieve an optimal object retrieval list. We can also directly use procedural models as the data basis for retraining a convolutional neural network; by deep learning on a set of procedural models we can directly classify new unknown objects without any further large learning database. Additionally, I propose a new multi-layered parameter estimation approach using three different comparison measures to parameterize an unknown object. Hence, an unknown object is not only classified with a procedural model; the approach is also able to gather new information about the characteristics of the object by using the procedural model for its parameterization. As a result, the combination of procedural models with the tasks of 3D object retrieval and classification leads to the meta-concept of a holistically seamless system for defining, generating, comparing, identifying, retrieving, recombining, editing and reusing 3D objects. A toy sketch of the sample-and-compare idea follows below.
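
    To make the retrieval-by-procedural-model idea concrete, here is a toy sketch (our own, not from the thesis) in which a hand-written parametric "table" generator is sampled over a parameter grid, and a simple occupancy descriptor ranks the samples against an unknown object, thereby classifying and parameterizing it at the same time. The generator, descriptor and grid search stand in for the thesis' far more capable components.

    import numpy as np

    def procedural_table(leg_height, top_width, top_depth, n=2048):
        """Toy parametric procedural model: sample a table point cloud
        from three parameters (purely illustrative)."""
        rng = np.random.default_rng(0)
        top = rng.uniform([-top_width/2, -top_depth/2, leg_height],
                          [top_width/2, top_depth/2, leg_height + 0.05],
                          (n // 2, 3))
        legs = []
        for sx in (-1, 1):
            for sy in (-1, 1):
                c = np.array([sx * top_width/2 * 0.9,
                              sy * top_depth/2 * 0.9, 0.0])
                legs.append(c + rng.uniform([-0.03, -0.03, 0.0],
                                            [0.03, 0.03, leg_height],
                                            (n // 8, 3)))
        return np.vstack([top] + legs)

    def descriptor(points, bins=8):
        """Very simple global descriptor: normalized occupancy histogram
        over the bounding box (stands in for the thesis' descriptors)."""
        mins, maxs = points.min(0), points.max(0)
        idx = ((points - mins) / (maxs - mins + 1e-9) * bins)
        idx = idx.astype(int).clip(0, bins - 1)
        hist = np.zeros((bins, bins, bins))
        np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
        return (hist / hist.sum()).ravel()

    def estimate_parameters(query_points, param_grid):
        """Sample the procedural model over a parameter grid and keep the
        parameters whose descriptor best matches the unknown object."""
        q = descriptor(query_points)
        best, best_dist = None, np.inf
        for params in param_grid:
            dist = np.linalg.norm(descriptor(procedural_table(*params)) - q)
            if dist < best_dist:
                best, best_dist = params, dist
        return best

    # Usage: recover the parameters of an "unknown" object.
    unknown = procedural_table(0.75, 1.2, 0.8)
    grid = [(h, w, d) for h in (0.5, 0.75, 1.0)
                      for w in (0.8, 1.2, 1.6)
                      for d in (0.6, 0.8, 1.0)]
    print(estimate_parameters(unknown, grid))   # -> (0.75, 1.2, 0.8)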

    A Taxonomy of Freehand Grasping Patterns in Virtual Reality

    Get PDF
    Grasping is the most natural and primary interaction paradigm people perform every day; it allows us to pick up and manipulate the objects around us, whether drinking a cup of coffee or writing with a pen. Grasping has been explored extensively in real environments, in order to understand and structure the way people grasp and interact with objects, producing categories, models and theories of grasping. Due to the complexity of the human hand, classifying grasping knowledge to provide meaningful insights is a challenging task, which has led researchers to develop grasp taxonomies that systematically guide emerging grasping work (such as in anthropology, robotics and hand surgery). While this body of work exists for real grasping, the nuances of how grasping transfers to virtual environments are unexplored. The emerging development of robust hand-tracking sensors for virtual devices now allows the development of grasp models that enable VR to simulate real grasping interactions. However, present work has not yet explored the differences and nuances between virtual and real object grasping, which means that virtual systems that build grasping models on real grasping knowledge may rest on unverified assumptions about the way users intuitively grasp and interact with virtual objects.

    To address this, this thesis presents the first user elicitation studies to explore grasping patterns directly in VR. The first study presents the main similarities and differences between real and virtual object grasping; the second furthers this by exploring how virtual object shape influences grasping patterns; the third focuses on visual thermal cues and how they influence grasp metrics; and the fourth focuses on understanding other object characteristics, such as stability and complexity, and how they influence grasps in VR. To provide structured insights on grasping interactions in VR, the results are synthesized in the first VR Taxonomy of Grasp Types, developed following current methods for developing grasping and HCI taxonomies and re-iterated to present an updated and more complete taxonomy.

    Results show that users appear to mimic real grasping behaviour in VR, but also that users have issues estimating object size and generally use a lower variability of grasp types. The taxonomy shows that only five grasps account for the majority of grasp data in VR, which can be exploited by computer systems aiming to achieve natural and intuitive interactions at lower computational cost. Further, findings show that virtual object characteristics such as shape, stability and complexity, as well as visual cues for temperature, influence grasp metrics such as aperture, category, type, location and dimension. These changes in grasping patterns, together with virtual object categorisation methods, can be used to inform design decisions when developing intuitive interactions, virtual objects and environments, thereby taking a step forward towards natural grasping interaction in VR.

    SHREC’14 Track: Large Scale Comprehensive 3D Shape Retrieval

    No full text
    The objective of this track is to evaluate the performance of 3D shape retrieval approaches on a large-scale comprehensive 3D shape database that contains different types of models, such as generic, articulated, CAD and architecture models. The track is based on a new comprehensive 3D shape benchmark, which contains 8,987 triangle meshes classified into 171 categories. The benchmark was compiled as a superset of existing benchmarks and presents a new challenge to retrieval methods, as it comprises generic models as well as domain-specific model types. In this track, 14 runs were submitted by 5 groups, and their retrieval accuracy was evaluated using 7 commonly used performance metrics.
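
    For context, three of the performance measures commonly reported in SHREC tracks (Nearest Neighbor, First Tier, Second Tier) can be computed as sketched below. The abstract does not name the 7 metrics used here, so treat this as an illustration of standard retrieval measures rather than the track's evaluation code.

    import numpy as np

    def nn_ft_st(ranked_labels, query_label, class_size):
        """Nearest Neighbor, First Tier and Second Tier for one query.

        ranked_labels: class labels of the retrieval list (query excluded)
        class_size:    number of other models sharing the query's class

        NN: is the top-ranked result from the correct class?
        FT: fraction of the class found in the top class_size results.
        ST: fraction of the class found in the top 2 * class_size results.
        """
        relevant = np.asarray(ranked_labels) == query_label
        nn = float(relevant[0])
        ft = relevant[:class_size].sum() / class_size
        st = relevant[:2 * class_size].sum() / class_size
        return nn, ft, st

    # Example: the class has 3 other members; two appear in the first tier.
    print(nn_ft_st(["chair", "table", "chair", "lamp", "chair", "sofa"],
                   "chair", 3))   # -> (1.0, 0.666..., 1.0)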