
    A learning approach to 3d object representation for classification

    In this paper we describe a 3D object signature for 3D object classification. The signature is based on a learning approach that finds salient points on a 3D object and represents these points in a 2D spatial map via a longitude-latitude transformation. Experimental results show high classification rates on both pose-normalized and rotated objects, and include a study of classification accuracy as a function of the number of rotations in the training set.
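    As a rough illustration of the longitude-latitude idea, the following minimal sketch maps salient 3D points onto a 2D spatial map. It assumes points are expressed relative to the object centroid; the map resolution and the binary marking of cells are illustrative choices, not details from the paper.

```python
# Minimal sketch: project salient 3D points to a longitude-latitude grid.
# Resolution and binary cell marking are illustrative assumptions.
import numpy as np

def longitude_latitude_map(points, height=64, width=128):
    """Project 3D salient points onto a 2D longitude-latitude grid."""
    p = points - points.mean(axis=0)           # center on the centroid
    r = np.linalg.norm(p, axis=1) + 1e-12      # radial distance
    lon = np.arctan2(p[:, 1], p[:, 0])         # longitude in [-pi, pi]
    lat = np.arcsin(p[:, 2] / r)               # latitude in [-pi/2, pi/2]

    # Quantize the angles to pixel indices of the 2D spatial map.
    u = ((lon + np.pi) / (2 * np.pi) * (width - 1)).astype(int)
    v = ((lat + np.pi / 2) / np.pi * (height - 1)).astype(int)

    grid = np.zeros((height, width), dtype=np.float32)
    grid[v, u] = 1.0                           # mark salient-point cells
    return grid

salient = np.random.randn(100, 3)              # stand-in for detected points
feature_map = longitude_latitude_map(salient)
```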

    CloudWalker: Random walks for 3D point cloud shape analysis

    Point clouds are gaining prominence as a representation of 3D shapes, but their irregular structure poses a challenge for deep learning methods. In this paper we propose CloudWalker, a novel method for learning 3D shapes using random walks. Previous works attempt to adapt Convolutional Neural Networks (CNNs) to point clouds or to impose a grid or mesh structure on them. This work presents a different approach to representing and learning shape from a given point set: the key idea is to impose structure on the point set via multiple random walks through the cloud that explore different regions of the 3D object. We then learn a per-point and per-walk representation and aggregate the predictions of multiple walks at inference time. Our approach achieves state-of-the-art results on two 3D shape analysis tasks: classification and retrieval.
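    The random-walk construction can be sketched as follows. This is a minimal interpretation assuming walks step between k-nearest neighbors; the walk length and k are arbitrary choices, and the learned per-point/per-walk networks are omitted.

```python
# Minimal sketch of a random walk through a point cloud via k-NN steps.
# Walk length, k, and the restart policy are illustrative assumptions.
import numpy as np
from scipy.spatial import cKDTree

def random_walk(points, start, length=32, k=8, rng=None):
    """Generate one random walk through a point cloud."""
    rng = rng or np.random.default_rng()
    tree = cKDTree(points)
    walk = [start]
    visited = {start}
    current = start
    for _ in range(length - 1):
        # k + 1 neighbors because the query point itself is returned.
        _, nbrs = tree.query(points[current], k=k + 1)
        choices = [int(j) for j in nbrs if int(j) not in visited]
        if not choices:              # all neighbors seen: allow revisits
            choices = [int(j) for j in nbrs if int(j) != current]
        current = int(rng.choice(choices))
        visited.add(current)
        walk.append(current)
    return points[walk]              # (length, 3) sequence of xyz samples

cloud = np.random.rand(1024, 3)
walk_xyz = random_walk(cloud, start=0)   # one of several walks per shape
```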

    Par3DNet: Using 3DCNNs for Object Recognition on Tridimensional Partial Views

    Deep learning-based methods are the best performers for object recognition, both on images and on three-dimensional data. Nonetheless, for 3D object recognition most authors convert the 3D data to images and then classify those. Despite its accuracy, this approach has drawbacks. In this work, we present a deep learning pipeline for object recognition that takes a point cloud as input and outputs classification probabilities. Our proposal is trained on synthetic CAD objects yet performs accurately when fed real data from commercial sensors. Unlike most approaches, our method is specifically trained on partial views of the objects rather than on full representations, since full representations are not what commercial sensors capture. We trained our proposal on the ModelNet10 dataset and achieved 78.39% accuracy. We also tested it with noise added to the dataset, and against a number of other datasets and real data, with high success. This work has been funded by the Spanish Government TIN2016-76515-R grant for the COMBAHO project, supported with FEDER funds. It has also been supported by Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887.
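    Since the pipeline takes a point cloud as input, a typical preprocessing step for a 3D CNN is voxelization into an occupancy grid. The sketch below shows this generic step under the assumption of a 32³ grid; it is not the exact Par3DNet input pipeline.

```python
# Minimal sketch: voxelize a (partial) point cloud into a binary occupancy
# grid as 3D CNN input. The 32^3 resolution is an illustrative assumption.
import numpy as np

def voxelize(points, res=32):
    """Convert a point cloud into a binary occupancy grid."""
    p = points - points.min(axis=0)
    p = p / (p.max() + 1e-12)                  # normalize into the unit cube
    idx = np.clip((p * (res - 1)).astype(int), 0, res - 1)
    grid = np.zeros((res, res, res), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

partial_view = np.random.rand(500, 3)          # stand-in for a sensor scan
volume = voxelize(partial_view)                # input tensor for the 3D CNN
```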

    FROM 2D TO 3D SUPERVISED SEGMENTATION AND CLASSIFICATION FOR CULTURAL HERITAGE APPLICATIONS

    The digital management of architectural heritage information is still a complex problem, as a heritage object requires an integrated representation of various types of information in order to develop appropriate restoration or conservation strategies. There is currently extensive research on automatic procedures for segmenting and classifying 3D point clouds or meshes, which can accelerate the study of a monument and integrate it with heterogeneous information and attributes useful for characterizing and describing the surveyed object. The aim of this study is to propose an optimal, repeatable and reliable procedure for managing various types of 3D survey data and associating them with such heterogeneous information and attributes. In particular, this paper presents an approach for classifying 3D heritage models, starting from the segmentation of their textures based on supervised machine learning methods. Experimental results on three different case studies demonstrate that the proposed approach is effective and has considerable further potential.
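    To make the texture-driven workflow concrete, the sketch below trains a generic supervised classifier on texture patches and assigns the predicted classes back to mesh faces. The patch features, the random-forest choice, and the UV-based label transfer are all illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of supervised texture classification feeding a 3D model.
# Features, classifier, and the face-level transfer are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def patch_features(patches):
    """Mean and std of each RGB channel per texture patch."""
    return np.concatenate([patches.mean(axis=(1, 2)),
                           patches.std(axis=(1, 2))], axis=1)

# Stand-in training data: (N, 16, 16, 3) texture patches with material labels.
train_patches = np.random.rand(200, 16, 16, 3)
train_labels = np.random.randint(0, 4, 200)    # e.g. brick/plaster/stone/wood

clf = RandomForestClassifier(n_estimators=100).fit(
    patch_features(train_patches), train_labels)

# Classify patches sampled at each mesh face's UV location, so every face
# of the 3D heritage model inherits a predicted material class.
face_patches = np.random.rand(50, 16, 16, 3)
face_classes = clf.predict(patch_features(face_patches))
```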

    SPNet: Deep 3D Object Classification and Retrieval using Stereographic Projection

    Master's thesis, Seoul National University, Department of Electrical and Computer Engineering, 2019. Advisor: Kyoung Mu Lee. We propose an efficient Stereographic Projection Neural Network (SPNet) for learning representations of 3D objects. We first transform a 3D input volume into a 2D planar image using stereographic projection. We then present a shallow 2D convolutional neural network (CNN) to estimate the object category, followed by a view ensemble that combines the responses from multiple views of the object to further improve the predictions. Specifically, the proposed approach consists of four stages: (1) stereographic projection of the 3D object, (2) view-specific feature learning, (3) view selection, and (4) view ensemble. The proposed approach performs comparably to state-of-the-art methods while requiring substantially less GPU memory and fewer network parameters. Despite its lightness, experiments on 3D object classification and shape retrieval demonstrate the high performance of the proposed method.
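    The stereographic projection step can be sketched as follows, assuming surface points are normalized onto the unit sphere and projected from the north pole onto the plane. The rasterization resolution and clipping range are illustrative assumptions, not values from the thesis.

```python
# Minimal sketch: stereographic projection of surface points to a 2D image.
# Normalization, resolution, and clipping are illustrative assumptions.
import numpy as np

def stereographic_image(points, res=128, clip=2.0):
    """Map 3D surface points onto a 2D planar image."""
    p = points - points.mean(axis=0)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)   # onto the unit sphere
    x, y, z = p[:, 0], p[:, 1], p[:, 2]
    denom = np.clip(1.0 - z, 1e-6, None)               # avoid the projection pole
    X, Y = x / denom, y / denom                        # plane coordinates

    # Rasterize the projected points into an image for the shallow 2D CNN.
    u = np.clip(((X + clip) / (2 * clip) * (res - 1)).astype(int), 0, res - 1)
    v = np.clip(((Y + clip) / (2 * clip) * (res - 1)).astype(int), 0, res - 1)
    img = np.zeros((res, res), dtype=np.float32)
    img[v, u] = 1.0
    return img

surface = np.random.randn(2000, 3)                     # stand-in surface samples
planar = stereographic_image(surface)                  # 2D input to the CNN
```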

    Parametric Procedural Models for 3D Object Retrieval, Classification and Parameterization

    The number of available 3D objects has grown over the last decades, and we can expect it to grow much further. 3D objects are also becoming more and more accessible to non-expert users. The growing amount of available 3D data is welcome for everyone working with this type of data, as the creation and acquisition of many 3D objects is still costly. However, the vast majority of available 3D objects exist only as pure polygon meshes. We arguably cannot expect meta-data and additional semantics to be delivered together with 3D objects stemming from non-expert users or from automatic 3D scans of real objects. For this reason, content-based retrieval and classification techniques for 3D objects have been developed. Many systems address the completely unsupervised case, but previous work has shown that performance on these tasks can be increased substantially by exploiting any kind of prior knowledge. In this thesis I use procedural models as prior knowledge. A procedural model describes the construction process of a 3D object instead of explicitly describing the components of its surface, and can include parameters in the construction process to generate variations of the resulting 3D object. Procedural representations are present in many domains, as these implicit representations are vastly superior to explicit representations in terms of content generation, flexibility and reusability; using a procedural representation therefore has the potential to outclass other approaches in many respects. The usage of procedural models in 3D object retrieval and classification is not widely researched, as this powerful representation can be arbitrarily complex to create and handle: in the 3D object domain, procedural models are mostly used for highly regularized structures such as buildings and trees. Yet procedural models can greatly improve 3D object retrieval and classification, since this representation offers a persistent and reusable full description of a type of object. This description can be used for queries and class definitions without any additional data. Furthermore, the initial classification can be refined by the procedural model: it allows an unknown object to be completely parameterized, identifying further characteristics of individual class members. The only drawback is that manually designing and creating specialized procedural models is itself very costly.
    In this thesis I concentrate on the generalization and automation of procedural models for use in 3D object retrieval and classification. I propose different levels of user interaction to cover the competing needs of control and automation, and present new approaches for each level: the automatic generation of a procedural model from a single exemplary 3D object; the semi-automatic creation of a procedural model with a sketch-based modeling tool; and the manual definition of a procedural model with a restricted variation space. The second important step is the insertion of parameters into the procedural model to define the variations of the resulting 3D object. For this step I likewise propose several options for the optimal level of control and automation: an automatic parameter detection technique; a semi-automatic, deformation-based insertion; and an interface for manually inserting parameters by choosing one of the offered insertion principles. It is also possible to insert parameters into the procedures manually when full low-level control is needed.
    To use procedural models directly for 3D object retrieval and classification, I propose descriptor-based and deep-learning-based approaches. Descriptors measure the difference between 3D objects; by using descriptors as the comparison algorithm, we can define the distance between procedural models and other objects and order them by similarity. The procedural models are sampled and compared to retrieve an optimal object retrieval list. We can also use procedural models directly as the data basis for retraining a convolutional neural network: by deep learning on a set of procedural models, we can classify new unknown objects without any further large training database. Additionally, I propose a new multi-layered parameter estimation approach using three different comparison measures to parameterize an unknown object (see the sketch below). Hence, an unknown object is not only classified against a procedural model; the approach also gathers new information about the characteristics of the object by using the procedural model to parameterize it. As a result, combining procedural models with 3D object retrieval and classification leads to a meta-concept: a holistically seamless system for defining, generating, comparing, identifying, retrieving, recombining, editing and reusing 3D objects.
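    As a minimal illustration of descriptor-based parameterization, the sketch below fits the single parameter of a toy procedural model (a cylinder of variable height) to an unknown object by minimizing a descriptor distance. The generator, the histogram descriptor, and the grid search are stand-ins for the thesis's more elaborate models and multi-layered estimation.

```python
# Minimal sketch: estimate a procedural-model parameter by descriptor
# matching. Toy generator and descriptor are illustrative assumptions.
import numpy as np

def procedural_model(height, n=500, rng=np.random.default_rng(0)):
    """Toy procedural model: a cylinder point cloud with parametric height."""
    theta = rng.uniform(0, 2 * np.pi, n)
    z = rng.uniform(0, height, n)
    return np.stack([np.cos(theta), np.sin(theta), z], axis=1)

def descriptor(points, bins=16):
    """Distance-to-centroid histogram as a simple shape descriptor."""
    d = np.linalg.norm(points - points.mean(axis=0), axis=1)
    h, _ = np.histogram(d, bins=bins, range=(0, 3), density=True)
    return h

def estimate_parameter(unknown, candidates=np.linspace(0.5, 3.0, 26)):
    """Pick the parameter whose generated shape best matches the unknown."""
    target = descriptor(unknown)
    dists = [np.linalg.norm(descriptor(procedural_model(h)) - target)
             for h in candidates]
    return candidates[int(np.argmin(dists))]

unknown_object = procedural_model(height=2.0)  # pretend this came from a scan
print(estimate_parameter(unknown_object))      # should recover a value near 2.0
```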

    Saliency-based approaches for multidimensional explainability of deep networks

    In deep learning, visualization techniques extract the salient patterns a deep network exploits to perform a task (e.g. image classification), focusing on single images. These methods allow a better understanding of these complex models and help identify the most informative parts of the input data. Beyond understanding the network, visual saliency is useful for many quantitative purposes and applications in both the 2D and 3D domains, such as analyzing the generalization capabilities of a classifier and autonomous navigation. In this thesis, we describe an approach to the interpretability problem of a convolutional neural network and propose ideas on how to exploit visualization in applications such as image classification and active object recognition. After a brief overview of common visualization methods that produce attention/saliency maps, we address two separate points. First, we describe how visual saliency can be used effectively in the 2D domain (e.g. RGB images) to boost image classification performance: visual summaries, i.e. compact representations of an ensemble of saliency maps, can improve the classification accuracy of a network through summary-driven specializations. Second, we present a 3D active recognition system that considers different views of a target object, overcoming the single-view hypothesis of classical object recognition and making the classification problem much easier in principle. Here we adopt such attention maps in a quantitative fashion, building a dense 3D saliency volume that fuses saliency maps obtained from different viewpoints, yielding a continuous proxy for which parts of an object are most discriminative for a given classifier. Finally, we show how to inject this representation into a real-world application, so that an agent (e.g. a robot) can move while knowing the capabilities of its classifier.
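    A minimal sketch of the saliency-fusion idea: per-view 2D saliency maps are back-projected into a shared voxel grid and averaged. Orthographic, axis-aligned views are an illustrative simplification; the actual system would use the camera poses of the real viewpoints.

```python
# Minimal sketch: fuse per-view 2D saliency maps into a dense 3D saliency
# volume. Axis-aligned orthographic views are an illustrative assumption.
import numpy as np

def fuse_saliency(view_maps, res=32):
    """Average per-view 2D saliency back-projected into a shared voxel grid."""
    volume = np.zeros((res, res, res), dtype=np.float32)
    # Each axis-aligned view smears its 2D saliency along its viewing axis.
    volume += view_maps[0][np.newaxis, :, :]   # view along the x axis
    volume += view_maps[1][:, np.newaxis, :]   # view along the y axis
    volume += view_maps[2][:, :, np.newaxis]   # view along the z axis
    return volume / len(view_maps)

views = [np.random.rand(32, 32).astype(np.float32) for _ in range(3)]
saliency_volume = fuse_saliency(views)         # per-voxel discriminativeness
```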