
    A learning approach to 3d object representation for classification

    In this paper we describe a 3D object signature for 3D object classification. The signature is based on a learning approach that finds salient points on a 3D object and represents these points in a 2D spatial map via a longitude-latitude transformation. Experimental results show high classification rates on both pose-normalized and rotated objects, and include a study of classification accuracy as a function of the number of rotations in the training set.
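    As a rough illustration of the longitude-latitude idea, the following minimal sketch maps salient 3D points onto a 2D spatial map. It assumes points are expressed relative to the object centroid; the map resolution and the binary marking of cells are illustrative choices, not details from the paper.

```python
# Minimal sketch: project salient 3D points to a longitude-latitude grid.
# Resolution and binary cell marking are illustrative assumptions.
import numpy as np

def longitude_latitude_map(points, height=64, width=128):
    """Project 3D salient points onto a 2D longitude-latitude grid."""
    p = points - points.mean(axis=0)           # center on the centroid
    r = np.linalg.norm(p, axis=1) + 1e-12      # radial distance
    lon = np.arctan2(p[:, 1], p[:, 0])         # longitude in [-pi, pi]
    lat = np.arcsin(p[:, 2] / r)               # latitude in [-pi/2, pi/2]

    # Quantize the angles to pixel indices of the 2D spatial map.
    u = ((lon + np.pi) / (2 * np.pi) * (width - 1)).astype(int)
    v = ((lat + np.pi / 2) / np.pi * (height - 1)).astype(int)

    grid = np.zeros((height, width), dtype=np.float32)
    grid[v, u] = 1.0                           # mark salient-point cells
    return grid

salient = np.random.randn(100, 3)              # stand-in for detected points
feature_map = longitude_latitude_map(salient)
```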

    CloudWalker: Random walks for 3D point cloud shape analysis

    Point clouds are gaining prominence as a representation of 3D shapes, but their irregular structure poses a challenge for deep learning methods. In this paper we propose CloudWalker, a novel method for learning 3D shapes using random walks. Previous works attempt to adapt Convolutional Neural Networks (CNNs) to point clouds or to impose a grid or mesh structure on them. This work presents a different approach to representing and learning shape from a given point set: the key idea is to impose structure on the point set via multiple random walks through the cloud that explore different regions of the 3D object. We then learn a per-point and per-walk representation and aggregate the predictions of multiple walks at inference time. Our approach achieves state-of-the-art results on two 3D shape analysis tasks: classification and retrieval.
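    The random-walk construction can be sketched as follows. This is a minimal interpretation assuming walks step between k-nearest neighbors; the walk length and k are arbitrary choices, and the learned per-point/per-walk networks are omitted.

```python
# Minimal sketch of a random walk through a point cloud via k-NN steps.
# Walk length, k, and the restart policy are illustrative assumptions.
import numpy as np
from scipy.spatial import cKDTree

def random_walk(points, start, length=32, k=8, rng=None):
    """Generate one random walk through a point cloud."""
    rng = rng or np.random.default_rng()
    tree = cKDTree(points)
    walk = [start]
    visited = {start}
    current = start
    for _ in range(length - 1):
        # k + 1 neighbors because the query point itself is returned.
        _, nbrs = tree.query(points[current], k=k + 1)
        choices = [int(j) for j in nbrs if int(j) not in visited]
        if not choices:              # all neighbors seen: allow revisits
            choices = [int(j) for j in nbrs if int(j) != current]
        current = int(rng.choice(choices))
        visited.add(current)
        walk.append(current)
    return points[walk]              # (length, 3) sequence of xyz samples

cloud = np.random.rand(1024, 3)
walk_xyz = random_walk(cloud, start=0)   # one of several walks per shape
```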

    Par3DNet: Using 3DCNNs for Object Recognition on Tridimensional Partial Views

    Deep learning-based methods are the best performers for object recognition, both on images and on three-dimensional data. Nonetheless, for 3D object recognition most authors convert the 3D data to images and then classify those. Despite its accuracy, this approach has drawbacks. In this work, we present a deep learning pipeline for object recognition that takes a point cloud as input and outputs classification probabilities. Our proposal is trained on synthetic CAD objects yet performs accurately when fed real data from commercial sensors. Unlike most approaches, our method is specifically trained on partial views of the objects rather than on full representations, since full representations are not what commercial sensors capture. We trained our proposal on the ModelNet10 dataset and achieved 78.39% accuracy. We also tested it with noise added to the dataset, and against a number of other datasets and real data, with high success. This work has been funded by the Spanish Government TIN2016-76515-R grant for the COMBAHO project, supported with FEDER funds. It has also been supported by Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887.
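    Since the pipeline takes a point cloud as input, a typical preprocessing step for a 3D CNN is voxelization into an occupancy grid. The sketch below shows this generic step under the assumption of a 32³ grid; it is not the exact Par3DNet input pipeline.

```python
# Minimal sketch: voxelize a (partial) point cloud into a binary occupancy
# grid as 3D CNN input. The 32^3 resolution is an illustrative assumption.
import numpy as np

def voxelize(points, res=32):
    """Convert a point cloud into a binary occupancy grid."""
    p = points - points.min(axis=0)
    p = p / (p.max() + 1e-12)                  # normalize into the unit cube
    idx = np.clip((p * (res - 1)).astype(int), 0, res - 1)
    grid = np.zeros((res, res, res), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

partial_view = np.random.rand(500, 3)          # stand-in for a sensor scan
volume = voxelize(partial_view)                # input tensor for the 3D CNN
```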

    FROM 2D TO 3D SUPERVISED SEGMENTATION AND CLASSIFICATION FOR CULTURAL HERITAGE APPLICATIONS

    The digital management of architectural heritage information is still a complex problem, as a heritage object requires an integrated representation of various types of information in order to develop appropriate restoration or conservation strategies. There is currently extensive research on automatic procedures for segmenting and classifying 3D point clouds or meshes, which can accelerate the study of a monument and integrate it with heterogeneous information and attributes useful for characterizing and describing the surveyed object. The aim of this study is to propose an optimal, repeatable and reliable procedure for managing various types of 3D survey data and associating them with such heterogeneous information and attributes. In particular, this paper presents an approach for classifying 3D heritage models, starting from the segmentation of their textures based on supervised machine learning methods. Experimental results on three different case studies demonstrate that the proposed approach is effective and has considerable further potential.
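    To make the texture-driven workflow concrete, the sketch below trains a generic supervised classifier on texture patches and assigns the predicted classes back to mesh faces. The patch features, the random-forest choice, and the UV-based label transfer are all illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of supervised texture classification feeding a 3D model.
# Features, classifier, and the face-level transfer are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def patch_features(patches):
    """Mean and std of each RGB channel per texture patch."""
    return np.concatenate([patches.mean(axis=(1, 2)),
                           patches.std(axis=(1, 2))], axis=1)

# Stand-in training data: (N, 16, 16, 3) texture patches with material labels.
train_patches = np.random.rand(200, 16, 16, 3)
train_labels = np.random.randint(0, 4, 200)    # e.g. brick/plaster/stone/wood

clf = RandomForestClassifier(n_estimators=100).fit(
    patch_features(train_patches), train_labels)

# Classify patches sampled at each mesh face's UV location, so every face
# of the 3D heritage model inherits a predicted material class.
face_patches = np.random.rand(50, 16, 16, 3)
face_classes = clf.predict(patch_features(face_patches))
```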

    SPNet: Deep 3D Object Classification and Retrieval using Stereographic Projection

    Master's thesis, Seoul National University, Department of Electrical and Computer Engineering, 2019. Advisor: Kyoung Mu Lee. We propose an efficient Stereographic Projection Neural Network (SPNet) for learning representations of 3D objects. We first transform a 3D input volume into a 2D planar image using stereographic projection. We then present a shallow 2D convolutional neural network (CNN) to estimate the object category, followed by a view ensemble that combines the responses from multiple views of the object to further improve the predictions. Specifically, the proposed approach consists of four stages: (1) stereographic projection of the 3D object, (2) view-specific feature learning, (3) view selection, and (4) view ensemble. The proposed approach performs comparably to state-of-the-art methods while requiring substantially less GPU memory and fewer network parameters. Despite its lightness, experiments on 3D object classification and shape retrieval demonstrate the high performance of the proposed method.
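    The stereographic projection step can be sketched as follows, assuming surface points are normalized onto the unit sphere and projected from the north pole onto the plane. The rasterization resolution and clipping range are illustrative assumptions, not values from the thesis.

```python
# Minimal sketch: stereographic projection of surface points to a 2D image.
# Normalization, resolution, and clipping are illustrative assumptions.
import numpy as np

def stereographic_image(points, res=128, clip=2.0):
    """Map 3D surface points onto a 2D planar image."""
    p = points - points.mean(axis=0)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)   # onto the unit sphere
    x, y, z = p[:, 0], p[:, 1], p[:, 2]
    denom = np.clip(1.0 - z, 1e-6, None)               # avoid the projection pole
    X, Y = x / denom, y / denom                        # plane coordinates

    # Rasterize the projected points into an image for the shallow 2D CNN.
    u = np.clip(((X + clip) / (2 * clip) * (res - 1)).astype(int), 0, res - 1)
    v = np.clip(((Y + clip) / (2 * clip) * (res - 1)).astype(int), 0, res - 1)
    img = np.zeros((res, res), dtype=np.float32)
    img[v, u] = 1.0
    return img

surface = np.random.randn(2000, 3)                     # stand-in surface samples
planar = stereographic_image(surface)                  # 2D input to the CNN
```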

    Parametric Procedural Models for 3D Object Retrieval, Classification and Parameterization

    The number of available 3D objects has grown over the last decades, and we can expect it to grow much further. 3D objects are also becoming more and more accessible to non-expert users. The growing amount of available 3D data is welcome for everyone working with this type of data, as the creation and acquisition of many 3D objects is still costly. However, the vast majority of available 3D objects exist only as pure polygon meshes. We arguably cannot expect meta-data and additional semantics to be delivered together with 3D objects stemming from non-expert users or from automatic 3D scans of real objects. For this reason, content-based retrieval and classification techniques for 3D objects have been developed. Many systems address the completely unsupervised case, but previous work has shown that performance on these tasks can be increased substantially by exploiting any kind of prior knowledge. In this thesis I use procedural models as prior knowledge. A procedural model describes the construction process of a 3D object instead of explicitly describing the components of its surface, and can include parameters in the construction process to generate variations of the resulting 3D object. Procedural representations are present in many domains, as these implicit representations are vastly superior to explicit representations in terms of content generation, flexibility and reusability; using a procedural representation therefore has the potential to outclass other approaches in many respects. The usage of procedural models in 3D object retrieval and classification is not widely researched, as this powerful representation can be arbitrarily complex to create and handle: in the 3D object domain, procedural models are mostly used for highly regularized structures such as buildings and trees. Yet procedural models can greatly improve 3D object retrieval and classification, since this representation offers a persistent and reusable full description of a type of object. This description can be used for queries and class definitions without any additional data. Furthermore, the initial classification can be refined by the procedural model: it allows an unknown object to be completely parameterized, identifying further characteristics of individual class members. The only drawback is that manually designing and creating specialized procedural models is itself very costly.
    In this thesis I concentrate on the generalization and automation of procedural models for use in 3D object retrieval and classification. I propose different levels of user interaction to cover the competing needs of control and automation, and present new approaches for each level: the automatic generation of a procedural model from a single exemplary 3D object; the semi-automatic creation of a procedural model with a sketch-based modeling tool; and the manual definition of a procedural model with a restricted variation space. The second important step is the insertion of parameters into the procedural model to define the variations of the resulting 3D object. For this step I likewise propose several options for the optimal level of control and automation: an automatic parameter detection technique; a semi-automatic, deformation-based insertion; and an interface for manually inserting parameters by choosing one of the offered insertion principles. It is also possible to insert parameters into the procedures manually when full low-level control is needed.
    To use procedural models directly for 3D object retrieval and classification, I propose descriptor-based and deep-learning-based approaches. Descriptors measure the difference between 3D objects; by using descriptors as the comparison algorithm, we can define the distance between procedural models and other objects and order them by similarity. The procedural models are sampled and compared to retrieve an optimal object retrieval list. We can also use procedural models directly as the data basis for retraining a convolutional neural network: by deep learning on a set of procedural models, we can classify new unknown objects without any further large training database. Additionally, I propose a new multi-layered parameter estimation approach using three different comparison measures to parameterize an unknown object (see the sketch below). Hence, an unknown object is not only classified against a procedural model; the approach also gathers new information about the characteristics of the object by using the procedural model to parameterize it. As a result, combining procedural models with 3D object retrieval and classification leads to a meta-concept: a holistically seamless system for defining, generating, comparing, identifying, retrieving, recombining, editing and reusing 3D objects.
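    As a minimal illustration of descriptor-based parameterization, the sketch below fits the single parameter of a toy procedural model (a cylinder of variable height) to an unknown object by minimizing a descriptor distance. The generator, the histogram descriptor, and the grid search are stand-ins for the thesis's more elaborate models and multi-layered estimation.

```python
# Minimal sketch: estimate a procedural-model parameter by descriptor
# matching. Toy generator and descriptor are illustrative assumptions.
import numpy as np

def procedural_model(height, n=500, rng=np.random.default_rng(0)):
    """Toy procedural model: a cylinder point cloud with parametric height."""
    theta = rng.uniform(0, 2 * np.pi, n)
    z = rng.uniform(0, height, n)
    return np.stack([np.cos(theta), np.sin(theta), z], axis=1)

def descriptor(points, bins=16):
    """Distance-to-centroid histogram as a simple shape descriptor."""
    d = np.linalg.norm(points - points.mean(axis=0), axis=1)
    h, _ = np.histogram(d, bins=bins, range=(0, 3), density=True)
    return h

def estimate_parameter(unknown, candidates=np.linspace(0.5, 3.0, 26)):
    """Pick the parameter whose generated shape best matches the unknown."""
    target = descriptor(unknown)
    dists = [np.linalg.norm(descriptor(procedural_model(h)) - target)
             for h in candidates]
    return candidates[int(np.argmin(dists))]

unknown_object = procedural_model(height=2.0)  # pretend this came from a scan
print(estimate_parameter(unknown_object))      # should recover a value near 2.0
```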

    Saliency-based approaches for multidimensional explainability of deep networks

    In deep learning, visualization techniques extract the salient patterns a deep network exploits to perform a task (e.g. image classification), focusing on single images. These methods allow a better understanding of these complex models and help identify the most informative parts of the input data. Beyond understanding the network, visual saliency is useful for many quantitative purposes and applications in both the 2D and 3D domains, such as analyzing the generalization capabilities of a classifier and autonomous navigation. In this thesis, we describe an approach to the interpretability problem of a convolutional neural network and propose ideas on how to exploit visualization in applications such as image classification and active object recognition. After a brief overview of common visualization methods that produce attention/saliency maps, we address two separate points. First, we describe how visual saliency can be used effectively in the 2D domain (e.g. RGB images) to boost image classification performance: visual summaries, i.e. compact representations of an ensemble of saliency maps, can improve the classification accuracy of a network through summary-driven specializations. Second, we present a 3D active recognition system that considers different views of a target object, overcoming the single-view hypothesis of classical object recognition and making the classification problem much easier in principle. Here we adopt such attention maps in a quantitative fashion, building a dense 3D saliency volume that fuses saliency maps obtained from different viewpoints, yielding a continuous proxy for which parts of an object are most discriminative for a given classifier. Finally, we show how to inject this representation into a real-world application, so that an agent (e.g. a robot) can move while knowing the capabilities of its classifier.
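    A minimal sketch of the saliency-fusion idea: per-view 2D saliency maps are back-projected into a shared voxel grid and averaged. Orthographic, axis-aligned views are an illustrative simplification; the actual system would use the camera poses of the real viewpoints.

```python
# Minimal sketch: fuse per-view 2D saliency maps into a dense 3D saliency
# volume. Axis-aligned orthographic views are an illustrative assumption.
import numpy as np

def fuse_saliency(view_maps, res=32):
    """Average per-view 2D saliency back-projected into a shared voxel grid."""
    volume = np.zeros((res, res, res), dtype=np.float32)
    # Each axis-aligned view smears its 2D saliency along its viewing axis.
    volume += view_maps[0][np.newaxis, :, :]   # view along the x axis
    volume += view_maps[1][:, np.newaxis, :]   # view along the y axis
    volume += view_maps[2][:, :, np.newaxis]   # view along the z axis
    return volume / len(view_maps)

views = [np.random.rand(32, 32).astype(np.float32) for _ in range(3)]
saliency_volume = fuse_saliency(views)         # per-voxel discriminativeness
```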