613 research outputs found

    Geometric Unified Method in 3D Object Classification

    3D object classification is one of the most popular topics in the fields of computer vision and computational geometry. The current state of the art is built on Convolutional Neural Network (CNN) models paired with various representations that capture different features of the given 3D data, including voxels, local features, multi-view 2D features, and so on. With CNN as a holistic approach, research focuses on improving accuracy and efficiency through neural network architecture design. This thesis examines the existing work on 3D object classification and explores the underlying theory by integrating geometric approaches. Using geometric algorithms to pre-process and select data points, we dive into an existing architecture that feeds points directly into a deep CNN and explore how geometry measures the importance of different points in a CNN model. Moreover, we attempt to extract useful geometric features directly from the object data to introduce a feature matrix representation, which can be classified with distance-based approaches. We present all experimental results and analyze them with a view to future improvement.
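    A minimal sketch of the kind of geometric point pre-selection described above, assuming surface points are ranked by a simple local saliency proxy before being handed to a point-based CNN; the function name, neighbourhood size, and saliency measure are illustrative assumptions, not taken from the thesis.

        import numpy as np
        from scipy.spatial import cKDTree

        def select_salient_points(points, k=1024):
            # points: (N, 3) array of surface samples; returns the k most "salient" ones.
            tree = cKDTree(points)
            _, idx = tree.query(points, k=9)               # each point plus 8 neighbours
            neighbour_centroid = points[idx[:, 1:]].mean(axis=1)
            # Points far from their local neighbourhood centroid (edges, corners,
            # thin structures) are treated as geometrically more important.
            saliency = np.linalg.norm(points - neighbour_centroid, axis=1)
            keep = np.argsort(-saliency)[:k]
            return points[keep]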

    Fast Hybrid Cascade for Voxel-based 3D Object Classification

    Voxel-based 3D object classification has been frequently studied in recent years. Previous methods often directly convert the classic 2D convolution into a 3D form applied to objects with a binary voxel representation. In this paper, we investigate why the binary voxel representation is not well suited to 3D convolution and how to improve performance in both accuracy and speed simultaneously. We show that by giving each voxel a signed distance value, accuracy improves by about 30% over the binary voxel representation using a two-layer fully connected network. We then propose a fast fully connected and convolutional hybrid cascade network for voxel-based 3D object classification. This three-stage cascade network divides 3D models into three categories: easy, moderate and hard. Consequently, the mean inference time (0.3 ms) is about 5x and 2x faster than state-of-the-art point cloud and voxel based methods respectively, while achieving the highest accuracy among the latter category of methods (92%). Experiments on ModelNet and MNIST verify the performance of the proposed hybrid cascade network.
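    A minimal sketch of the two ideas above, assuming a binary occupancy grid as input: converting it to a signed-distance representation with a Euclidean distance transform, and an early-exit cascade in which confident ("easy") models skip the heavier stages. The function names and confidence thresholds are illustrative assumptions, not the authors' code.

        import numpy as np
        from scipy.ndimage import distance_transform_edt

        def binary_to_signed_distance(occupancy):
            # occupancy: (D, H, W) array of {0, 1} voxels.
            outside = distance_transform_edt(occupancy == 0)   # distance to the nearest occupied voxel
            inside = distance_transform_edt(occupancy == 1)    # distance to the nearest empty voxel
            return outside - inside                            # positive outside, negative inside

        def cascade_predict(occupancy, stages, thresholds=(0.95, 0.90)):
            # stages: classifiers ordered cheap -> expensive, each mapping a voxel
            # grid to class probabilities; early stages handle "easy" models.
            sdf = binary_to_signed_distance(occupancy)
            for stage, tau in zip(stages[:-1], thresholds):
                probs = stage(sdf)
                if probs.max() >= tau:                 # confident enough: stop here
                    return int(probs.argmax())
            return int(stages[-1](sdf).argmax())       # "hard" models reach the last stage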

    SPNet: Deep 3D Object Classification and Retrieval using Stereographic Projection

    Master's thesis (M.S.), Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2019. Advisor: Kyoung Mu Lee. We propose an efficient Stereographic Projection Neural Network (SPNet) for learning representations of 3D objects. We first transform a 3D input volume into a 2D planar image using stereographic projection. We then present a shallow 2D convolutional neural network (CNN) to estimate the object category, followed by a view ensemble, which combines the responses from multiple views of the object to further enhance the predictions. Specifically, the proposed approach consists of four stages: (1) stereographic projection of a 3D object, (2) view-specific feature learning, (3) view selection and (4) view ensemble. The proposed approach performs comparably to the state-of-the-art methods while requiring substantially less GPU memory and fewer network parameters. Despite its lightness, experiments on 3D object classification and shape retrieval demonstrate the high performance of the proposed method. Contents: 1 Introduction; 2 Related Work (2.1 Point cloud-based methods, 2.2 3D model-based methods, 2.3 2D/2.5D image-based methods); 3 Proposed Stereographic Projection Network (3.1 Stereographic Representation, 3.2 Network Architecture, 3.3 View Selection, 3.4 View Ensemble); 4 Experimental Evaluation (4.1 Datasets, 4.2 Training, 4.3 Choice of Stereographic Projection, 4.4 Test on View Selection Schemes, 4.5 3D Object Classification, 4.6 Shape Retrieval, 4.7 Implementation); 5 Conclusions.
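    A minimal sketch of the projection step described above, assuming object surface points centred at the origin are mapped radially onto the unit sphere and then onto a plane by stereographic projection from the north pole; the rasterisation resolution and the value stored per pixel are assumptions for illustration, not SPNet's actual implementation.

        import numpy as np

        def stereographic_image(points, res=64):
            # points: (N, 3) surface samples of an object, centred at the origin.
            radii = np.linalg.norm(points, axis=1, keepdims=True)
            sphere = points / np.clip(radii, 1e-8, None)        # radial projection onto the unit sphere
            x, y, z = sphere[:, 0], sphere[:, 1], sphere[:, 2]
            # Stereographic projection from the north pole onto the z = 0 plane.
            u = x / (1.0 - z + 1e-8)
            v = y / (1.0 - z + 1e-8)
            # Rasterise (u, v) into a res x res image, storing the original radius so
            # the planar image still encodes distance-from-centre information.
            img = np.zeros((res, res), dtype=np.float32)
            iu = np.clip(((u + 2.0) / 4.0 * (res - 1)).astype(int), 0, res - 1)
            iv = np.clip(((v + 2.0) / 4.0 * (res - 1)).astype(int), 0, res - 1)
            img[iv, iu] = radii.ravel()
            return img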

    Machine learning methods for 3D object classification and segmentation

    Field of study: Computer Science. Dr. Ye Duan, Thesis Supervisor. Includes vita. "July 2018." Object understanding is a fundamental problem in computer vision, and it has been extensively researched in recent years thanks to the availability of powerful GPUs and labelled data, especially in the context of images. However, 3D object understanding is still not on par with its 2D counterpart, and deep learning for 3D has not been fully explored yet. In this dissertation, I work on two approaches, both of which advance the state-of-the-art results in 3D classification and segmentation. The first approach, called MVRNN, is based on the multi-view paradigm. In contrast to MVCNN, which does not generate consistent results across different views, our MVRNN treats the multi-view images as a temporal sequence, correlates their features, and generates coherent segmentation across different views. MVRNN demonstrated state-of-the-art performance on the Princeton Segmentation Benchmark dataset. The second approach, called PointGrid, is a hybrid method which combines points with a regular grid structure. 3D points retain fine details but are irregular, which is challenging for deep learning methods. A volumetric grid is simple and regular, but does not scale well with data resolution. Our PointGrid is simple and allows the fine details to be consumed by normal convolutions on a coarser-resolution grid. PointGrid achieved state-of-the-art performance on the ModelNet40 and ShapeNet datasets in 3D classification and object part segmentation. Includes bibliographical references (pages 116-140).
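    A minimal sketch of the PointGrid idea described above, assuming a fixed number of point offsets is stored per grid cell as extra channels so that a standard 3D convolution can consume fine point detail at a coarse resolution; the grid size, points-per-cell cap, and feature layout are illustrative assumptions, not the dissertation's exact design.

        import numpy as np

        def point_grid_features(points, grid=16, k=4):
            # points: (N, 3) cloud; returns a (grid, grid, grid, 3k) feature volume.
            pts = (points - points.min(axis=0)) / (np.ptp(points, axis=0).max() + 1e-8)
            cells = np.minimum((pts * grid).astype(int), grid - 1)
            feats = np.zeros((grid, grid, grid, k, 3), dtype=np.float32)
            counts = np.zeros((grid, grid, grid), dtype=int)
            for p, (i, j, l) in zip(pts, cells):
                c = counts[i, j, l]
                if c < k:
                    feats[i, j, l, c] = p * grid - (i, j, l)   # point offset inside its cell
                    counts[i, j, l] = c + 1
            return feats.reshape(grid, grid, grid, k * 3)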

    3D object classification using neural networks

    Classification of 3D objects is of great interest in the field of artificial intelligence. There are numerous approaches using artificial neural networks to address this problem. They differ mainly in the representation of the 3D model used as input and in the network architecture. The goal of this thesis is to explore and test these approaches on publicly available datasets and subject them to an independent comparison, which has not so far appeared in the literature. We provide a unified framework for converting data from common 3D formats. We train and test ten different networks on the ModelNet40 and ShapeNetCore datasets. All the networks performed reasonably well in our tests, but we were generally unable to achieve the accuracies reported in the original papers. We suspect this could be due to extensive, albeit unreported, hyperparameter tuning by the authors of the original papers, suggesting this issue would benefit from further research. Department of Software and Computer Science Education, Faculty of Mathematics and Physics.
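    A minimal sketch of one conversion such a unified framework needs, assuming meshes (e.g. ModelNet40 OFF files already parsed into vertex and face arrays) are turned into fixed-size point clouds by area-weighted surface sampling; the function name and sample count are illustrative assumptions, not the thesis's actual tooling.

        import numpy as np

        def mesh_to_point_cloud(vertices, faces, n=2048):
            # vertices: (V, 3) floats; faces: (F, 3) vertex indices of a triangle mesh.
            tri = vertices[faces]                                         # (F, 3, 3)
            # Triangle areas drive the sampling so large faces get more points.
            areas = 0.5 * np.linalg.norm(
                np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
            choice = np.random.choice(len(faces), size=n, p=areas / areas.sum())
            # Uniform barycentric sampling inside each chosen triangle.
            r1, r2 = np.random.rand(n, 1), np.random.rand(n, 1)
            s = np.sqrt(r1)
            return ((1 - s) * tri[choice, 0]
                    + s * (1 - r2) * tri[choice, 1]
                    + s * r2 * tri[choice, 2])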