ODFNet: Using orientation distribution functions to characterize 3D point clouds
Learning new representations of 3D point clouds is an active research area in
3D vision, as the order-invariant point cloud structure still presents
challenges to the design of neural network architectures. Recent works have
explored learning global features, local features, or both for point clouds;
however, none of these earlier methods focused on capturing contextual shape
information by analysing the local orientation distribution of points. In this
paper, we leverage point orientation distributions around a point in order to
obtain an expressive local neighborhood representation for point clouds. We achieve this
by dividing the spherical neighborhood of a given point into predefined cone
volumes and using the statistics inside each volume as point features. In this
way, a local patch is represented not only by the selected point's nearest
neighbors but also by a point density distribution defined along multiple
orientations around the point. We are then able to construct an
orientation distribution function (ODF) neural network that involves an
ODFBlock that relies on MLP (multi-layer perceptron) layers. The new ODFNet
model achieves state-of-the-art accuracy for object classification on the
ModelNet40 and ScanObjectNN datasets, and for segmentation on the ShapeNet and
S3DIS datasets.
Comment: The paper is under consideration at Computer Vision and Image
Understanding.
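The cone-volume neighborhood statistics described above can be sketched as follows. This is a hypothetical toy illustration, not the authors' implementation: the cone axes, radius, and the choice of point counts as the per-cone statistic are all assumptions made for the sake of the example.

```python
import numpy as np

def cone_odf_features(points, center, radius=0.2, n_cones=8, seed=0):
    """Per-cone point-count histogram around one query point.

    Illustrative stand-in for ODF-style features: the spherical
    neighborhood of `center` is split into cones around fixed axes,
    and the fraction of neighbors falling in each cone is the feature.
    """
    rng = np.random.default_rng(seed)
    # Fixed cone axes: random unit directions here; a real model would
    # use a deterministic partition of the sphere.
    axes = rng.normal(size=(n_cones, 3))
    axes /= np.linalg.norm(axes, axis=1, keepdims=True)

    offsets = points - center                      # vectors to all points
    dists = np.linalg.norm(offsets, axis=1)
    mask = (dists > 1e-8) & (dists <= radius)      # spherical neighborhood
    dirs = offsets[mask] / dists[mask, None]       # unit directions

    # Assign each neighbor to the cone whose axis it is most aligned with.
    cone_ids = np.argmax(dirs @ axes.T, axis=1)
    counts = np.bincount(cone_ids, minlength=n_cones).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts  # density per cone

if __name__ == "__main__":
    cloud = np.random.default_rng(1).uniform(-0.5, 0.5, size=(1000, 3))
    feat = cone_odf_features(cloud, center=np.zeros(3))
    print(feat.shape)  # (8,)
```

In the full model, such per-cone statistics would feed the MLP layers of an ODFBlock rather than being used directly.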
3D Shape Knowledge Graph for Cross-domain and Cross-modal 3D Shape Retrieval
With the development of 3D modeling and fabrication, 3D shape retrieval has
become a hot topic. In recent years, several strategies have been put forth to
address this retrieval issue. However, it is difficult for them to handle
cross-modal 3D shape retrieval because of the natural differences between
modalities. In this paper, we propose an innovative concept, namely, geometric
words, regarded as the basic elements from which any 3D or 2D entity can be
composed; with their assistance, we can handle cross-domain and cross-modal
retrieval problems simultaneously. First, to construct the
knowledge graph, we utilize the geometric word as the node, and then use the
category of the 3D shape as well as the attribute of the geometry to bridge the
nodes. Second, based on the knowledge graph, we provide a unique way for
learning each entity's embedding. Finally, we propose an effective similarity
measure to handle the cross-domain and cross-modal 3D shape retrieval.
Specifically, every 3D or 2D entity can locate its geometric words in the 3D
knowledge graph, and these serve as a link between cross-domain and cross-modal
data. Thus, our approach can achieve the cross-domain and cross-modal 3D shape
retrieval at the same time. We evaluated our proposed method on the ModelNet40
dataset and ShapeNetCore55 dataset for both the 3D shape retrieval task and
cross-domain 3D shape retrieval task. The classic cross-modal dataset (MI3DOR)
is utilized to evaluate cross-modal 3D shape retrieval. Experimental results
and comparisons with state-of-the-art methods illustrate the superiority of our
approach.
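The geometric-word idea can be sketched with a minimal toy graph. This is an assumed illustration, not the paper's method: entities are linked to shared geometric-word nodes, and cross-modal similarity is approximated here by Jaccard overlap of those word sets, whereas the paper learns embeddings over the knowledge graph.

```python
class GeometricWordGraph:
    """Toy graph linking 3D/2D entities to shared geometric-word nodes."""

    def __init__(self):
        self.entity_words = {}  # entity id -> set of geometric-word ids

    def add_entity(self, entity, words):
        self.entity_words[entity] = set(words)

    def similarity(self, a, b):
        """Jaccard overlap of geometric-word sets (stand-in measure)."""
        wa, wb = self.entity_words[a], self.entity_words[b]
        union = wa | wb
        return len(wa & wb) / len(union) if union else 0.0

if __name__ == "__main__":
    g = GeometricWordGraph()
    # A 3D mesh and a 2D image of the same object share geometric words,
    # which is what lets one query bridge modalities.
    g.add_entity("chair_mesh", {"gw1", "gw2", "gw3"})   # 3D shape
    g.add_entity("chair_image", {"gw2", "gw3", "gw4"})  # 2D view
    g.add_entity("table_mesh", {"gw5", "gw6"})
    print(g.similarity("chair_mesh", "chair_image"))  # 0.5
    print(g.similarity("chair_mesh", "table_mesh"))   # 0.0
```

Because both modalities are expressed over the same word vocabulary, a single similarity measure can serve cross-domain and cross-modal retrieval at once, which is the crux of the paper's approach.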