11 research outputs found
Design of Adaptive Porous Micro-structures for Additive Manufacturing
The emerging field of additive manufacturing with biocompatible materials has led to the customized design of porous micro-structures. Complex micro-structures are characterized by freeform surfaces and spatially varying porosity. Today, no CAD system can handle the design of these micro-structures because of their high complexity. In this paper we propose a novel approach for generating a 3D adaptive model of a porous micro-structure based on a predefined design. Using our approach, a designer can manually select a region of interest (ROI) and define its porosity. At the core of our approach is a multi-resolution volumetric model. Generating an adaptive model may involve topological changes, which must be taken into account. The process of designing a customized model consists of the following stages: (a) constructing a multi-resolution volumetric model of a porous structure; (b) defining regions of interest (ROIs) and their resolution properties; and (c) constructing the adaptive model. The approach was initially tested on 2D models and then extended to 3D models. The resulting adaptive model can be used for design, mechanical analysis and manufacturing. The feasibility of the method was demonstrated on bone models reconstructed from micro-CT images.
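As a rough illustration of what "defining the porosity of an ROI" could mean on a volumetric model, the sketch below measures porosity as the fraction of void voxels inside a box-shaped region. The binary voxel grid (True for solid material), the function name `roi_porosity` and the slice-based ROI are assumptions made for this example; the abstract does not specify the representation used in the paper.

```python
import numpy as np

def roi_porosity(solid, roi):
    """Return the void fraction inside a region of interest.

    solid: boolean array, True where a voxel contains material (assumed layout).
    roi:   tuple of slices selecting a box-shaped region (hypothetical ROI form).
    """
    region = solid[roi]
    # The mean of a boolean array is the solid fraction; porosity is its complement.
    return 1.0 - float(region.mean())

# Toy volume: the lower half is solid, the upper half is empty.
volume = np.zeros((4, 4, 4), dtype=bool)
volume[:2] = True

whole = (slice(0, 4), slice(0, 4), slice(0, 4))
lower = (slice(0, 2), slice(0, 4), slice(0, 4))
print(roi_porosity(volume, whole), roi_porosity(volume, lower))
```

A designer-facing tool would presumably compare this measured value with the target porosity for the ROI and refine the local resolution accordingly, but that step is not described in the abstract.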
CloudWalker: Random walks for 3D point cloud shape analysis
Point clouds are gaining prominence as a method for representing 3D shapes,
but their irregular structure poses a challenge for deep learning methods. In
this paper we propose CloudWalker, a novel method for learning 3D shapes using
random walks. Previous works attempt to adapt Convolutional Neural Networks
(CNNs) or impose a grid or mesh structure to 3D point clouds. This work
presents a different approach for representing and learning the shape from a
given point set. The key idea is to impose structure on the point set by
multiple random walks through the cloud for exploring different regions of the
3D object. Then we learn a per-point and per-walk representation and aggregate
multiple walk predictions at inference. Our approach achieves state-of-the-art
results for two 3D shape analysis tasks: classification and retrieval.
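The idea of imposing structure on a point set through a random walk can be sketched as follows. This is a minimal illustration, not the walk generation used by CloudWalker: the k-nearest-neighbour stepping rule, the jump to a random unvisited point when a walk gets stuck, and the names `random_walk`, `walk_len` and `k` are all assumptions made for the example.

```python
import numpy as np

def random_walk(points, walk_len, k=8, seed=0):
    """Return a list of point indices forming a non-repeating walk."""
    rng = np.random.default_rng(seed)
    n = len(points)

    # Brute-force pairwise squared distances (adequate for small clouds).
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbour
    neighbors = np.argsort(d2, axis=1)[:, :k]

    visited = np.zeros(n, dtype=bool)
    current = int(rng.integers(n))
    visited[current] = True
    walk = [current]

    while len(walk) < walk_len:
        candidates = [j for j in neighbors[current] if not visited[j]]
        if candidates:
            # Step to a random unvisited nearby point.
            current = int(rng.choice(candidates))
        else:
            # Dead end: jump to any unvisited point, or stop if none remain.
            unvisited = np.flatnonzero(~visited)
            if unvisited.size == 0:
                break
            current = int(rng.choice(unvisited))
        visited[current] = True
        walk.append(current)
    return walk

cloud = np.random.default_rng(1).random((50, 3))
walks = [random_walk(cloud, walk_len=20, seed=s) for s in range(4)]
```

Each walk turns an unordered set into an ordered sequence of points, which a sequence model could then consume; running several walks with different seeds covers different regions of the object.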
DiGS: Divergence guided shape implicit neural representation for unoriented point clouds
Neural shape representations have recently been shown to be effective in shape
analysis and reconstruction tasks. Existing neural network methods require
point coordinates and corresponding normal vectors to learn the implicit level
sets of the shape. Normal vectors are often not provided as raw data;
therefore, approximation and reorientation are required as pre-processing
stages, both of which can introduce noise. In this paper, we propose a
divergence guided shape representation learning approach that does not require
normal vectors as input. We show that incorporating a soft constraint on the
divergence of the distance function favours smooth solutions that reliably
orient gradients to match the unknown normal at each point, in some cases even
better than approaches that use ground truth normal vectors directly.
Additionally, we introduce a novel geometric initialization method for
sinusoidal shape representation networks that further improves convergence to
the desired solution. We evaluate the effectiveness of our approach on the task
of surface reconstruction and show state-of-the-art performance compared to
other unoriented methods and on-par performance compared to oriented methods.
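To make the divergence term concrete, the sketch below evaluates the divergence of the gradient of a distance function (its Laplacian) with central finite differences and averages its absolute value as a penalty. It is an illustration under stated assumptions only: the paper trains a neural network and would use automatic differentiation, whereas this example uses an analytic sphere distance function, and the names `sphere_sdf`, `divergence_of_gradient` and `divergence_penalty` are hypothetical.

```python
import numpy as np

def sphere_sdf(p, radius=1.0):
    """Signed distance to a sphere centred at the origin (stand-in for a network)."""
    return np.linalg.norm(p, axis=-1) - radius

def divergence_of_gradient(f, p, h=1e-3):
    """Approximate div(grad f) at points p of shape (N, D) by central differences."""
    p = np.asarray(p, dtype=float)
    dim = p.shape[-1]
    total = -2.0 * dim * f(p)
    for axis in range(dim):
        step = np.zeros(dim)
        step[axis] = h
        total = total + f(p + step) + f(p - step)
    return total / h**2

def divergence_penalty(f, p, h=1e-3):
    """Soft constraint: mean absolute divergence over the sample points."""
    return float(np.mean(np.abs(divergence_of_gradient(f, p, h))))

samples = np.array([[2.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
print(divergence_of_gradient(sphere_sdf, samples))
print(divergence_penalty(sphere_sdf, samples))
```

For the distance to a point in 3D the exact value is 2/|x|, so the two samples should give roughly 1.0 and 0.5. The penalty therefore shrinks as the field becomes locally flatter, which is the sense in which a low-divergence constraint favours smooth solutions.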
PatchContrast: Self-Supervised Pre-training for 3D Object Detection
Accurately detecting objects in the environment is a key challenge for
autonomous vehicles. However, obtaining annotated data for detection is
expensive and time-consuming. We introduce PatchContrast, a novel
self-supervised point cloud pre-training framework for 3D object detection. We
propose to utilize two levels of abstraction to learn discriminative
representation from unlabeled data: proposal-level and patch-level. The
proposal-level aims at localizing objects in relation to their surroundings,
whereas the patch-level adds information about the internal connections between
the object's components, hence distinguishing between different objects based
on their individual components. We demonstrate how these levels can be
integrated into self-supervised pre-training for various backbones to enhance
the downstream 3D detection task. We show that our method outperforms existing
state-of-the-art models on three commonly-used 3D detection datasets.
GoferBot: A Visual Guided Human-Robot Collaborative Assembly System
The current transformation towards smart manufacturing has led to a growing
demand for human-robot collaboration (HRC) in the manufacturing process.
Perceiving and understanding the human co-worker's behaviour introduces
challenges for collaborative robots to efficiently and effectively perform
tasks in unstructured and dynamic environments. Integrating recent data-driven
machine vision capabilities into HRC systems is a logical next step in
addressing these challenges. However, in these cases, off-the-shelf components
struggle due to generalisation limitations. Real-world evaluation is required
in order to fully appreciate the maturity and robustness of these approaches.
Furthermore, understanding the pure-vision aspects is a crucial first step
before combining multiple modalities in order to understand the limitations. In
this paper, we propose GoferBot, a novel vision-based semantic HRC system for a
real-world assembly task. It is composed of a visual servoing module that
reaches and grasps assembly parts in an unstructured multi-instance and dynamic
environment, an action recognition module that performs human action prediction
for implicit communication, and a visual handover module that uses the
perceptual understanding of human behaviour to produce an intuitive and
efficient collaborative assembly experience. GoferBot is a novel assembly
system that seamlessly integrates all sub-modules by utilising implicit
semantic information purely from visual perception.
Multi-sensor multi-resolution data fusion modeling
Inspection analysis of 3D objects has progressed significantly due to the evolution of advanced sensors. Current sensors facilitate surface scanning at high or low resolution levels. In the inspection field, data from multi-resolution sensors have significant advantages over single-scale data. However, most data fusion methods are single-scale and are not suitable in their current form for multi-resolution sensors. Currently, the main challenge is to integrate the diverse scanned information into a single hierarchical geometric model. In this work, a new approach for data fusion from multi-resolution sensors is presented. In addition, a correction function for data fusion, based on statistical models, is described for processing highly dense (low-accuracy) data with respect to sparse (high-accuracy) data. The feasibility of the methods is demonstrated on synthetic data that imitates CMM and laser measurements.
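The abstract does not state the form of the correction function. As a stand-in, the sketch below uses inverse-variance weighting, a standard statistical way to combine two measurements of the same quantity when one is more accurate than the other. The function name `fuse` and the variance values are assumptions chosen for illustration and do not come from the paper.

```python
def fuse(z_dense, var_dense, z_sparse, var_sparse):
    """Combine two height measurements by inverse-variance weighting.

    z_dense / var_dense:   value and variance from the dense, low-accuracy sensor.
    z_sparse / var_sparse: value and variance from the sparse, high-accuracy sensor.
    Returns the fused value and its variance.
    """
    w_dense = 1.0 / var_dense
    w_sparse = 1.0 / var_sparse
    fused = (w_dense * z_dense + w_sparse * z_sparse) / (w_dense + w_sparse)
    fused_var = 1.0 / (w_dense + w_sparse)
    return fused, fused_var

# A laser-like reading (variance 4.0) and a CMM-like reading (variance 1.0).
value, variance = fuse(10.0, 4.0, 12.0, 1.0)
print(value, variance)
```

The fused estimate is pulled toward the more accurate reading, and its variance is lower than that of either input, which is the behaviour one would want when correcting dense data against sparse reference points.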
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
Multimodal alignment facilitates the retrieval of instances from one modality
when queried using another. In this paper, we consider a novel setting where
such an alignment is between (i) instruction steps that are depicted as
assembly diagrams (commonly seen in Ikea assembly manuals) and (ii) video
segments from in-the-wild videos, which comprise enactments of the
assembly actions in the real world. To learn this alignment, we introduce a
novel supervised contrastive learning method that learns to align videos with
the subtle details in the assembly diagrams, guided by a set of novel losses.
To study this problem and demonstrate the effectiveness of our method, we
introduce a novel dataset, IAW (Ikea assembly in the wild), consisting of 183
hours of videos from diverse furniture assembly collections and nearly 8,300
illustrations from their associated instruction manuals, annotated with their
ground-truth alignments. We define two tasks on this dataset: first, nearest
neighbor retrieval between video segments and illustrations, and, second,
alignment of instruction steps and the segments for each video. Extensive
experiments on IAW demonstrate the superior performance of our approach over
alternatives.
Comment: Project website: https://academic.davidz.cn/en/publication/zhang-cvpr-2023
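The nearest-neighbour retrieval task can be sketched with cosine similarity between embeddings. The embeddings below are toy vectors and the function name `retrieve_nearest` is hypothetical; in the paper's setting they would come from the learned video and diagram encoders, which are not reproduced here.

```python
import numpy as np

def retrieve_nearest(query_emb, gallery_emb):
    """For each query row, return the index of the most similar gallery row."""
    # L2-normalise so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    similarity = q @ g.T
    return np.argmax(similarity, axis=1), similarity

# Toy example: three illustration embeddings, two video-segment embeddings.
illustrations = np.eye(3)
segments = np.array([[0.9, 0.1, 0.0],
                     [0.0, 0.2, 0.8]])
best, sim = retrieve_nearest(segments, illustrations)
print(best)
```

Swapping the two arguments gives retrieval in the other direction, from illustrations to video segments.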