Deep deformable models for 3D human body
Deformable models are powerful tools for modelling the 3D shape variation of a class of objects. However, the application and performance of deformable models for the human body are currently restricted by limitations in existing 3D datasets, annotations, and the model formulation itself. In this thesis, we address these issues through the following contributions in the fields of 3D human body modelling, monocular reconstruction, and data collection/annotation.
Firstly, we propose a deformable model for the 3D human body based on a deep mesh convolutional network, and demonstrate its merit in the task of monocular human mesh recovery. While outperforming current state-of-the-art models in mesh recovery accuracy, the model is also lightweight and more flexible, as it can be trained end-to-end and fine-tuned for a specific task.
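The abstract does not give the operator details, but a generic graph-style convolution over mesh vertices conveys the idea: each vertex mixes its own features with an aggregate over its one-ring neighbours. The sketch below is only indicative of that class of layer, with hypothetical names and shapes; it is not the thesis's actual mesh convolution.

```python
import numpy as np

def mesh_conv(vertex_feats, neighbors, weight_self, weight_neigh):
    """One generic graph-convolution step over mesh vertices.

    vertex_feats: (V, F_in) per-vertex features (e.g., 3D coordinates)
    neighbors:    list of V index lists giving each vertex's one-ring
    weight_self, weight_neigh: (F_in, F_out) learned weight matrices
    """
    # Mean-aggregate the one-ring neighbourhood of every vertex.
    neigh_mean = np.stack([vertex_feats[idx].mean(axis=0) for idx in neighbors])
    # Combine self and neighbourhood information, then apply a ReLU.
    out = vertex_feats @ weight_self + neigh_mean @ weight_neigh
    return np.maximum(out, 0.0)
```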
A second contribution is a bone-level skinned model of the 3D human mesh, in which bone modelling and identity-specific variation modelling are decoupled. This formulation allows mesh convolutional networks to capture detailed identity-specific variations, while pose variations are explicitly controlled and modelled through linear blend skinning with built-in motion constraints. It not only significantly increases the accuracy of 3D human mesh reconstruction, but also facilitates accurate in-the-wild character animation and retargeting.
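Linear blend skinning itself is standard; as a reference, the sketch below shows the basic blending step with hypothetical argument names and shapes. It omits the bone-level decoupling and motion constraints that the thesis adds on top.

```python
import numpy as np

def linear_blend_skinning(rest_vertices, skin_weights, bone_transforms):
    """Deform a rest-pose mesh with plain linear blend skinning (LBS).

    rest_vertices:   (V, 3) rest-pose vertex positions
    skin_weights:    (V, B) per-vertex bone weights, each row summing to 1
    bone_transforms: (B, 4, 4) rigid transform of each bone for the target pose
    """
    V = rest_vertices.shape[0]
    # Homogeneous coordinates so the 4x4 bone transforms apply directly.
    homo = np.concatenate([rest_vertices, np.ones((V, 1))], axis=1)   # (V, 4)
    # Transform every vertex by every bone: (B, V, 4).
    per_bone = np.einsum('bij,vj->bvi', bone_transforms, homo)
    # Blend the per-bone results with the skinning weights: (V, 4).
    blended = np.einsum('vb,bvi->vi', skin_weights, per_bone)
    return blended[:, :3]
```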
Finally, we present a large-scale dataset of over 1.3 million 3D human body scans in daily clothing. The dataset contains over 12 hours of 4D recordings at 30 FPS, consisting of 7566 dynamic sequences of 3D meshes from 4205 subjects. We propose a fast and accurate sequence registration pipeline that enables markerless motion capture and automatic dense annotation of the raw scans, leading to automatic synthetic image and annotation generation that boosts performance on tasks such as monocular human mesh reconstruction.
Characterization of a RS-LiDAR for 3D Perception
High-precision 3D LiDARs are still expensive and hard to acquire. This paper
presents the characteristics of the RS-LiDAR, a low-cost LiDAR model that is
readily available, in comparison with the VLP-16. The paper also provides a set
of evaluations to analyze the characteristics and performance of LiDAR
sensors. This work analyzes multiple properties, such as drift effects,
distance effects, color effects, and sensor orientation effects, in the context
of 3D perception. By comparing with the Velodyne LiDAR, we find that the RS-LiDAR
is a cheaper and readily obtainable substitute for the VLP-16 with comparable performance.
Comment: For ICRA201
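As an illustration of how such properties might be quantified (not the paper's exact protocol), the sketch below computes simple range-accuracy statistics and a warm-up drift rate for returns off a target at a known distance; the function names and the linear drift model are assumptions.

```python
import numpy as np

def range_error_stats(measured_ranges, true_range):
    """Summarize range accuracy for returns off a target at a known distance
    (one possible form of the distance-effect test)."""
    errors = np.asarray(measured_ranges, dtype=float) - true_range
    return {
        "bias_m": float(np.mean(errors)),            # systematic offset
        "precision_m": float(np.std(errors)),        # per-return noise
        "rmse_m": float(np.sqrt(np.mean(errors ** 2))),
    }

def drift_rate(timestamps_s, measured_ranges):
    """Fit a linear trend to range readings of a static target to quantify
    warm-up drift, returned in metres per minute."""
    t = np.asarray(timestamps_s, dtype=float)
    r = np.asarray(measured_ranges, dtype=float)
    slope_m_per_s, _ = np.polyfit(t - t[0], r, 1)
    return slope_m_per_s * 60.0
```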
Curriculum Graph Machine Learning: A Survey
Graph machine learning has been extensively studied in both academia and
industry. However, most existing graph machine learning models in the literature
are designed to be trained on data samples in a random order, which may lead to
suboptimal performance because it ignores the varying importance of graph data
samples and the effect of their training order on model optimization. To tackle
this critical problem, curriculum graph machine learning (Graph CL), which
integrates the strengths of graph machine learning and curriculum learning, has
emerged and attracted increasing attention from the research community. In this
paper, we therefore provide a comprehensive overview of approaches to Graph CL
and present a detailed survey of recent
advances in this direction. Specifically, we first discuss the key challenges
of Graph CL and provide its formal problem definition. Then, we categorize and
summarize existing methods into three classes based on three kinds of graph
machine learning tasks, i.e., node-level, link-level, and graph-level tasks.
Finally, we share our thoughts on future research directions. To the best of
our knowledge, this paper is the first survey of curriculum graph machine
learning.
Comment: IJCAI 2023 Survey Trac
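For illustration, the sketch below implements one of the simplest possible curricula: samples are sorted by a difficulty score and released to the trainer in progressively larger, harder subsets under a linear pacing function. The difficulty proxy and the staging scheme are assumptions; the surveyed methods include far more sophisticated, often automatically learned, curricula.

```python
import numpy as np

def curriculum_stages(difficulty, num_stages=5):
    """Build a predefined easy-to-hard curriculum.

    difficulty: (N,) difficulty score per training sample, e.g. the loss of a
                pretrained model or a structural proxy such as node degree.
    Returns a list of index arrays; stage s (1-indexed) contains the easiest
    fraction s / num_stages of the data (a linear pacing function).
    """
    order = np.argsort(difficulty)                 # easiest samples first
    n = len(order)
    stages = []
    for s in range(1, num_stages + 1):
        cutoff = int(np.ceil(n * s / num_stages))
        stages.append(order[:cutoff])
    return stages

# A trainer would iterate over the stages, running a few epochs on each
# progressively larger (and harder) subset before moving to the next.
```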
GDN: A Stacking Network Used for Skin Cancer Diagnosis
Skin cancer, the primary type of cancer that can be identified by visual
inspection, requires an automatic identification system that can accurately
classify different types of lesions. This paper presents the GoogLe-Dense Network
(GDN), an image-classification model that identifies two types of skin cancer,
Basal Cell Carcinoma and Melanoma. GDN uses stacking of different
networks to enhance the model performance. Specifically, GDN consists of two
sequential levels in its structure. The first level performs basic
classification tasks accomplished by GoogLeNet and DenseNet, which are trained
in parallel to enhance efficiency. To avoid low accuracy and long training
time, the second level takes the outputs of GoogLeNet and DenseNet as the
input to a logistic regression model. We compare our method with four baseline
networks including ResNet, VGGNet, DenseNet, and GoogLeNet on the dataset, in
which GoogLeNet and DenseNet significantly outperform ResNet and VGGNet. In the
second level, different stacking methods, such as a perceptron, logistic
regression, an SVM, decision trees, and K-nearest neighbors, are studied, among
which logistic regression shows the best prediction results. The results demonstrate
that GDN, compared to a single network structure, achieves higher accuracy in
skin cancer detection.
Comment: Published at ICSPS 202
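The second level described above amounts to fitting a logistic regression on the concatenated class probabilities produced by the two first-level CNNs. A minimal sketch of that stacking step is given below, assuming the CNN outputs are available as arrays for a held-out split; the function names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_stacking_head(googlenet_probs, densenet_probs, labels):
    """Fit the second-level model of a GDN-style stack: a logistic regression
    over the class probabilities of the two first-level CNNs.

    googlenet_probs, densenet_probs: (N, C) softmax outputs on a held-out
    split (not the CNNs' own training data, to avoid leakage).
    """
    meta_features = np.concatenate([googlenet_probs, densenet_probs], axis=1)
    head = LogisticRegression(max_iter=1000)
    head.fit(meta_features, labels)
    return head

def predict_stacked(head, googlenet_probs, densenet_probs):
    """Predict final labels from the two CNNs' probability outputs."""
    meta_features = np.concatenate([googlenet_probs, densenet_probs], axis=1)
    return head.predict(meta_features)
```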