10,233 research outputs found
Dynamics simulation of human box delivering task
Thesis (M.S.) University of Alaska Fairbanks, 2018The dynamic optimization of a box delivery motion is a complex task. The key component is to achieve an optimized motion associated with the box weight, delivering speed, and location. This thesis addresses one solution for determining the optimal delivery of a box. The delivering task is divided into five subtasks: lifting, transition step, carrying, transition step, and unloading. Each task is simulated independently with appropriate boundary conditions so that they can be stitched together to render a complete delivering task. Each task is formulated as an optimization problem. The design variables are joint angle profiles. For lifting and carrying task, the objective function is the dynamic effort. The unloading task is a byproduct of the lifting task, but done in reverse, starting with holding the box and ending with it at its final position. In contrast, for transition task, the objective function is the combination of dynamic effort and joint discomfort. The various joint parameters are analyzed consisting of joint torque, joint angles, and ground reactive forces. A viable optimization motion is generated from the simulation results. It is also empirically validated. This research holds significance for professions containing heavy box lifting and delivering tasks and would like to reduce the chance of injury.Chapter 1 Introduction -- Chapter 2 Skeletal Human Modeling -- Chapter 3 Kinematics and Dynamics -- Chapter 4 Lifting Simulation -- Chapter 5 Carrying Simulation -- Chapter 6 Delivering Simulation -- Chapter 7 Conclusion and Future Research -- Reference
In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations
Convolutional Neural Network based approaches for monocular 3D human pose
estimation usually require a large amount of training images with 3D pose
annotations. While it is feasible to provide 2D joint annotations for large
corpora of in-the-wild images with humans, providing accurate 3D annotations to
such in-the-wild corpora is hardly feasible in practice. Most existing 3D
labelled data sets are either synthetically created or feature in-studio
images. 3D pose estimation algorithms trained on such data often have limited
ability to generalize to real world scene diversity. We therefore propose a new
deep learning based method for monocular 3D human pose estimation that shows
high accuracy and generalizes better to in-the-wild scenes. It has a network
architecture that comprises a new disentangled hidden space encoding of
explicit 2D and 3D features, and uses supervision by a new learned projection
model from predicted 3D pose. Our algorithm can be jointly trained on image
data with 3D labels and image data with only 2D labels. It achieves
state-of-the-art accuracy on challenging in-the-wild data.Comment: Accepted to CVPR 201
In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations
Convolutional Neural Network based approaches for monocular 3D human pose estimation usually require a large amount of training images with 3D pose annotations. While it is feasible to provide 2D joint annotations for large corpora of in-the-wild images with humans, providing accurate 3D annotations to such in-the-wild corpora is hardly feasible in practice. Most existing 3D labelled data sets are either synthetically created or feature in-studio images. 3D pose estimation algorithms trained on such data often have limited ability to generalize to real world scene diversity. We therefore propose a new deep learning based method for monocular 3D human pose estimation that shows high accuracy and generalizes better to in-the-wild scenes. It has a network architecture that comprises a new disentangled hidden space encoding of explicit 2D and 3D features, and uses supervision by a new learned projection model from predicted 3D pose. Our algorithm can be jointly trained on image data with 3D labels and image data with only 2D labels. It achieves state-of-the-art accuracy on challenging in-the-wild data
Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation
Direct prediction of 3D body pose and shape remains a challenge even for
highly parameterized deep learning models. Mapping from the 2D image space to
the prediction space is difficult: perspective ambiguities make the loss
function noisy and training data is scarce. In this paper, we propose a novel
approach (Neural Body Fitting (NBF)). It integrates a statistical body model
within a CNN, leveraging reliable bottom-up semantic body part segmentation and
robust top-down body model constraints. NBF is fully differentiable and can be
trained using 2D and 3D annotations. In detailed experiments, we analyze how
the components of our model affect performance, especially the use of part
segmentations as an explicit intermediate representation, and present a robust,
efficiently trainable framework for 3D human pose estimation from 2D images
with competitive results on standard benchmarks. Code will be made available at
http://github.com/mohomran/neural_body_fittingComment: 3DV 201
It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
We address the problem of 3D human pose estimation from 2D input images using
only weakly supervised training data. Despite showing considerable success for
2D pose estimation, the application of supervised machine learning to 3D pose
estimation in real world images is currently hampered by the lack of varied
training images with corresponding 3D poses. Most existing 3D pose estimation
algorithms train on data that has either been collected in carefully controlled
studio settings or has been generated synthetically. Instead, we take a
different approach, and propose a 3D human pose estimation algorithm that only
requires relative estimates of depth at training time. Such training signal,
although noisy, can be easily collected from crowd annotators, and is of
sufficient quality for enabling successful training and evaluation of 3D pose
algorithms. Our results are competitive with fully supervised regression based
approaches on the Human3.6M dataset, despite using significantly weaker
training data. Our proposed algorithm opens the door to using existing
widespread 2D datasets for 3D pose estimation by allowing fine-tuning with
noisy relative constraints, resulting in more accurate 3D poses.Comment: BMVC 2018. Project page available at
http://www.vision.caltech.edu/~mronchi/projects/RelativePos
XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera
We present a real-time approach for multi-person 3D motion capture at over 30
fps using a single RGB camera. It operates successfully in generic scenes which
may contain occlusions by objects and by other people. Our method operates in
subsequent stages. The first stage is a convolutional neural network (CNN) that
estimates 2D and 3D pose features along with identity assignments for all
visible joints of all individuals.We contribute a new architecture for this
CNN, called SelecSLS Net, that uses novel selective long and short range skip
connections to improve the information flow allowing for a drastically faster
network without compromising accuracy. In the second stage, a fully connected
neural network turns the possibly partial (on account of occlusion) 2Dpose and
3Dpose features for each subject into a complete 3Dpose estimate per
individual. The third stage applies space-time skeletal model fitting to the
predicted 2D and 3D pose per subject to further reconcile the 2D and 3D pose,
and enforce temporal coherence. Our method returns the full skeletal pose in
joint angles for each subject. This is a further key distinction from previous
work that do not produce joint angle results of a coherent skeleton in real
time for multi-person scenes. The proposed system runs on consumer hardware at
a previously unseen speed of more than 30 fps given 512x320 images as input
while achieving state-of-the-art accuracy, which we will demonstrate on a range
of challenging real-world scenes.Comment: To appear in ACM Transactions on Graphics (SIGGRAPH) 202
XNect: Real-time Multi-person 3D Human Pose Estimation with a Single RGB Camera
We present a real-time approach for multi-person 3D motion capture at over 30 fps using a single RGB camera. It operates in generic scenes and is robust to difficult occlusions both by other people and objects. Our method operates in subsequent stages. The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip connections to improve the information flow allowing for a drastically faster network without compromising accuracy. In the second stage, a fully-connected neural network turns the possibly partial (on account of occlusion) 2D pose and 3D pose features for each subject into a complete 3D pose estimate per individual. The third stage applies space-time skeletal model fitting to the predicted 2D and 3D pose per subject to further reconcile the 2D and 3D pose, and enforce temporal coherence. Our method returns the full skeletal pose in joint angles for each subject. This is a further key distinction from previous work that neither extracted global body positions nor joint angle results of a coherent skeleton in real time for multi-person scenes. The proposed system runs on consumer hardware at a previously unseen speed of more than 30 fps given 512x320 images as input while achieving state-of-the-art accuracy, which we will demonstrate on a range of challenging real-world scenes
- …