Dilated Temporal Fully-Convolutional Network for Semantic Segmentation of Motion Capture Data
Semantic segmentation of motion capture sequences plays a key part in many
data-driven motion synthesis frameworks. It is a preprocessing step in which
long recordings of motion capture sequences are partitioned into smaller
segments. Afterwards, additional methods like statistical modeling can be
applied to each group of structurally-similar segments to learn an abstract
motion manifold. The segmentation task however often remains a manual task,
which increases the effort and cost of generating large-scale motion databases.
We therefore propose an automatic framework for semantic segmentation of motion
capture data using a dilated temporal fully-convolutional network. Our model
outperforms a state-of-the-art model in action segmentation, as well as three
networks for sequence modeling. We further show that our model is robust
against highly noisy training labels.
Comment: Eurographics/ACM SIGGRAPH Symposium on Computer Animation - Posters
2018; poster available at http://people.mpi-inf.mpg.de/~ncheema/SCA2018_poster.pdf
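The dilated temporal convolutions this framework builds on can be sketched in a few lines; the kernel size, dilation schedule, and function names below are illustrative assumptions, not taken from the paper:

```python
def dilated_conv1d(x, w, dilation):
    """Causal dilated 1D convolution over a sequence (pure-Python sketch).

    x: input sequence (list of floats), w: kernel weights,
    dilation: gap between kernel taps along the time axis.
    """
    k = len(w)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j in range(k):
            idx = t - j * dilation  # look back j * dilation time steps
            if idx >= 0:            # causal: ignore taps before the sequence start
                acc += w[j] * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Temporal receptive field of a stack of dilated convolution layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Doubling the dilation per layer grows the receptive field exponentially
# with depth, which is what lets a temporal FCN cover long motion clips:
print(receptive_field(3, [1, 2, 4, 8]))  # 31 frames from only 4 layers
```

This exponential growth is the usual motivation for dilation in temporal models: a plain (undilated) stack of the same depth would cover only 9 frames.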
3D human pose estimation with adaptive receptive fields and dilated temporal convolutions
In this work, we demonstrate that receptive fields in 3D pose estimation can
be effectively specified using optical flow. We introduce adaptive receptive
fields, a simple and effective method to aid receptive field selection in pose
estimation models based on optical flow inference. We contrast the performance
of a state-of-the-art benchmark model running on fixed receptive fields with
its adaptive-field counterpart. By using a reduced receptive field, our
model can process slow-motion sequences (10x longer) 23% faster than the
benchmark model running at regular speed. This reduction in computational cost
is achieved while keeping pose prediction accuracy within 0.36% of the
benchmark model.
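As an illustration of how optical flow could drive receptive-field selection, here is a minimal sketch; the thresholds, candidate field sizes, and function name are assumptions for illustration, not the paper's actual method:

```python
def select_receptive_field(flow_magnitudes, candidates=(9, 27, 81), threshold=1.0):
    """Pick a temporal receptive field (in frames) from `candidates` based on
    mean optical-flow magnitude.

    Intuition: slow-motion sequences change little between frames, so a short
    temporal window suffices (and is cheaper to compute); fast motion needs
    more temporal context. Thresholds here are illustrative only.
    """
    mean_flow = sum(flow_magnitudes) / len(flow_magnitudes)
    if mean_flow < threshold:          # slow motion: smallest field
        return candidates[0]
    elif mean_flow < 2 * threshold:    # moderate motion: middle field
        return candidates[1]
    return candidates[2]               # fast motion: largest field


# Slow-motion clip -> small receptive field, so fewer frames per prediction:
print(select_receptive_field([0.2, 0.3, 0.1]))  # 9
```

Using a smaller field on slow sequences is what the abstract's speedup claim rests on: fewer input frames per prediction means less computation per output pose.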