Dilated Temporal Fully-Convolutional Network for Semantic Segmentation of Motion Capture Data
Semantic segmentation of motion capture sequences plays a key part in many
data-driven motion synthesis frameworks. It is a preprocessing step in which
long recordings of motion capture sequences are partitioned into smaller
segments. Afterwards, additional methods like statistical modeling can be
applied to each group of structurally-similar segments to learn an abstract
motion manifold. The segmentation task however often remains a manual task,
which increases the effort and cost of generating large-scale motion databases.
We therefore propose an automatic framework for semantic segmentation of motion
capture data using a dilated temporal fully-convolutional network. Our model
outperforms a state-of-the-art model in action segmentation, as well as three
networks for sequence modeling. We further show that our model is robust
against highly noisy training labels.
Comment: Eurographics/ACM SIGGRAPH Symposium on Computer Animation - Posters
2018; poster available at http://people.mpi-inf.mpg.de/~ncheema/SCA2018_poster.pdf
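The dilated temporal convolutions this framework builds on can be sketched in a few lines; the kernel size, dilation schedule, and function names below are illustrative assumptions, not taken from the paper:

```python
def dilated_conv1d(x, w, dilation):
    """Causal dilated 1D convolution over a sequence (pure-Python sketch).

    x: input sequence (list of floats), w: kernel weights,
    dilation: gap between kernel taps along the time axis.
    """
    k = len(w)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j in range(k):
            idx = t - j * dilation  # look back j * dilation time steps
            if idx >= 0:            # causal: ignore taps before the sequence start
                acc += w[j] * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Temporal receptive field of a stack of dilated convolution layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Doubling the dilation per layer grows the receptive field exponentially
# with depth, which is what lets a temporal FCN cover long motion clips:
print(receptive_field(3, [1, 2, 4, 8]))  # 31 frames from only 4 layers
```

This exponential growth is the usual motivation for dilation in temporal models: a plain (undilated) stack of the same depth would cover only 9 frames.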
3D human pose estimation with adaptive receptive fields and dilated temporal convolutions
In this work, we demonstrate that receptive fields in 3D pose estimation can
be effectively specified using optical flow. We introduce adaptive receptive
fields, a simple and effective method to aid receptive field selection in pose
estimation models based on optical flow inference. We contrast the performance
of a state-of-the-art benchmark model running on fixed receptive fields with
its adaptive-field counterpart. By using a reduced receptive field, our
model can process slow-motion sequences (10x longer) 23% faster than the
benchmark model running at regular speed. This reduction in computational cost
is achieved while keeping pose prediction accuracy within 0.36% of the
benchmark model.
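As an illustration of how optical flow could drive receptive-field selection, here is a minimal sketch; the thresholds, candidate field sizes, and function name are assumptions for illustration, not the paper's actual method:

```python
def select_receptive_field(flow_magnitudes, candidates=(9, 27, 81), threshold=1.0):
    """Pick a temporal receptive field (in frames) from `candidates` based on
    mean optical-flow magnitude.

    Intuition: slow-motion sequences change little between frames, so a short
    temporal window suffices (and is cheaper to compute); fast motion needs
    more temporal context. Thresholds here are illustrative only.
    """
    mean_flow = sum(flow_magnitudes) / len(flow_magnitudes)
    if mean_flow < threshold:          # slow motion: smallest field
        return candidates[0]
    elif mean_flow < 2 * threshold:    # moderate motion: middle field
        return candidates[1]
    return candidates[2]               # fast motion: largest field


# Slow-motion clip -> small receptive field, so fewer frames per prediction:
print(select_receptive_field([0.2, 0.3, 0.1]))  # 9
```

Using a smaller field on slow sequences is what the abstract's speedup claim rests on: fewer input frames per prediction means less computation per output pose.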