2 research outputs found

    Dilated Temporal Fully-Convolutional Network for Semantic Segmentation of Motion Capture Data

    Full text link
    Semantic segmentation of motion capture sequences plays a key part in many data-driven motion synthesis frameworks. It is a preprocessing step in which long recordings of motion capture sequences are partitioned into smaller segments. Afterwards, additional methods like statistical modeling can be applied to each group of structurally-similar segments to learn an abstract motion manifold. The segmentation task however often remains a manual task, which increases the effort and cost of generating large-scale motion databases. We therefore propose an automatic framework for semantic segmentation of motion capture data using a dilated temporal fully-convolutional network. Our model outperforms a state-of-the-art model in action segmentation, as well as three networks for sequence modeling. We further show our model is robust against high noisy training labels.Comment: Eurographics/ ACM SIGGRAPH Symposium on Computer Animation - Posters 2018; $\href{http://people.mpi-inf.mpg.de/~ncheema/SCA2018_poster.pdf}{\textit{Poster can be found here.}}

    3D human pose estimation with adaptive receptive fields and dilated temporal convolutions

    Full text link
    In this work, we demonstrate that receptive fields in 3D pose estimation can be effectively specified using optical flow. We introduce adaptive receptive fields, a simple and effective method to aid receptive field selection in pose estimation models based on optical flow inference. We contrast the performance of a benchmark state-of-the-art model running on fixed receptive fields with their adaptive field counterparts. By using a reduced receptive field, our model can process slow-motion sequences (10x longer) 23% faster than the benchmark model running at regular speed. The reduction in computational cost is achieved while producing a pose prediction accuracy to within 0.36% of the benchmark model
    corecore