549 research outputs found

    On Motion Parameterizations in Image Sequences from Fixed Viewpoints

    This dissertation addresses the problem of parameterizing object motion within a set of images taken with a stationary camera. We develop data-driven methods across all image scales: characterizing motion observed at the scale of individual pixels, along extended structures such as roads, and whole image deformations such as lungs deforming over time. The primary contributions include: (a) fundamental studies of the relationship between spatio-temporal image derivatives accumulated at a pixel and the object motions at that pixel, (b) data-driven approaches to parameterize breath motion and reconstruct lung CT data volumes, and (c) defining and offering initial results for a new class of Partially Unsupervised Manifold Learning (PUML) problems, which often arise in medical imagery. Specifically, we create energy functions for measuring how consistent a given velocity vector is with observed spatio-temporal image derivatives. These energy functions are used to fit parametric snake models to roads using velocity constraints. We create an automatic data-driven technique for finding the breath phase of lung CT scans which is able to replace external belt measurements currently in use clinically. This approach is extended to automatically create a full deformation model of a CT lung volume during breathing or heart MRI during breathing and heartbeat. Additionally, motivated by real use cases, we address a scenario in which a dataset is collected along with meta-data which describes some, but not all, aspects of the dataset. We create an embedding which displays the remaining variability in a dataset after accounting for variability related to the meta-data.
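
The abstract does not give the form of its energy functions. As a rough illustration only, the sketch below uses the common brightness-constancy residual, `Ix*u + Iy*v + It`, summed in squared form over the derivative samples accumulated at one pixel. The function name `flow_energy` and the sample values are hypothetical and are not taken from the dissertation.

```python
def flow_energy(derivatives, velocity):
    """Sum of squared brightness-constancy residuals for one candidate velocity.

    derivatives: iterable of (Ix, Iy, It) samples accumulated at a pixel.
    velocity:    candidate (u, v) in pixels per frame.
    A low energy means the velocity is consistent with the observed derivatives.
    This is an assumed, textbook-style energy, not the dissertation's own.
    """
    u, v = velocity
    return sum((ix * u + iy * v + it) ** 2 for ix, iy, it in derivatives)


# Hypothetical samples built so that the true motion is (u, v) = (2, -1),
# i.e. It = -(Ix * 2 + Iy * -1) for every sample.
samples = [(1.0, 0.0, -2.0), (0.0, 1.0, 1.0), (1.0, 1.0, -1.0)]

consistent = flow_energy(samples, (2.0, -1.0))    # matches the samples
inconsistent = flow_energy(samples, (0.0, 0.0))   # "no motion" hypothesis
```

Under these assumptions the consistent velocity scores lower than the stationary hypothesis, which is the property a fitting procedure (such as the snake fitting mentioned in the abstract) would exploit.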

    Robust Motion In-betweening

    In this work we present a novel, robust transition generation technique that can serve as a new tool for 3D animators, based on adversarial recurrent neural networks. The system synthesizes high-quality motions that use temporally-sparse keyframes as animation constraints. This is reminiscent of the job of in-betweening in traditional animation pipelines, in which an animator draws motion frames between provided keyframes. We first show that a state-of-the-art motion prediction model cannot be easily converted into a robust transition generator when only adding conditioning information about future keyframes. To solve this problem, we then propose two novel additive embedding modifiers that are applied at each timestep to latent representations encoded inside the network's architecture. One modifier is a time-to-arrival embedding that allows variations of the transition length with a single model. The other is a scheduled target noise vector that allows the system to be robust to target distortions and to sample different transitions given fixed keyframes. To qualitatively evaluate our method, we present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios. To quantitatively evaluate performance on transitions and generalizations to longer time horizons, we present well-defined in-betweening benchmarks on a subset of the widely used Human3.6M dataset and on LaFAN1, a novel high-quality motion capture dataset that is more appropriate for transition generation. We are releasing this new dataset along with this work, with accompanying code for reproducing our baseline results. Comment: Published at SIGGRAPH 202
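
The abstract says the time-to-arrival modifier is additive and applied to latent representations at each timestep, but it does not say how the embedding is computed. The sketch below assumes a sinusoidal encoding of the number of frames remaining before the target keyframe. The names `time_to_arrival_embedding` and `add_modifier`, and the `base` parameter, are hypothetical illustrations rather than the paper's released code.

```python
import math


def time_to_arrival_embedding(tta, dim, base=10000.0):
    """Sinusoidal embedding of the frames remaining before the target keyframe.

    Assumed form: even indices use sine, odd indices use cosine, with
    frequencies that decrease geometrically along the embedding dimension.
    """
    emb = []
    for i in range(dim):
        freq = 1.0 / (base ** (2 * (i // 2) / dim))
        angle = tta * freq
        emb.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return emb


def add_modifier(latent, tta):
    """Add the embedding element-wise to a latent vector of the same size."""
    emb = time_to_arrival_embedding(tta, len(latent))
    return [h + z for h, z in zip(latent, emb)]


# Hypothetical 4-dimensional latent at the final frame (0 frames to arrival).
modified = add_modifier([0.5, 0.5, 0.5, 0.5], tta=0)
```

Because the modifier depends only on the remaining frame count, the same network weights can in principle be reused for transitions of different lengths, which is the behavior the abstract attributes to this embedding.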

    Mining Spatial-Temporal Patterns and Structural Sparsity for Human Motion Data Denoising

    Motion capture is an important technique with a wide range of applications in areas such as computer vision, computer animation, film production, and medical rehabilitation. Even with professional motion capture systems, the acquired raw data almost always contain noise and outliers. Numerous methods have been developed to denoise the data, but the problem remains a challenge due to the high complexity of human motion and the diversity of real-life situations. In this paper, we propose a robust data-driven human motion denoising approach that mines the spatial-temporal patterns and the structural sparsity embedded in motion data. We first replace the commonly used entire-pose model with a much more fine-grained partlet model as the feature representation, to exploit the abundant similarities in local body part posture and movement. Then, a robust dictionary learning algorithm is proposed to learn multiple compact and representative motion dictionaries from the training data in parallel. Finally, we reformulate the human motion denoising problem as a robust structured sparse coding problem in which both the noise distribution information and the temporal smoothness property of human motion are jointly taken into account. Compared with several state-of-the-art motion denoising methods on both synthetic and real noisy motion data, our method consistently yields better performance. The outputs of our approach are much more stable than those of the other methods. In addition, it is much easier to set up the training dataset for our method than for the other data-driven methods.

    Learning to Transform Time Series with a Few Examples

    We describe a semi-supervised regression algorithm that learns to transform one time series into another time series given examples of the transformation. This algorithm is applied to tracking, where a time series of observations from sensors is transformed to a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, our algorithm learns a memoryless transformation of time series from a few example input-output mappings. The algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. It is closely related to nonlinear system identification and manifold learning techniques. We demonstrate our algorithm on the tasks of tracking RFID tags from signal strength measurements and recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. For these tasks, this algorithm requires significantly fewer examples than fully-supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account.