Globally-Coordinated Locally-Linear Modeling of Multi-Dimensional Data
This thesis considers the problem of modeling and analysis of continuous, locally-linear, multi-dimensional spatio-temporal data. Our work extends the previously reported theoretical work on the global coordination model to temporal analysis of continuous, multi-dimensional data. We have developed algorithms for time-varying data analysis and used them in full-scale, real-world applications. The applications demonstrated in this thesis include tracking, synthesis, recognition and retrieval of dynamic objects based on their shape, appearance and motion. The proposed approach in this thesis has advantages over existing approaches to analyzing complex spatio-temporal data. Experiments show that the new modeling features of our approach improve the performance of existing approaches in many applications. In object tracking, our approach is the first to track nonlinear appearance variations by using a low-dimensional representation of the appearance change in globally-coordinated linear subspaces. In dynamic texture synthesis, we are able to model non-stationary dynamic textures, which cannot be handled by any of the existing approaches. In human motion synthesis, we show that realistic synthesis can be performed without using specific transition points or key frames.
Dressing Avatars: Deep Photorealistic Appearance for Physically Simulated Clothing
Despite recent progress in developing animatable full-body avatars, realistic
modeling of clothing - one of the core aspects of human self-expression -
remains an open challenge. State-of-the-art physical simulation methods can
generate realistically behaving clothing geometry at interactive rates.
Modeling photorealistic appearance, however, usually requires physically-based
rendering which is too expensive for interactive applications. On the other
hand, data-driven deep appearance models are capable of efficiently producing
realistic appearance, but struggle at synthesizing geometry of highly dynamic
clothing and handling challenging body-clothing configurations. To this end, we
introduce pose-driven avatars with explicit modeling of clothing that exhibit
both photorealistic appearance learned from real-world data and realistic
clothing dynamics. The key idea is to introduce a neural clothing appearance
model that operates on top of explicit geometry: at training time we use
high-fidelity tracking, whereas at animation time we rely on physically
simulated geometry. Our core contribution is a physically-inspired appearance
network, capable of generating photorealistic appearance with view-dependent
and dynamic shadowing effects even for unseen body-clothing configurations. We
conduct a thorough evaluation of our model and demonstrate diverse animation
results on several subjects and different types of clothing. Unlike previous
work on photorealistic full-body avatars, our approach can produce much richer
dynamics and more realistic deformations even for many examples of loose
clothing. We also demonstrate that our formulation naturally allows clothing to
be used with avatars of different people while staying fully animatable, thus
enabling, for the first time, photorealistic avatars with novel clothing.
Comment: SIGGRAPH Asia 2022 (ACM ToG) camera ready. The supplementary video
can be found on
https://research.facebook.com/publications/dressing-avatars-deep-photorealistic-appearance-for-physically-simulated-clothing
A Deep-structured Conditional Random Field Model for Object Silhouette Tracking
In this work, we introduce a deep-structured conditional random field
(DS-CRF) model for the purpose of state-based object silhouette tracking. The
proposed DS-CRF model consists of a series of state layers, where each state
layer spatially characterizes the object silhouette at a particular point in
time. The interactions between adjacent state layers are established by
inter-layer connectivity dynamically determined based on inter-frame optical
flow. By incorporating both spatial and temporal context in a dynamic fashion
within such a deep-structured probabilistic graphical model, the proposed
DS-CRF model allows us to develop a framework that can accurately and
efficiently track object silhouettes that can change greatly over time, as well
as in challenging situations such as occlusion and multiple targets within the
scene. Experimental results using video surveillance datasets containing
different scenarios such as occlusion and multiple targets showed that the
proposed DS-CRF approach provides strong object silhouette tracking performance
when compared to baseline methods such as mean-shift tracking, as well as
state-of-the-art methods such as context tracking and boosted particle
filtering.
Comment: 17 pages
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which puts in evidence the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and, consequently, the constraints imposed on the type
of video that each technique is able to address. Making these hypotheses and
constraints explicit renders the framework particularly useful for selecting a
method for a given application. Another advantage of the proposed organization
is that it allows the newest approaches to be categorized seamlessly alongside
traditional ones, while providing an insightful perspective on the evolution of
the action recognition task to date. That perspective is the basis for the
discussion at the end of the paper, where we also present the main open issues
in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
tables