245 research outputs found
Low-latency compression of mocap data using learned spatial decorrelation transform
Due to the growing need for human motion capture (mocap) in movies, video
games, sports, etc., it is highly desirable to compress mocap data for efficient
storage and transmission. This paper presents two efficient frameworks for
compressing human mocap data with low latency. The first framework processes
the data in a frame-by-frame manner so that it is ideal for mocap data
streaming and time critical applications. The second one is clip-based and
provides a flexible tradeoff between latency and compression performance. Since
mocap data exhibits some unique spatial characteristics, we propose a very
effective transform, namely learned orthogonal transform (LOT), for reducing
the spatial redundancy. The LOT problem is formulated as minimizing the squared
error regularized by orthogonality and sparsity, and is solved via alternating
iteration. We also adopt predictive coding and temporal DCT for temporal
decorrelation in the frame- and clip-based frameworks, respectively.
Experimental results show that the proposed frameworks can produce higher
compression performance at lower computational cost and latency than the
state-of-the-art methods. Comment: 15 pages, 9 figures
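The abstract gives no implementation details, so the following Python sketch is only an illustration of closed-loop, frame-by-frame predictive coding in a transform domain. The function name `predictive_code`, the uniform quantizer, and the use of an arbitrary orthogonal matrix in place of the learned LOT are assumptions rather than the authors' method, and entropy coding is omitted.

```python
import numpy as np

def predictive_code(frames, basis, step):
    """Closed-loop frame-by-frame predictive coding (illustrative sketch).

    frames: (T, D) array, one pose vector per frame.
    basis:  (D, D) orthogonal matrix; a hypothetical stand-in for a learned transform.
    step:   uniform quantization step size (assumed quantizer).
    Returns the integer symbols and the decoder-side reconstruction.
    """
    prediction = np.zeros(frames.shape[1])
    symbols = np.empty(frames.shape, dtype=np.int64)
    recon = np.empty(frames.shape, dtype=float)
    for t, frame in enumerate(frames):
        # Predict the current frame from the previously *reconstructed* frame,
        # so encoder and decoder stay in sync and errors do not accumulate.
        residual = frame - prediction
        coeffs = basis.T @ residual                      # spatial decorrelation
        symbols[t] = np.round(coeffs / step).astype(np.int64)
        prediction = prediction + basis @ (symbols[t] * step)
        recon[t] = prediction
    return symbols, recon
```

Because each frame depends only on the previous reconstruction, the latency of such a scheme is a single frame. With a square orthogonal basis, the per-coordinate error is bounded by `step / 2 * sqrt(D)`.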
Human Motion Capture Data Tailored Transform Coding
Human motion capture (mocap) is a widely used technique for digitalizing
human movements. With growing usage, compressing mocap data has received
increasing attention, since compact data size enables efficient storage and
transmission. Our analysis shows that mocap data have some unique
characteristics that distinguish themselves from images and videos. Therefore,
directly borrowing image or video compression techniques, such as discrete
cosine transform, does not work well. In this paper, we propose a novel
mocap-tailored transform coding algorithm that takes advantage of these
features. Our algorithm segments the input mocap sequences into clips, which
are represented in 2D matrices. Then it computes a set of data-dependent
orthogonal bases to transform the matrices to frequency domain, in which the
transform coefficients have significantly less dependency. Finally, the
compression is obtained by entropy coding of the quantized coefficients and the
bases. Our method has low computational cost and can be easily extended to
compress mocap databases. It also requires neither training nor complicated
parameter setting. Experimental results demonstrate that the proposed scheme
significantly outperforms state-of-the-art algorithms in terms of compression
performance and speed.
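As a rough illustration of the clip-based pipeline described in this abstract, the Python sketch below splits a sequence into clips, derives a data-dependent orthonormal basis for each clip, and quantizes the coefficients. Using the SVD to obtain the basis, the rank truncation, the uniform quantizer, and all function names are assumptions; the abstract does not say how the authors compute their bases, and entropy coding of coefficients and bases is omitted.

```python
import numpy as np

def split_into_clips(motion, clip_len):
    """Cut a (T, D) motion matrix into consecutive clips of at most clip_len frames."""
    return [motion[i:i + clip_len] for i in range(0, len(motion), clip_len)]

def compress_clip(clip, rank, step):
    """Transform a (frames, dofs) clip with its own orthonormal basis and quantize.

    The basis comes from the SVD of the clip (an assumed choice).
    """
    _, _, vt = np.linalg.svd(clip, full_matrices=False)
    basis = vt[:rank].T                      # (dofs, rank), orthonormal columns
    coeffs = clip @ basis                    # (frames, rank) transform coefficients
    quantized = np.round(coeffs / step).astype(np.int64)
    return quantized, basis

def decompress_clip(quantized, basis, step):
    """Dequantize the coefficients and map them back to pose space."""
    return (quantized * step) @ basis.T
```

A typical use would loop over `split_into_clips(motion, clip_len)` and store the quantized coefficients together with each clip's basis. Since the basis is computed from the clip itself, no separate training stage is needed in this sketch.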
Spatiotemporal Saliency Detection: State of Art
Saliency detection has become a prominent research subject in recent years, and many techniques have been proposed for it. This paper surveys saliency detection techniques published from 2000 to 2015 and covers almost every one of them. Each method is explained briefly, including its advantages and disadvantages. The techniques are compared in a table that lists author names, paper title, year, technique, algorithm, and challenges. Acceptance rates and accuracy levels are also compared.
Cognitive Robots for Social Interactions
One of my goals is to work towards developing Cognitive Robots, especially with regard to improving the functionalities that facilitate the interaction with human beings and their surrounding objects. Any cognitive system designated for serving human beings must be capable of processing the social signals and eventually enable efficient prediction and planning of appropriate responses.
My main focus during my PhD study is to bridge the gap between the motoric space and the visual space. The discovery of mirror neurons ([RC04]) shows that the visual perception of human motion (visual space) is directly associated with the motor control of the human body (motor space). This discovery poses a large number of challenges in different fields such as computer vision, robotics and neuroscience. One of the fundamental challenges is understanding the mapping between 2D visual space and 3D motoric control, and further developing building blocks (primitives) of human motion in the visual space as well as in the motor space.
First, I present my study on the visual-motoric mapping of human actions. This study aims at mapping human actions in 2D videos to a 3D skeletal representation. Second, I present an automatic algorithm to decompose motion capture (MoCap) sequences into synergies along with the times at which they are executed (or "activated") for each joint. Third, I propose to use Granger causality as a tool to study coordinated actions performed by at least two units. Recent scientific studies suggest that the above "action mirroring circuit" might be tuned to action coordination rather than single action mirroring. Fourth, I present the extraction of key poses in visual space. These key poses facilitate the further study of the "action mirroring circuit". I conclude the dissertation by describing the future of cognitive robotics study.
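To make the Granger causality idea concrete, here is a minimal Python sketch of the standard bivariate test: a signal x is said to Granger-cause y if past values of x improve an autoregressive prediction of y. The function names, the least-squares fit, and the fixed lag order are assumptions for illustration; the dissertation's actual formulation is not given in this abstract.

```python
import numpy as np

def lagged(series, lags):
    """Columns are series[t-1], ..., series[t-lags] for t = lags .. n-1."""
    n = len(series)
    return np.column_stack([series[lags - k:n - k] for k in range(1, lags + 1)])

def residual_sum_of_squares(design, target):
    """Least-squares fit of target on design; returns the squared residual norm."""
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(np.sum((target - design @ beta) ** 2))

def granger_f(x, y, lags):
    """F statistic for the hypothesis 'x Granger-causes y' (larger means stronger evidence)."""
    target = y[lags:]
    intercept = np.ones((len(target), 1))
    restricted = np.hstack([intercept, lagged(y, lags)])   # past of y only
    full = np.hstack([restricted, lagged(x, lags)])        # past of y and x
    rss_r = residual_sum_of_squares(restricted, target)
    rss_f = residual_sum_of_squares(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / lags) / (rss_f / dof)
```

For two coordinated units, one could compare `granger_f(a, b, lags)` with `granger_f(b, a, lags)` on their joint trajectories to see which direction of influence is better supported.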
HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting
We have recently seen tremendous progress in photo-real human modeling and
rendering. Yet, efficiently rendering realistic human performance and
integrating it into the rasterization pipeline remains challenging. In this
paper, we present HiFi4G, an explicit and compact Gaussian-based approach for
high-fidelity human performance rendering from dense footage. Our core
intuition is to marry the 3D Gaussian representation with non-rigid tracking,
achieving a compact and compression-friendly representation. We first propose a
dual-graph mechanism to obtain motion priors, with a coarse deformation graph
for effective initialization and a fine-grained Gaussian graph to enforce
subsequent constraints. Then, we utilize a 4D Gaussian optimization scheme with
adaptive spatial-temporal regularizers to effectively balance the non-rigid
prior and Gaussian updating. We also present a companion compression scheme
with residual compensation for immersive experiences on various platforms. It
achieves a substantial compression rate of approximately 25 times, with less
than 2MB of storage per frame. Extensive experiments demonstrate the
effectiveness of our approach, which significantly outperforms existing
approaches in terms of optimization speed, rendering quality, and storage
overhead.
The development of the quaternion wavelet transform
The purpose of this article is to review what has been written on what other authors have called quaternion wavelet transforms (QWTs): there is no consensus about what these should look like and what their properties should be. We briefly explain what real continuous and discrete wavelet transforms and multiresolution analysis are and why complex wavelet transforms were introduced; we then go on to detail published approaches to QWTs and to analyse them. We conclude with our own analysis of what it is that should define a QWT as being truly quaternionic and why all but a few of the “QWTs” we have described do not fit our definition.
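For readers unfamiliar with the real discrete wavelet transform that the article uses as its starting point, the Python sketch below shows a multilevel Haar transform, the simplest real wavelet. It is a generic textbook construction with hypothetical function names, not any of the quaternion transforms reviewed in the article.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)

def haar_step(signal):
    """One analysis level: pairwise scaled sums (approximation) and differences (detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """One synthesis level: exactly undoes haar_step."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * SQRT_HALF, (a - d) * SQRT_HALF])
    return out

def haar_dwt(signal, levels):
    """Multiresolution analysis: repeatedly split the approximation band."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def haar_idwt(approx, details):
    """Rebuild the signal from the coarsest approximation and all detail bands."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx
```

Each level halves the resolution of the approximation band, which is the multiresolution structure the article refers to. The transform is orthogonal, so it preserves signal energy and is perfectly invertible.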
Spatial and rotation invariant 3D gesture recognition based on sparse representation
Advances in motion tracking technology, especially for commodity hardware, still require robust 3D gesture recognition in order to fully exploit the benefits of natural user interfaces. In this paper, we introduce a novel 3D gesture recognition algorithm based on the sparse representation of 3D human motion. The sparse representation of human motion provides a set of features that can be used to efficiently classify gestures in real time. Compared to existing gesture recognition systems, the proposed sparse-representation approach enables full spatial and rotation invariance and provides high tolerance to noise. Moreover, the proposed classification scheme takes into account inter-user variability, which increases gesture classification accuracy in user-independent scenarios. We validated our approach with existing motion databases for gestural interaction and performed a user evaluation with naive subjects to show its robustness to arbitrarily defined gestures. The results showed that our classification scheme has high classification accuracy for user-independent scenarios, even with users who have different handedness. We believe that sparse representation of human motion will pave the way for a new generation of 3D gesture recognition systems that fully unlock the potential of natural user interfaces.
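The abstract does not describe the classifier itself. As a generic illustration of classifying with a sparse representation, the Python sketch below codes a feature vector over a dictionary of labelled atoms using orthogonal matching pursuit, then picks the class whose atoms reconstruct the vector best. The dictionary layout, the choice of orthogonal matching pursuit, and all function names are assumptions, and the paper's spatial and rotation invariance is not modelled here.

```python
import numpy as np

def omp(dictionary, signal, n_nonzero, tol=1e-10):
    """Greedy orthogonal matching pursuit; dictionary columns are unit-norm atoms."""
    signal = np.asarray(signal, dtype=float)
    residual = signal.copy()
    support = []
    coeffs = np.zeros(dictionary.shape[1])
    solution = None
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < tol:
            break                                  # signal already explained
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = 0.0            # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly, then update the residual.
        solution, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ solution
    if support:
        coeffs[support] = solution
    return coeffs

def classify(dictionary, labels, signal, n_nonzero=3):
    """Return the label whose atoms give the smallest reconstruction error."""
    signal = np.asarray(signal, dtype=float)
    coeffs = omp(dictionary, signal, n_nonzero)
    best_label, best_error = None, np.inf
    for label in np.unique(labels):
        mask = labels == label
        error = np.linalg.norm(signal - dictionary[:, mask] @ coeffs[mask])
        if error < best_error:
            best_label, best_error = label, error
    return best_label
```

In this sketch each column of `dictionary` would hold a feature vector from a training gesture, and `labels` would give its gesture class.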