245 research outputs found

Low-latency compression of mocap data using learned spatial decorrelation transform

Author: Chau Lap-Pui
He Ying
Hou Junhui
Magnenat-Thalmann Nadia
Publication date: 01/01/2016

Due to the growing need for human motion capture (mocap) in movies, video games, sports, and other fields, it is highly desirable to compress mocap data for efficient storage and transmission. This paper presents two efficient frameworks for compressing human mocap data with low latency. The first framework processes the data frame by frame, so it is ideal for mocap data streaming and time-critical applications. The second is clip-based and provides a flexible tradeoff between latency and compression performance. Since mocap data exhibits some unique spatial characteristics, we propose a very effective transform, namely the learned orthogonal transform (LOT), for reducing the spatial redundancy. The LOT problem is formulated as minimizing the squared error regularized by orthogonality and sparsity, and is solved via alternating iteration. We also adopt predictive coding and a temporal DCT for temporal decorrelation in the frame-based and clip-based frameworks, respectively. Experimental results show that the proposed frameworks can produce higher compression performance at lower computational cost and latency than the state-of-the-art methods. Comment: 15 pages, 9 figures

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)
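
The abstract of the record above describes learning an orthogonal transform by alternating iteration. The following is only a minimal sketch of that general idea, not the authors' algorithm: it assumes a hard orthogonality constraint and an L1 sparsity penalty (the paper regularizes orthogonality instead), and the names `learn_orthogonal_transform`, `lam` and `n_iter` are hypothetical. It alternates a soft-thresholding step for the coefficients with an orthogonal Procrustes step for the basis.

```python
import numpy as np

def soft_threshold(A, t):
    """Entrywise shrinkage: the exact minimizer of (a - c)^2 + 2*t*|c|."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def objective(X, B, C, lam):
    """Squared reconstruction error plus L1 sparsity penalty."""
    return np.sum((X - B @ C) ** 2) + lam * np.sum(np.abs(C))

def learn_orthogonal_transform(X, lam=0.1, n_iter=30):
    """Alternating minimization of ||X - B C||_F^2 + lam*||C||_1 with B orthogonal.

    X holds one pose vector per column (assumed layout).
    """
    d = X.shape[0]
    B = np.eye(d)  # start from the identity basis
    for _ in range(n_iter):
        # Coefficient step: with B orthogonal, the optimum is a soft threshold.
        C = soft_threshold(B.T @ X, lam / 2.0)
        # Basis step: orthogonal Procrustes solution via the SVD of X C^T.
        U, _, Vt = np.linalg.svd(X @ C.T)
        B = U @ Vt
    return B, soft_threshold(B.T @ X, lam / 2.0)
```

Because each step solves its sub-problem exactly, the objective cannot increase from one iteration to the next. Predictive coding, the temporal DCT, quantization and entropy coding from the abstract are outside this sketch.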

Human Motion Capture Data Tailored Transform Coding

Author: Chau Lap-Pui
He Ying
Hou Junhui
Magnenat-Thalmann Nadia
Publication date: 17/10/2014

Human motion capture (mocap) is a widely used technique for digitizing human movements. With growing usage, compressing mocap data has received increasing attention, since compact data size enables efficient storage and transmission. Our analysis shows that mocap data have some unique characteristics that distinguish them from images and videos. Therefore, directly borrowing image or video compression techniques, such as the discrete cosine transform, does not work well. In this paper, we propose a novel mocap-tailored transform coding algorithm that takes advantage of these features. Our algorithm segments the input mocap sequences into clips, which are represented as 2D matrices. Then it computes a set of data-dependent orthogonal bases to transform the matrices to the frequency domain, in which the transform coefficients have significantly less dependency. Finally, compression is obtained by entropy coding of the quantized coefficients and the bases. Our method has low computational cost and can be easily extended to compress mocap databases. It also requires neither training nor complicated parameter setting. Experimental results demonstrate that the proposed scheme significantly outperforms state-of-the-art algorithms in terms of compression performance and speed.

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)
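
The clip-based pipeline in the abstract of the record above (clip matrix, data-dependent orthogonal basis, quantized coefficients) can be illustrated roughly as follows. This is a sketch under stated assumptions: the abstract does not say how the bases are computed, so a plain SVD stands in for them here, and `encode_clip`, `decode_clip` and `step` are hypothetical names. Entropy coding is omitted.

```python
import numpy as np

def encode_clip(clip, step=0.01):
    """Transform-code one clip (rows = frames, columns = degrees of freedom)."""
    mean = clip.mean(axis=0)
    centered = clip - mean
    # Rows of Vt form an orthonormal, data-dependent basis for this clip.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    coeffs = centered @ Vt.T
    # Uniform scalar quantization; the integers would go to an entropy coder.
    q = np.round(coeffs / step).astype(np.int64)
    return q, Vt, mean

def decode_clip(q, Vt, mean, step=0.01):
    """Invert the quantization and the orthogonal transform."""
    return (q * step) @ Vt + mean
```

Since the basis is orthonormal, each reconstructed row differs from the original by at most `step / 2 * sqrt(n_dof)` in Euclidean norm, so `step` directly controls the distortion. A real codec would also have to quantize and transmit `Vt` and `mean`, which the abstract mentions for the bases.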

Spatiotemporal Saliency Detection: State of Art

Author: Sultana Kadri, Pooja, Manju Bala
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/04/2016

Saliency detection has become a prominent research subject in recent years, and many techniques have been proposed for it. This paper explains a number of saliency detection techniques published from 2000 to 2015, covering almost every technique from that period. All the methods are explained briefly, including their advantages and disadvantages. The techniques are compared with the help of a table that lists the authors' names, paper name, year, techniques, algorithms, and challenges. Acceptance rates and accuracy levels are also compared.

International Journal on Recent and Innovation Trends in Computing and Communication

Cognitive Robots for Social Interactions

Author: Li Yi
Publication date: 01/01/2010

One of my goals is to work towards developing Cognitive Robots, especially with regard to improving the functionalities that facilitate interaction with human beings and their surrounding objects. Any cognitive system designed to serve human beings must be capable of processing social signals and eventually enable efficient prediction and planning of appropriate responses. My main focus during my PhD study is to bridge the gap between the motoric space and the visual space. The discovery of mirror neurons ([RC04]) shows that the visual perception of human motion (visual space) is directly associated with the motor control of the human body (motor space). This discovery poses a large number of challenges in different fields such as computer vision, robotics and neuroscience. One of the fundamental challenges is understanding the mapping between 2D visual space and 3D motoric control, and further developing building blocks (primitives) of human motion in the visual space as well as in the motor space. First, I present my study on the visual-motoric mapping of human actions. This study aims at mapping human actions in 2D videos to a 3D skeletal representation. Second, I present an automatic algorithm to decompose motion capture (MoCap) sequences into synergies along with the times at which they are executed (or "activated") for each joint. Third, I propose to use Granger causality as a tool to study the coordinated actions performed by at least two units. Recent scientific studies suggest that the above "action mirroring circuit" might be tuned to action coordination rather than single action mirroring. Fourth, I present the extraction of key poses in visual space. These key poses facilitate the further study of the "action mirroring circuit". I conclude the dissertation by describing the future of cognitive robotics study.

Digital Repository at the University of Maryland
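
The abstract of the record above mentions Granger causality as a tool for studying coordinated actions. As a generic illustration only (not the dissertation's method), the sketch below computes the textbook F statistic for "y Granger-causes x" by comparing two least-squares autoregressive fits. The function names and the lag order `p` are assumptions made for this example.

```python
import numpy as np

def lagged(series, p):
    """Design matrix whose columns are series[t-1], ..., series[t-p] for t >= p."""
    n = len(series)
    return np.column_stack([series[p - k : n - k] for k in range(1, p + 1)])

def rss(A, b):
    """Residual sum of squares of the least-squares fit A @ w ~ b."""
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    r = b - A @ w
    return float(r @ r)

def granger_f_stat(x, y, p=2):
    """F statistic testing whether past values of y help predict x."""
    target = x[p:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(x, p)])   # x explained by its own past
    full = np.hstack([restricted, lagged(y, p)])   # plus the past of y
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / p) / (rss_f / dof)
```

A large value suggests that the past of `y` improves the prediction of `x`. For two joint trajectories, calling the function in both directions gives a rough indication of which unit leads the other.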

HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting

Author: Hong Yu
Jiang Yuheng
Shen Zhehao
Su Zhuo
Wang Penghao
Xu Lan
Yu Jingyi
Zhang Yingliang
Publication date: 07/12/2023

We have recently seen tremendous progress in photo-real human modeling and rendering. Yet, efficiently rendering realistic human performance and integrating it into the rasterization pipeline remains challenging. In this paper, we present HiFi4G, an explicit and compact Gaussian-based approach for high-fidelity human performance rendering from dense footage. Our core intuition is to marry the 3D Gaussian representation with non-rigid tracking, achieving a compact and compression-friendly representation. We first propose a dual-graph mechanism to obtain motion priors, with a coarse deformation graph for effective initialization and a fine-grained Gaussian graph to enforce subsequent constraints. Then, we utilize a 4D Gaussian optimization scheme with adaptive spatial-temporal regularizers to effectively balance the non-rigid prior and Gaussian updating. We also present a companion compression scheme with residual compensation for immersive experiences on various platforms. It achieves a substantial compression rate of approximately 25 times, with less than 2MB of storage per frame. Extensive experiments demonstrate the effectiveness of our approach, which significantly outperforms existing approaches in terms of optimization speed, rendering quality, and storage overhead

arXiv.org e-Print Archive

The development of the quaternion wavelet transform

Author: Bahri
Bayro-Corrochano
Brackx
Chan
Chui
Cohen
Daubechies
Ell
Felsberg
Gabor
Gai
Geng
Ginzberg
Goupillaud
Grossmann
Guo
He
Hogan
Kadiri
Katunin
Kovačević
Kumar
Lei
Li
Lilly
Liu
Mallat
Olhede
P. Fletcher
Pinsky
Priyadharshini
Ricker
S.J. Sangwine
Sangwine
Sathyabama
Selesnick
Soulard
Taubman
Traoré
Unser
Ville
Wang
Xia
Yin
Zhao
Zhou
Zweig
Publication venue: 'Elsevier BV'
Publication date: 29/12/2016

The purpose of this article is to review what has been written on what other authors have called quaternion wavelet transforms (QWTs): there is no consensus about what these should look like and what their properties should be. We briefly explain what real continuous and discrete wavelet transforms and multiresolution analysis are and why complex wavelet transforms were introduced; we then go on to detail published approaches to QWTs and to analyse them. We conclude with our own analysis of what it is that should define a QWT as being truly quaternionic and why all but a few of the “QWTs” we have described do not fit our definition

University of Essex Research Repository

Crossref

Spatial and rotation invariant 3D gesture recognition based on sparse representation

Author: Argelaguet Sanz Ferran
Ducoffe Mélanie
Gribonval Rémi
Lécuyer Anatole
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Advances in motion tracking technology, especially for commodity hardware, still require robust 3D gesture recognition in order to fully exploit the benefits of natural user interfaces. In this paper, we introduce a novel 3D gesture recognition algorithm based on the sparse representation of 3D human motion. The sparse representation of human motion provides a set of features that can be used to efficiently classify gestures in real time. Compared to existing gesture recognition systems, the proposed sparse-representation approach enables full spatial and rotation invariance and provides high tolerance to noise. Moreover, the proposed classification scheme takes into account inter-user variability, which increases gesture classification accuracy in user-independent scenarios. We validated our approach with existing motion databases for gestural interaction and performed a user evaluation with naive subjects to show its robustness to arbitrarily defined gestures. The results showed that our classification scheme has high classification accuracy for user-independent scenarios, even with users who have different handedness. We believe that sparse representation of human motion will pave the way for a new generation of 3D gesture recognition systems and fully open up the potential of natural user interfaces.

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1
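
The abstract of the record above classifies gestures from a sparse representation of motion. The sketch below shows one generic way such a classifier can work, not the paper's algorithm: it assumes a dictionary `D` of unit-norm gesture templates (one per column) with a class label per column, codes the query with orthogonal matching pursuit, and picks the class whose atoms leave the smallest residual. All names are hypothetical, and the spatial and rotation invariance described in the abstract is not modeled.

```python
import numpy as np

def omp(D, x, k):
    """Greedy k-sparse code of x over the columns of D (orthogonal matching pursuit)."""
    residual = x.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef

def classify(D, labels, x, k=2):
    """Return the label whose atoms best reconstruct x from its sparse code."""
    coef = omp(D, x, k)
    best_label, best_err = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        err = np.linalg.norm(x - D[:, mask] @ coef[mask])
        if err < best_err:
            best_label, best_err = int(c), err
    return best_label
```

In practice each column of `D` would be a feature vector extracted from a recorded gesture, and `k` would be tuned on held-out data.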