An Immersive Telepresence System using RGB-D Sensors and Head Mounted Display
We present a tele-immersive system that enables people to interact with each
other in a virtual world using body gestures in addition to verbal
communication. Beyond the obvious applications, including general online
conversations and gaming, we hypothesize that our proposed system would be
particularly beneficial to education by offering rich visual contents and
interactivity. One distinct feature is the integration of egocentric pose
recognition that allows participants to use their gestures to demonstrate and
manipulate virtual objects simultaneously. This functionality enables the
instructor to effectively and efficiently explain and illustrate complex
concepts or sophisticated problems in an intuitive manner. The highly
interactive and flexible environment can capture and sustain more student
attention than the traditional classroom setting and, thus, delivers a
compelling experience to the students. Our main focus here is to investigate
possible solutions for the system design and implementation and devise
strategies for fast, efficient computation suitable for visual data processing
and network transmission. We describe the technique and experiments in detail
and provide quantitative performance results, demonstrating our system can be
run comfortably and reliably for different application scenarios. Our
preliminary results are promising and demonstrate the potential for more
compelling directions in cyberlearning. Comment: IEEE International Symposium on Multimedia 201
Study of Subjective and Objective Quality Evaluation of 3D Point Cloud Data by the JPEG Committee
The SC29/WG1 (JPEG) Committee within ISO/IEC is currently working on developing standards for the storage, compression and transmission of 3D point cloud information. To support the creation of these standards, the committee has created a database of 3D point clouds representing various quality levels and use cases and examined a range of 2D and 3D objective quality measures. The examined quality measures are correlated with subjective judgments for a number of compression levels. In this paper we describe the database created, the tests performed and key observations on the problems of 3D point cloud quality assessment.
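As a rough illustration of what correlating an objective quality measure with subjective judgments can look like, the sketch below computes Pearson and Spearman correlation in plain Python. The abstract does not say which correlation statistics the committee used, so both are assumptions here, and the score lists are made-up placeholders rather than data from the study.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical placeholder values, NOT taken from the JPEG database:
objective_scores = [28.1, 31.4, 34.0, 36.2, 39.5]   # e.g. one metric per compression level
subjective_scores = [1.8, 2.6, 3.1, 4.0, 4.4]       # e.g. mean opinion scores

linear_agreement = pearson(objective_scores, subjective_scores)
rank_agreement = spearman(objective_scores, subjective_scores)
```

A high rank correlation with a lower linear correlation would suggest the metric orders the stimuli correctly but is not linearly related to perceived quality.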
Image compression based on 2D Discrete Fourier Transform and matrix minimization algorithm
In the present era of the internet and multimedia, image compression techniques are essential to improve image and video performance in terms of storage space, network bandwidth usage, and secure transmission. A number of image compression methods are available with largely differing compression ratios and coding complexity. In this paper we propose a new method for compressing high-resolution images based on the Discrete Fourier Transform (DFT) and Matrix Minimization (MM) algorithm. The method consists of transforming an image by DFT yielding the real and imaginary components. A quantization process is applied to both components independently aiming at increasing the number of high frequency coefficients. The real component matrix is separated into Low Frequency Coefficients (LFC) and High Frequency Coefficients (HFC). Finally, the MM algorithm followed by arithmetic coding is applied to the LFC and HFC matrices. The decompression algorithm decodes the data in reverse order. A sequential search algorithm is used to decode the data from the MM matrix. Thereafter, all decoded LFC and HFC values are combined into one matrix followed by the inverse DFT. Results demonstrate that the proposed method yields high compression ratios over 98% for structured light images with good image reconstruction. Moreover, it is shown that the proposed method compares favorably with the JPEG technique based on compression ratios and image quality.
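The transform-and-quantize stage described in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the uniform quantization step `STEP`, the 4x4 test image, and all function names are hypothetical, and the paper's LFC/HFC split, Matrix Minimization step, and arithmetic coding are not reproduced.

```python
import cmath

def dft2(block):
    """Naive 2D DFT of a square list of lists (O(n^4), fine for tiny demos)."""
    n = len(block)
    return [[sum(block[x][y] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

def idft2(coeffs):
    """Inverse 2D DFT; returns the real part, since the input image was real."""
    n = len(coeffs)
    return [[sum(coeffs[u][v] * cmath.exp(2j * cmath.pi * (u * x + v * y) / n)
                 for u in range(n) for v in range(n)).real / (n * n)
             for y in range(n)] for x in range(n)]

def quantize(coeffs, step):
    """Quantize real and imaginary parts independently (uniform step is an assumption)."""
    return [[(round(c.real / step), round(c.imag / step)) for c in row] for row in coeffs]

def dequantize(quantized, step):
    """Map integer (real, imag) pairs back to complex coefficients."""
    return [[complex(re * step, im * step) for re, im in row] for row in quantized]

# Hypothetical 4x4 grayscale block, used only for illustration.
image = [[10, 12, 11, 10],
         [12, 40, 42, 11],
         [11, 41, 43, 12],
         [10, 11, 12, 10]]
STEP = 4.0

quantized = quantize(dft2(image), STEP)
restored = idft2(dequantize(quantized, STEP))

# Zeroed coefficients are what a later entropy coder could exploit.
zero_count = sum(1 for row in quantized for re, im in row if re == 0 and im == 0)
max_error = max(abs(image[x][y] - restored[x][y]) for x in range(4) for y in range(4))
```

A larger `STEP` zeroes more coefficients at the cost of a larger reconstruction error, which is the usual rate versus quality trade-off in transform coding.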
Deep Learning-based Compressed Domain Multimedia for Man and Machine: A Taxonomy and Application to Point Cloud Classification
In the current golden age of multimedia, human visualization is no longer the
single main target, with the final consumer often being a machine which
performs some processing or computer vision tasks. In both cases, deep learning
plays a fundamental role in extracting features from the multimedia
representation data, usually producing a compressed representation referred to
as latent representation. The increasing development and adoption of deep
learning-based solutions in a wide area of multimedia applications have opened
an exciting new vision where a common compressed multimedia representation is
used for both man and machine. The main benefits of this vision are two-fold:
i) improved performance for the computer vision tasks, since the effects of
coding artifacts are mitigated; and ii) reduced computational complexity, since
prior decoding is not required. This paper proposes the first taxonomy for
designing compressed domain computer vision solutions driven by the
architecture and weights compatibility with an available spatio-temporal
computer vision processor. The potential of the proposed taxonomy is
demonstrated for the specific case of point cloud classification by designing
novel compressed domain processors using the JPEG Pleno Point Cloud Coding
standard under development and adaptations of the PointGrid classifier.
Experimental results show that the designed compressed domain point cloud
classification solutions can significantly outperform the spatial-temporal
domain classification benchmarks when applied to the decompressed data,
containing coding artifacts, and even surpass their performance when applied to
the original uncompressed data.
Point cloud geometry compression using neural implicit representations
In recent years, the increasing prominence of 3D point clouds in various applications has led to an escalating need for efficient storage and transmission methods. The sheer size of these point cloud datasets presents challenges in rendering, transmission, and general usability. This thesis introduces a novel approach to point cloud geometry compression leveraging neural implicit representations, specifically through the use of a DiGS network model. By training this model on a single point cloud, we achieve a compact neural representation of its geometry. Notably, this representation allows for the reconstruction of the point cloud with an arbitrary resolution. After training a reconstructing network, dynamic quantization is applied on the trained weights, significantly reducing its overall bitrate without strongly compromising the quality of the reconstructed point cloud. A dequantization is then used to rebuild a high-fidelity representation of the original point cloud. Our experimental results demonstrate the efficacy of this approach in terms of compression ratios and reconstruction quality, assessed using PSNR relative to the bitrate. This research provides a promising direction for efficient point cloud geometry storage and transmission, addressing some of the growing demands of the 3D data era.
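The quantize, dequantize, and PSNR steps mentioned in this abstract can be illustrated with a minimal sketch. It uses plain min-max affine quantization as a stand-in, which is an assumption and not necessarily the thesis's dynamic quantization scheme; the function names, the 8-bit width, the sample weights, and the `peak` value are all hypothetical.

```python
import math

def quantize_weights(weights, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] using min-max scaling."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale

def dequantize_weights(codes, lo, scale):
    """Recover approximate float weights from integer codes."""
    return [lo + c * scale for c in codes]

def psnr(reference, reconstructed, peak):
    """Peak signal-to-noise ratio in dB for two equal-length value lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical weights, used only to exercise the functions.
weights = [-0.50, -0.12, 0.0, 0.07, 0.31, 0.50]
codes, lo, scale = quantize_weights(weights, bits=8)
restored = dequantize_weights(codes, lo, scale)
quality_db = psnr(weights, restored, peak=1.0)
```

Here PSNR is computed directly on the weights to keep the example short; in the thesis it is reported for the reconstructed point cloud relative to the bitrate, which would require decoding geometry from the dequantized network.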