180 research outputs found
Visual and Geometric Data Compression for Immersive Technologies
The contributions of this thesis are new compression algorithms for light field images and point cloud geometry. Light field imaging has attracted wide attention over the past decade, partly due to the emergence of relatively low-cost handheld light field cameras designed for commercial purposes, whereas point clouds are used more and more frequently in immersive technologies, replacing other forms of 3D representation. We obtain successful coding performance by combining conventional image processing methods, entropy coding, learning-based disparity estimation and optimization of neural networks for context probability modeling.
On the light field coding side, we develop a lossless light field coding method which uses learning-based disparity estimation to predict any view in a light field from a set of reference views. On the point cloud geometry compression side, we develop four different algorithms. The first two follow the so-called bounding volumes approach, which initially represents a part of the point cloud in two depth maps; the remaining points of the cloud are contained in a bounding volume that can be derived using only the two depth maps, which are losslessly transmitted. One of these two algorithms is a lossy coder that reconstructs some of the remaining points in several steps involving conventional image processing and image coding techniques. The other is a lossless coder which applies a novel context arithmetic coding approach involving gradual expansion of the reconstructed point cloud into neighboring voxels. The last two of the proposed point cloud compression algorithms use neural networks for context probability modeling when coding the octree representation of point clouds with arithmetic coding. One of these two algorithms is a learning-based intra-frame coder which requires an initial training stage on a set of training point clouds. The last algorithm presented is an inter-frame (sequence) encoder which incorporates the neural network training into the encoding stage: for each sequence of point clouds, a specific neural network model is optimized and transmitted as a header in the bitstream.
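As a rough illustration of the octree representation mentioned in the abstract above, the sketch below turns integer voxel coordinates into a breadth-first sequence of 8-bit occupancy codes and decodes them back. It is not the thesis's method: the neural context model and the arithmetic coder are omitted, and the function names, the child-index bit order and the cube size of `2**depth` are assumptions made for this example.

```python
def octree_occupancy(points, depth):
    """Breadth-first 8-bit occupancy codes for voxels in [0, 2**depth)^3."""
    nodes = [((0, 0, 0), sorted(set(points)))]  # (node origin, points inside)
    codes = []
    for level in range(depth):
        half = 1 << (depth - level - 1)  # side length of each child cube
        next_nodes = []
        for (ox, oy, oz), pts in nodes:
            children = {}
            for (x, y, z) in pts:
                # Assumed bit order: x -> bit 2, y -> bit 1, z -> bit 0.
                idx = ((x - ox >= half) << 2) | ((y - oy >= half) << 1) | (z - oz >= half)
                children.setdefault(idx, []).append((x, y, z))
            code = 0
            for idx in sorted(children):
                code |= 1 << idx
                origin = (ox + half * ((idx >> 2) & 1),
                          oy + half * ((idx >> 1) & 1),
                          oz + half * (idx & 1))
                next_nodes.append((origin, children[idx]))
            codes.append(code)  # one symbol per occupied node
        nodes = next_nodes
    return codes


def decode_occupancy(codes, depth):
    """Rebuild the voxel list from the occupancy codes produced above."""
    nodes = [(0, 0, 0)]
    symbols = iter(codes)
    for level in range(depth):
        half = 1 << (depth - level - 1)
        next_nodes = []
        for (ox, oy, oz) in nodes:
            code = next(symbols)
            for idx in range(8):
                if code & (1 << idx):
                    next_nodes.append((ox + half * ((idx >> 2) & 1),
                                       oy + half * ((idx >> 1) & 1),
                                       oz + half * (idx & 1)))
        nodes = next_nodes
    return sorted(nodes)


pts = [(0, 0, 0), (3, 3, 3), (1, 0, 2)]
codes = octree_occupancy(pts, depth=2)
restored = decode_occupancy(codes, depth=2)
```

In a context-based coder, each occupancy symbol would be passed to an arithmetic coder together with a probability estimate conditioned on already-coded neighbors; here the symbols are simply collected in a list.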
Network streaming and compression for mixed reality tele-immersion
Bulterman, D.C.A. [Promotor]; Cesar, P.S. [Copromotor]
Geometric Prior Based Deep Human Point Cloud Geometry Compression
The emergence of digital avatars has driven an exponential increase in the
demand for human point clouds with realistic and intricate details. The
compression of such data becomes challenging with overwhelming data amounts
comprising millions of points. Herein, we leverage the human geometric prior in
geometry redundancy removal of point clouds, greatly promoting the compression
performance. More specifically, the prior provides topological constraints as
geometry initialization, allowing adaptive adjustments with a compact parameter
set that could be represented with only a few bits. Therefore, we can envisage
high-resolution human point clouds as a combination of geometric priors and
structural deviations. The priors could first be derived with an aligned point
cloud, and subsequently the difference of features is compressed into a compact
latent code. The proposed framework can operate in a plug-and-play fashion with
existing learning based point cloud compression methods. Extensive experimental
results show that our approach significantly improves the compression
performance without deteriorating the quality, demonstrating its promise in a
variety of applications.
Point cloud data compression
The rapid growth in the popularity of Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR) experiences has resulted in an exponential surge of three-dimensional data. Point clouds have emerged as a commonly employed representation for capturing and visualizing three-dimensional data in these environments. Consequently, there has been a substantial research effort dedicated to developing efficient compression algorithms for point cloud data. This Master's thesis investigates the current state-of-the-art lossless point cloud geometry compression techniques, explores some of these techniques in more detail, and then proposes improvements and extensions to them, along with directions for future work on this topic.
A practical comparison between two powerful PCC codec’s
Recent advances in the consumption of 3D content create the need for efficient ways to visualize and transmit it. As a result, methods for obtaining such content have been evolving, leading to new forms of representation, namely point clouds and light fields. A point cloud is a set of points, each with associated Cartesian coordinates (x, y, z) and, optionally, further attributes (color, material, texture, etc.). This kind of representation changes the way 3D content is consumed and has a wide range of applications, from video games and virtual reality to medicine. However, because this type of data carries so much information, it is data-heavy, making storage and transmission a daunting task. To resolve this issue, MPEG created a point cloud coding standardization project, giving birth to V-PCC (Video-based Point Cloud Coding) and G-PCC (Geometry-based Point Cloud Coding) for static content. This dissertation first gives a general analysis of point clouds, spanning from their possible uses to their acquisition. Secondly, point cloud codecs are studied, namely V-PCC and G-PCC from MPEG. Then, a state-of-the-art study of quality evaluation, both subjective and objective, is performed. Finally, the activities of JPEG Pleno Point Cloud, in which an active collaboration took place, are reported, with comparative results for the two codecs and the metrics used.
Aggressive saliency-aware point cloud compression
The increasing demand for accurate representations of 3D scenes, combined
with immersive technologies, has made point clouds widely popular.
However, high-quality point clouds require a large amount of data, making
compression methods imperative. In this paper, we present a novel,
geometry-based, end-to-end compression scheme, that combines information on the
geometrical features of the point cloud and the user's position, achieving
remarkable results for aggressive compression schemes demanding very small bit
rates. After separating visible and non-visible points, four saliency maps are
calculated, utilizing the point cloud's geometry and distance from the user,
the visibility information, and the user's focus point. A combination of these
maps yields a final saliency map that indicates the overall significance of
each point, so that different regions are quantized with a different number
of bits during the encoding process. The decoder reconstructs the point cloud
making use of delta coordinates and solving a sparse linear system. Evaluation
studies and comparisons with the geometry-based point cloud compression (G-PCC)
algorithm by the Moving Picture Experts Group (MPEG), carried out for a variety
of point clouds, demonstrate that the proposed method achieves significantly
better results for small bit rates.
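The idea of spending more bits on salient regions can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's scheme: the weighted-average fusion of the maps, the linear saliency-to-bits mapping, the `min_bits`/`max_bits` defaults and all function names are hypothetical, and plain coordinates in [0, 1] are quantized here instead of the delta coordinates the abstract describes.

```python
def combine_saliency(maps, weights=None):
    """Fuse per-point saliency maps (values in [0, 1]) by a weighted average."""
    n = len(maps[0])
    weights = weights or [1.0 / len(maps)] * len(maps)  # assumed: equal weights
    return [sum(w * m[i] for w, m in zip(weights, maps)) for i in range(n)]


def bits_for_saliency(s, min_bits=2, max_bits=10):
    """Map a saliency value to a bit budget (assumed linear mapping)."""
    s = min(max(s, 0.0), 1.0)
    return min_bits + round(s * (max_bits - min_bits))


def quantize(value, bits, lo=0.0, hi=1.0):
    """Uniformly quantize a value in [lo, hi] to 2**bits levels, then dequantize."""
    levels = (1 << bits) - 1
    q = round((value - lo) / (hi - lo) * levels)
    return lo + q / levels * (hi - lo)


# Two hypothetical saliency maps over two points.
saliency = combine_saliency([[0.0, 1.0], [1.0, 1.0]])
budgets = [bits_for_saliency(s) for s in saliency]
coarse = quantize(0.3, bits_for_saliency(0.0))
fine = quantize(0.3, bits_for_saliency(1.0))
```

With these assumptions, a point of low saliency is reconstructed on a coarse grid while a highly salient point keeps close to its original coordinate, which is the trade-off that makes very low bit rates tolerable.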