Point cloud data compression
The rapid growth in the popularity of Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR) experiences has resulted in an exponential surge of three-dimensional data. Point clouds have emerged as a common representation for capturing and visualizing three-dimensional data in these environments. Consequently, substantial research effort has been dedicated to developing efficient compression algorithms for point cloud data. This Master's thesis investigates the current state-of-the-art lossless point cloud geometry compression techniques, explores some of them in detail, proposes improvements and extensions to enhance them, and provides directions for future work on this topic.
Visual and Geometric Data Compression for Immersive Technologies
The contributions of this thesis are new compression algorithms for light field images and point cloud geometry. Light field imaging has attracted wide attention in the recent decade, partly due to the emergence of relatively low-cost handheld light field cameras designed for commercial purposes, while point clouds are used increasingly often in immersive technologies, replacing other forms of 3D representation. We obtain successful coding performance by combining conventional image processing methods, entropy coding, learning-based disparity estimation, and optimization of neural networks for context probability modeling.
On the light field coding side, we develop a lossless light field coding method which uses learning-based disparity estimation to predict any view in a light field from a set of reference views. On the point cloud geometry compression side, we develop four different algorithms. The first two follow the so-called bounding volumes approach, which initially represents part of the point cloud in two depth maps; the remaining points of the cloud are contained in a bounding volume that can be derived using only the two losslessly transmitted depth maps. The first of these two is a lossy coder that reconstructs some of the remaining points in several steps involving conventional image processing and image coding techniques. The second is a lossless coder which applies a novel context arithmetic coding approach involving gradual expansion of the reconstructed point cloud into neighboring voxels. The last two of the proposed point cloud compression algorithms use neural networks for context probability modeling when coding the octree representation of point clouds with arithmetic coding. One of these is a learning-based intra-frame coder which requires an initial training stage on a set of training point clouds. The last algorithm is an inter-frame (sequence) encoder which incorporates neural network training into the encoding stage; for each sequence of point clouds, a specific neural network model is optimized and transmitted as a header in the bitstream.
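The octree-based coders described above serialize geometry as one child-occupancy byte per internal node, which an arithmetic coder then compresses under a context probability model. A minimal sketch of producing such an occupancy stream, as our own illustration rather than the thesis's implementation:

```python
import numpy as np

def octree_occupancy_stream(points, depth):
    """Breadth-first child-occupancy bytes of an octree over a 2**depth grid.

    points: (N, 3) integer voxel coordinates in [0, 2**depth).
    Returns one occupancy byte per internal node; an arithmetic coder would
    compress these bytes using a context probability model.
    """
    pts = np.unique(np.asarray(points, dtype=np.int64), axis=0)
    stream = []
    nodes = [pts]  # current level: one point subset per occupied node
    for level in range(depth):
        shift = depth - 1 - level          # bit selecting the child octant
        next_nodes = []
        for node_pts in nodes:
            octant = (((node_pts[:, 0] >> shift) & 1) << 2 |
                      ((node_pts[:, 1] >> shift) & 1) << 1 |
                      ((node_pts[:, 2] >> shift) & 1))
            byte = 0
            for c in range(8):
                mask = octant == c
                if mask.any():
                    byte |= 1 << c                 # mark child c occupied
                    next_nodes.append(node_pts[mask])
            stream.append(byte)
        nodes = next_nodes
    return stream
```

The context models in the thesis predict the probability of each such byte (or of its individual bits) from previously decoded neighbors; the sketch covers only the serialization itself.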
Geometric Prior Based Deep Human Point Cloud Geometry Compression
The emergence of digital avatars has driven an exponential increase in demand for human point clouds with realistic and intricate details. Compressing such data is challenging, with overwhelming amounts of data comprising millions of points. Herein, we leverage the human geometric prior to remove geometry redundancy from point clouds, greatly improving compression performance. More specifically, the prior provides topological constraints as a geometry initialization, allowing adaptive adjustments with a compact parameter set that can be represented with only a few bits. We can therefore envisage high-resolution human point clouds as a combination of geometric priors and structural deviations. The priors are first derived from an aligned point cloud, and the difference of features is subsequently compressed into a compact latent code. The proposed framework can operate in a plug-and-play fashion with existing learning-based point cloud compression methods. Extensive experimental results show that our approach significantly improves compression performance without deteriorating quality, demonstrating its promise in a variety of applications.
Compression of dynamic polygonal meshes with constant and variable connectivity
This work was supported by the projects 20-02154S and 17-07690S of the Czech Science Foundation and SGS-2019-016 of the Czech Ministry of Education.
Polygonal mesh sequences with variable connectivity are incredibly versatile dynamic surface representations, as they allow a surface to change topology or details to suddenly appear or disappear. This, however, comes at the cost of a large storage size. Current compression methods exploit the temporal coherence of general data inefficiently because the correspondences between two subsequent frames might not be bijective. We study the current state of the art, including the special class of mesh sequences whose connectivity is static. We also cover the state of the art of the related field of dynamic point cloud sequences. Further, we point out parts of the compression pipeline with potential for improvement. We present the progress we have already made in designing a temporal model capturing the temporal coherence of a sequence, and point out directions for future research.
Refining The Bounding Volumes for Lossless Compression of Voxelized Point Clouds Geometry
This paper describes a novel lossless compression method for point cloud geometry, building on a recent lossy compression method that aimed at reconstructing only the bounding volume of a point cloud. The proposed scheme starts by partially reconstructing the geometry from the two depth maps associated with a single projection direction. The partial reconstruction obtained from the depth maps is completed to a full reconstruction of the point cloud by sweeping section by section along one direction and encoding the points that were not contained in the two depth maps. The main ingredient is a list-based encoding of the inner points (situated inside the feasible regions) by a novel arithmetic three-dimensional context coding procedure that efficiently exploits rotational invariances present in the input data. State-of-the-art bits-per-voxel results are obtained on benchmark datasets.
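The bounding volume above is determined by the two transmitted depth maps: along each projection ray, every voxel between the near and far depth values is feasible. A minimal sketch of that derivation, under our own simplified conventions (a depth of -1 marks an empty pixel), not the paper's coder:

```python
import numpy as np

def feasible_region(depth_near, depth_far):
    """Voxels lying between two depth maps taken along one projection axis.

    depth_near, depth_far: (H, W) integer depth maps for the same projection
    direction, with depth_near <= depth_far where occupied and -1 where the
    ray hits no surface. Returns the set of (row, col, depth) voxels in the
    bounding volume, which a lossless coder would then classify as occupied
    or empty via context arithmetic coding.
    """
    voxels = set()
    H, W = depth_near.shape
    for r in range(H):
        for c in range(W):
            if depth_near[r, c] < 0:
                continue  # no surface along this ray
            for d in range(depth_near[r, c], depth_far[r, c] + 1):
                voxels.add((r, c, d))
    return voxels
```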
On predictive RAHT for dynamic point cloud compression
Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2021. Work supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).
The increase in 3D applications has made the research and development of standards for point cloud compression necessary. Since point clouds represent a significant amount
of data, compression standards are essential to efficiently transmit and store such data.
For this reason, the Moving Picture Experts Group (MPEG) started the standardization
of point cloud compression algorithms, resulting in two standards: the
Geometry-based Point Cloud Compression (G-PCC) and the Video-based Point Cloud
Compression (V-PCC). G-PCC was designed to address static point clouds, those
consisting of objects and scenes, and dynamically acquired point clouds, typically
obtained by LiDAR technology. In contrast, V-PCC targets dynamic point
clouds, those consisting of several temporal frames similar to a video sequence.
In the compression of dynamic point clouds, algorithms that estimate and compensate motion play an essential role. They allow temporal redundancies among successive frames to be exploited, significantly reducing the number of bits required to store and transmit dynamic scenes. Although motion estimation algorithms have been studied, such algorithms for point clouds are still very complex and demand a great deal of computational power, making them unsuitable for practical time-constrained applications. Therefore, an efficient motion estimation solution for point clouds is still an open research problem.
Based on that, the work presented in this dissertation focuses on exploring the use of a
simple inter-frame prediction alongside the region-adaptive hierarchical (or Haar)
transform (RAHT). Our goal is to improve RAHT's attribute compression performance of
dynamic point clouds using a low-complexity inter-frame prediction algorithm. We devise simple schemes that combine the latest version of RAHT, with its intra-frame predictive step, with a low-complexity inter-frame prediction. As previously mentioned, inter-frame
prediction algorithms based on motion estimation are still very complex for point clouds.
For this reason, we use an inter-frame prediction based on the spatial proximity of
neighboring voxels between successive frames. The nearest-neighbor inter-frame
prediction simply matches each voxel in the current point cloud frame to its nearest voxel
in the immediately previous frame. Since it is a straightforward algorithm, it can be
efficiently implemented for time-constrained applications.
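The nearest-neighbor inter-frame prediction described above can be sketched as follows. This is our own brute-force illustration with hypothetical names; a k-d tree would replace the exhaustive search in a truly time-constrained implementation:

```python
import numpy as np

def nn_inter_frame_prediction(curr_geom, prev_geom, prev_attr):
    """Predict each current voxel's attributes from its nearest voxel in the
    previous frame.

    curr_geom: (N, 3) current-frame voxel coordinates.
    prev_geom: (M, 3) previous-frame voxel coordinates.
    prev_attr: (M, K) attributes (e.g. YUV colors) of the previous frame.
    Returns (N, K) predicted attributes; the coder then transforms and
    entropy-codes only the prediction residuals.
    """
    curr = np.asarray(curr_geom, dtype=np.float64)
    prev = np.asarray(prev_geom, dtype=np.float64)
    # Squared Euclidean distance between every current and previous voxel.
    d2 = ((curr[:, None, :] - prev[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # index of the closest previous-frame voxel
    return np.asarray(prev_attr)[nearest]
```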
Finally, we devise two adaptive approaches that combine the nearest-neighbor prediction with the intra-frame predictive RAHT. The first approach is referred to as fragment-based multiple decision, and the second as level-based multiple decision. Both schemes outperform the use of intra-frame prediction alone alongside RAHT in the compression of dynamic point clouds. The fragment-based algorithm slightly outperforms intra-frame prediction alone, with average Bjontegaard delta (BD) PSNR-Y gains of 0.44 dB and average bitrate savings of 10.57%. The level-based scheme achieves more substantial gains, with average BD PSNR-Y gains of 0.97 dB and average bitrate savings of 21.73%.
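For reference, the RAHT at the core of these schemes merges sibling nodes with a Haar-like butterfly whose coefficients are weighted by subtree point counts. A sketch of one such merge, assuming the standard weighted-Haar formulation (not the dissertation's full codec):

```python
import math

def raht_butterfly(a1, w1, a2, w2):
    """One RAHT merge of two sibling nodes: attribute values a1, a2 with
    subtree point counts (weights) w1, w2. Returns (low, high, w1 + w2).
    The low-pass value propagates up the tree with the combined weight;
    the high-pass value is quantized and entropy-coded.
    """
    s1, s2 = math.sqrt(w1), math.sqrt(w2)
    norm = math.sqrt(w1 + w2)
    low = (s1 * a1 + s2 * a2) / norm    # weighted average (DC-like)
    high = (-s2 * a1 + s1 * a2) / norm  # weighted difference (detail)
    return low, high, w1 + w2
```

Because the butterfly is orthonormal, equal inputs yield a zero high-pass coefficient and the signal energy is preserved across the transform, which is what makes the residuals cheap to code.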
Fractional super-resolution of voxelized point clouds
Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2021. Work supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).
We present a method to super-resolve voxelized point clouds downsampled by a fractional factor,
using a lookup table (LUT) constructed from self-similarities in their own downsampled neighbourhoods. Given a downsampled point cloud geometry and its corresponding fractional downsampling factor s, 1 < s ≤ 2, the proposed method determines the set of positions that may have generated it, and estimates which of these positions were indeed occupied (super-resolution). Assuming that the geometry of a point cloud is approximately self-similar at different
scales, a LUT relating downsampled neighbourhood configurations with children occupancy
configurations can be estimated by further downsampling the input point cloud, and by taking into
account the irregular children distribution derived from fractional downsampling. For completeness,
we also interpolate texture by averaging the colors of adjacent neighbouring voxels. Extensive tests over different datasets are presented, with promising results. We further present a direct application that improves point cloud compression with MPEG's G-PCC codec.
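The self-similarity idea above can be sketched as follows: downsample the input by the same fractional factor, and record, for each parent voxel's local occupancy pattern, which child offsets occur beneath it. This is our own simplified illustration (3x3x3 neighbourhood key, hypothetical function names), not the authors' implementation:

```python
import numpy as np
from collections import defaultdict

def downsample(points, s):
    """Downsample integer voxel coordinates by a fractional factor 1 < s <= 2."""
    scaled = np.floor(np.asarray(points, dtype=float) / s).astype(int)
    return set(map(tuple, np.unique(scaled, axis=0)))

def build_child_lut(points, s):
    """LUT from a parent voxel's 3x3x3 neighbourhood occupancy (a 27-bit key,
    computed in the downsampled cloud) to the set of child offsets observed
    under that parent in the original cloud. Under the self-similarity
    assumption, the same LUT built from a further downsampling drives
    super-resolution of the input itself.
    """
    parents = downsample(points, s)
    children = defaultdict(set)
    for p in map(tuple, np.asarray(points, dtype=int)):
        parent = tuple(int(np.floor(c / s)) for c in p)
        # Child position relative to the parent's upsampled origin; fractional
        # factors make the per-parent child layout irregular.
        offset = tuple(c - int(np.floor(pc * s)) for c, pc in zip(p, parent))
        key = 0
        i = 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    if (parent[0] + dx, parent[1] + dy, parent[2] + dz) in parents:
                        key |= 1 << i
                    i += 1
        children[key].add(offset)
    return children
```

At super-resolution time, each parent's neighbourhood key would be looked up in this table to decide which candidate child positions to mark as occupied.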