40 research outputs found

    Block-based motion estimation speedup for dynamic voxelized point clouds

    Motion estimation is a key component in dynamic point cloud analysis and compression. We present a method for reducing motion estimation computation when processing block-based partitions of temporally adjacent point clouds. We propose the use of an occupancy map containing information regarding size or other higher-order local statistics of the partitions. By consulting the map, the estimator can significantly reduce its search space, avoiding expensive block-matching evaluations. To form the maps, we use 3D moment descriptors efficiently computed with one-pass update formulas and stored as scalar values for multiple subsequent references. Results show that a speedup of 2 produces a maximum distortion dropoff of less than 2% for the adopted PSNR-based metrics, relative to the distortion of predictions attained from full search. Speedups of 5 and 10 are achievable with small average distortion dropoffs, less than 3% and 5%, respectively, for the tested data set.
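    The abstract's core idea, one-pass moment computation per block plus a map consulted to prune block-matching candidates, might be sketched as follows. The function names and the size-based pruning threshold are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def one_pass_moments(points):
    """Accumulate count, mean, and per-axis variance of a block's points in a
    single pass (Welford-style update formulas)."""
    n = 0
    mean = np.zeros(3)
    m2 = np.zeros(3)  # running sum of squared deviations
    for p in points:
        n += 1
        delta = p - mean
        mean += delta / n
        m2 += delta * (p - mean)
    var = m2 / n if n else m2
    return n, mean, var

def prune_candidates(target_stats, candidate_blocks, size_tol=0.5):
    """Keep only candidate blocks whose occupancy (point count) is close to the
    target block's, skipping full block-matching for the rest."""
    n_t, _, _ = target_stats
    kept = []
    for idx, pts in candidate_blocks.items():
        n_c, _, _ = one_pass_moments(pts)
        if n_t and abs(n_c - n_t) / n_t <= size_tol:
            kept.append(idx)
    return kept
```

    In a full estimator, the moments would be computed once per partition and stored in the occupancy map, so repeated candidate checks are cheap scalar comparisons rather than point-set evaluations.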

    Point cloud data compression

    The rapid growth in the popularity of Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR) experiences has resulted in an exponential surge of three-dimensional data. Point clouds have emerged as a commonly employed representation for capturing and visualizing three-dimensional data in these environments. Consequently, there has been a substantial research effort dedicated to developing efficient compression algorithms for point cloud data. This Master's thesis aims to investigate the current state-of-the-art lossless point cloud geometry compression techniques, explore some of these techniques in more detail, and then propose improvements and/or extensions to enhance them and provide directions for future work on this topic.

    On predictive RAHT for dynamic point cloud compression

    Master's dissertation, Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2021. Funded by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). The increase in 3D applications made the research and development of point cloud compression standards necessary. Since point clouds represent a significant amount of data, compression standards are essential to transmit and store such data efficiently. For this reason, the Moving Picture Experts Group (MPEG) started standardization activities for point cloud compression algorithms, resulting in two standards: Geometry-based Point Cloud Compression (G-PCC) and Video-based Point Cloud Compression (V-PCC). G-PCC was designed to address static point clouds, those consisting of objects and scenes, and dynamically acquired point clouds, typically obtained by LiDAR technology. In contrast, V-PCC addresses dynamic point clouds, those consisting of several temporal frames similar to a video sequence. In the compression of dynamic point clouds, algorithms that estimate and compensate motion play an essential role: they allow temporal redundancies among successive frames to be exploited, significantly reducing the number of bits required to store and transmit the dynamic scenes. Although motion estimation algorithms have been studied, such algorithms for point clouds are still very complex and demand plenty of computational power, making them unsuitable for practical time-constrained applications. Therefore, an efficient motion estimation solution for point clouds is still an open research problem. Based on that, the work presented in this dissertation focuses on exploring a simple inter-frame prediction alongside the region-adaptive hierarchical (or Haar) transform (RAHT).
    Our goal is to improve RAHT's attribute compression performance for dynamic point clouds using a low-complexity inter-frame prediction algorithm. We devise simple schemes combining the latest version of RAHT, with its intra-frame predictive step, with a low-complexity inter-frame prediction. As previously mentioned, inter-frame prediction algorithms based on motion estimation are still very complex for point clouds. For this reason, we use an inter-frame prediction based on the spatial proximity of neighboring voxels between successive frames. The nearest-neighbor inter-frame prediction simply matches each voxel in the current point cloud frame to its nearest voxel in the immediately previous frame. Since it is a straightforward algorithm, it can be implemented efficiently for time-constrained applications. Finally, we devised two adaptive approaches that combine the nearest-neighbor prediction with the intra-frame predictive RAHT. The first approach is referred to as fragment-based multiple decision, and the second as level-based multiple decision. Both schemes outperform the use of only intra-frame prediction alongside RAHT in the compression of dynamic point clouds: the fragment-based algorithm slightly, with average Bjontegaard delta (BD) PSNR-Y gains of 0.44 dB and bitrate savings of 10.57%, and the level-based scheme more substantially, with average BD PSNR-Y gains of 0.97 dB and bitrate savings of 21.73%.
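    The nearest-neighbor inter-frame prediction described above (matching each current-frame voxel to its nearest voxel in the previous frame) might be sketched as follows. The brute-force search is for clarity only; the residual against this prediction would then feed the RAHT stage:

```python
import numpy as np

def nn_inter_prediction(curr_xyz, prev_xyz, prev_attr):
    """Predict each current-frame voxel's attribute from the attribute of its
    nearest voxel in the previous frame (brute-force nearest neighbor)."""
    pred = np.empty((len(curr_xyz),) + prev_attr.shape[1:], dtype=prev_attr.dtype)
    for i, p in enumerate(curr_xyz):
        d2 = np.sum((prev_xyz - p) ** 2, axis=1)  # squared distances to all previous voxels
        pred[i] = prev_attr[np.argmin(d2)]
    return pred
```

    A time-constrained implementation would replace the brute-force scan with a spatial index (e.g. a k-d tree or voxel hashing), but the prediction rule itself stays this simple.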

    Point based graphics rendering with unified scalability solutions.

    Standard real-time 3D graphics rendering algorithms use brute-force polygon rendering, with complexity linear in the number of polygons and little regard for limiting processing to data that contributes to the image. Modern hardware can now render smaller scenes to pixel levels of detail, relaxing surface connectivity requirements. Sub-linear scalability optimizations are typically self-contained, requiring specific data structures, without shared functions and data. A new point-based rendering algorithm, 'Canopy', is investigated that combines multiple typically sub-linear scalability solutions using a small core of data structures. Specifically, locale management, hierarchical view volume culling, backface culling, occlusion culling, level of detail, and depth ordering are addressed. To demonstrate versatility further, shadows and collision detection are examined. Polygon models are voxelized with interpolated attributes to provide points. A scene tree is constructed, based on a BSP tree of points, with compressed attributes. The scene tree is embedded in a compressed, partitioned, procedurally based scene graph architecture that mimics conventional systems with groups, instancing, inlines, and basic read-on-demand rendering from backing store. Hierarchical scene tree refinement constructs an image-space equivalent image tree, projecting object-space scene node points to form image nodes. An image graph of image nodes is maintained, describing image- and object-space occlusion relationships, hierarchically refined in front-to-back order to a specified threshold while occlusion culling with occluder fusion. Visible nodes at medium levels of detail are refined further to rasterization scales. Occlusion culling defines a set of visible nodes that can support caching for temporal coherence. Occlusion culling is approximate, possibly not suiting critical applications. Qualities and performance are tested against standard rendering.
    Although the algorithm has an O(f) upper bound in the scene size f, it is shown to scale sub-linearly in practice. Scenes that would conventionally comprise several hundred billion polygons are rendered at interactive frame rates with minimal graphics hardware support.
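    Hierarchical view volume culling, one of the scalability solutions the abstract combines, can be illustrated with a bounding-sphere test against frustum planes. The `Node` class and the plane representation (inward-facing normal `n` and offset `d`, with points inside satisfying n·p + d >= 0) are illustrative assumptions, not Canopy's actual data structures:

```python
import numpy as np

class Node:
    """A hierarchy node with a bounding sphere, child nodes, and leaf points."""
    def __init__(self, center, radius, children=(), points=None):
        self.center = np.asarray(center, dtype=float)
        self.radius = radius
        self.children = list(children)
        self.points = points if points is not None else []

def cull(node, planes, visible):
    """Collect points of nodes whose bounding spheres intersect the view
    volume; a subtree entirely outside any plane is rejected wholesale."""
    for n, d in planes:
        if np.dot(n, node.center) + d < -node.radius:
            return  # sphere fully on the outside of this plane
    if node.children:
        for child in node.children:
            cull(child, planes, visible)
    else:
        visible.extend(node.points)
```

    The key scalability property is that one sphere-plane test can discard an entire subtree, so work is proportional to what is potentially visible rather than to scene size.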

    Scalable exploration of 3D massive models

    Programa Oficial de Doutoramento en Tecnoloxías da Información e as Comunicacións. 5032V01. This thesis introduces scalable techniques that advance the state of the art in massive model creation and exploration. Concerning model creation, we present methods for improving reality-based scene acquisition and processing, introducing an efficient implementation of scalable out-of-core point clouds and a data-fusion approach for creating detailed colored models from cluttered scene acquisitions. The core of this thesis concerns enabling technology for the exploration of general large datasets. Two novel solutions are introduced. The first is an adaptive out-of-core technique exploiting the GPU rasterization pipeline and hardware occlusion queries to create coherent batches of work for localized shader-based ray tracing kernels, opening the door to out-of-core ray tracing with shadowing and global illumination. The second is an aggressive compression method that exploits redundancy in large models to compress data so that it fits, in a fully renderable format, in GPU memory. The method targets voxelized representations of 3D scenes, which are widely used to accelerate visibility queries on the GPU. Compression is achieved by merging subtrees that are identical up to a similarity transform and by exploiting the skewed distribution of references to shared nodes to store child pointers using a variable-bitrate encoding. The capability and performance of all methods are evaluated on many very massive real-world scenes from several domains, including cultural heritage, engineering, and gaming.

    NeRF: Neural Radiance Field in 3D Vision, A Comprehensive Review

    Neural Radiance Field (NeRF), a novel view synthesis approach with an implicit scene representation, has taken the field of Computer Vision by storm. As a novel view synthesis and 3D reconstruction method, NeRF models find applications in robotics, urban mapping, autonomous navigation, virtual reality/augmented reality, and more. Since the original paper by Mildenhall et al., more than 250 preprints have been published, with more than 100 eventually accepted at tier-one Computer Vision conferences. Given NeRF's popularity and the current interest in this research area, we believe it necessary to compile a comprehensive survey of NeRF papers from the past two years, which we organize into architecture- and application-based taxonomies. We also provide an introduction to the theory of NeRF-based novel view synthesis, and a benchmark comparison of the performance and speed of key NeRF models. By creating this survey, we hope to introduce new researchers to NeRF, provide a helpful reference for influential works in this field, and motivate future research directions with our discussion section.
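    The theory of NeRF-based view synthesis that the survey introduces rests on a volume rendering quadrature: along each camera ray, sampled densities and colors are alpha-composited into a pixel color. A minimal NumPy sketch of that quadrature (function name and array layout are illustrative):

```python
import numpy as np

def composite(sigmas, colors, deltas):
    """Quadrature of the volume rendering integral used by NeRF:
    C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i, where
    T_i = prod_{j<i} exp(-sigma_j * delta_j) is accumulated transmittance."""
    alphas = 1.0 - np.exp(-sigmas * deltas)              # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas                             # contribution of each sample
    return (weights[:, None] * colors).sum(axis=0), weights
```

    In a full NeRF, `sigmas` and `colors` come from an MLP queried at the sample positions, and the same weights are reused for hierarchical importance sampling.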