9,309 research outputs found
Fast multi-view video plus depth coding with hierarchical bi-prediction
This research work is partially funded by STEPS-Malta and partially by the EU–ESF 1.25.The Multi-view Video Coding (MVC) standard was developed for efficient encoding of multi-view videos. Part of it requires the calculation of both disparity and motion estimations using a bi-prediction structure. These estimations involve an exhaustive search for the optimal compensation vectors from multiple forward and backward reference frames which, while being very efficient in terms of compression, results in high computational costs. This paper proposes a solution that utilizes the multi-view geometry along with the available depth data, to calculate more accurate predictors for both motion and disparity estimations, and for both directions of the prediction structure. Simulation results demonstrate that this technique is reliable enough to allow a substantial reduction in the search areas in all the reference frames. This in turn results in a significant speed-up gain of 3.2 times with a negligible influence on the coding efficiency, while encoding both the color and the depth MVVs.peer-reviewe
3D video coding and transmission
The capture, transmission, and display of
3D content has gained a lot of attention in the last few
years. 3D multimedia content is no longer con fined to
cinema theatres but is being transmitted using stereoscopic
video over satellite, shared on Blu-RayTMdisks,
or sent over Internet technologies. Stereoscopic displays
are needed at the receiving end and the viewer needs to
wear special glasses to present the two versions of the
video to the human vision system that then generates
the 3D illusion. To be more e ffective and improve the
immersive experience, more views are acquired from a
larger number of cameras and presented on di fferent displays,
such as autostereoscopic and light field displays.
These multiple views, combined with depth data, also
allow enhanced user experiences and new forms of interaction
with the 3D content from virtual viewpoints.
This type of audiovisual information is represented by a
huge amount of data that needs to be compressed and
transmitted over bandwidth-limited channels. Part of
the COST Action IC1105 \3D Content Creation, Coding
and Transmission over Future Media Networks" (3DConTourNet)
focuses on this research challenge.peer-reviewe
Improved depth maps coding efficiency of 3D videos
The research work disclosed in this publication is partially funded by the Strategic Educational Pathways Scholarship Scheme (Malta). The scholarship is part-financed by the
European Union – European Social Fund.Immersive 3D video services demand the transmission of the viewpoints' depth map together with the texture multiview video to allow arbitrary reconstruction of intermediate viewpoints required for free-view navigation and 3D depth perception. The Multi-view Video Coding (MVC) standard is generally used to encode these auxiliary depth maps and since their estimation process is highly computational intensive, the coding time increases. This paper proposes a technique that exploits the multi-view geometry together with the depth map itself to calculate more accurate initial compensation vectors for the Macro-blocks' estimation. Starting from a more accurate position allows for a smaller search area, reducing the computations required during depth map MVC. Furthermore, the SKIP mode is extended to predict also the disparity vectors from the neighborhood encoded vectors, to omit some of them from transmission. Results demonstrate that these modifications provide an average computational reduction of up-to 87% with a bitrate saving of about 8.3% while encoding an inter-view predicted viewpoint from a depth map multi-view video.peer-reviewe
Overview of 3D Video: Coding Algorithms, Implementations and Standardization
Projecte final de carrera fet en col.laboració amb Linköping Institute of TechnologyEnglish: 3D technologies have aroused a great interest over the world in the last years. Television, cinema and videogames are introducing, little by little, 3D technologies into the mass market. This comes as a result of the research done in the 3D field, solving many of its limitations such as quality, contents creation or 3D displays. This thesis focus on 3D video, considering concepts that concerns the coding issues and the video formats. The aim is to provide an overview of the current state of 3D video, including the standardization and some interesting implementations and alternatives that exist. In the report necessary background information is presented in order to understand the concepts developed: compression techniques, the different video formats, their standardization and some advances or alternatives to the processes previously explained. Finally, a comparison between the different concepts is presented to complete the overview, ending with some conclusions and proposed ideas for future works.Castellano: Las tecnologÃas 3D han despertado un gran interés en todo el mundo en los últimos años. Televisión, cine y videojuegos están introduciendo, poco a poco, ésta tecnologÃa en el mercado. Esto es resultado de la investigación realizada en el campo de las 3D, solucionando muchas de sus limitaciones, como la calidad, la creación de contenidos o las pantallas 3D. Este proyecto se centra en el video 3D, considerando los conceptos relacionados con la codificación y los formatos de vÃdeo. El objetivo es proporcionar una visión del estado actual del vÃdeo 3D, incluyendo los estándares y algunas de las implementaciones más interesantes que existen. En la memoria, se presenta información adicional para facilitar el seguimiento de los conceptos desarrollados: técnicas de compresión, formatos de vÃdeo, su estandarización y algunos avances o alternativas a los procesos explicados. Finalmente, se presentan diferentes comparaciones entre los conceptos tratados, acabando el documento con las conclusiones obtenidas e ideas propuestas para futuros trabajos.Català : Les tecnologies 3D han despertat un gran interès a tot el món en els últims anys. Televisió, cinema i videojocs estan introduint, lentament, aquesta tecnologia en el mercat. Això és resultat de la investigació portada a terme en el camp de les 3D, solucionant moltes de les seves limitacions, com la qualitat, la creació de continguts o les pantalles 3D. Aquest proyecte es centra en el video 3D, considerant els conceptes relacionats amb la codificació i els formats de video. L'objectiu és proporcionar una visió de l'estat actual del video 3D, incloent-hi els estandà rds i algunes de les implementacions més interessants que existeixen. A la memòria, es presenta informació adicional per facilitar el seguiment dels conceptes desenvolupats: tècniques de compressió, formats de video, la seva estandardització i alguns avenços o alternatives als procesos explicats. Finalment, es presenten diferents comparacions entre els conceptes tractats i les conclusions obtingudes, juntament amb propostes per a futurs treballs
High Performance Multiview Video Coding
Following the standardization of the latest video coding standard High Efficiency Video Coding in 2013, in 2014, multiview extension of HEVC (MV-HEVC) was published and brought significantly better compression performance of around 50% for multiview and 3D videos compared to multiple independent single-view HEVC coding. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates the possibilities of using modern parallel computing platforms and tools such as single-instruction-multiple-data (SIMD) instructions, multi-core CPU, massively parallel GPU, and computer cluster to significantly enhance the MVC encoder performance. The aforementioned computing tools have very different computing characteristics and misuse of the tools may result in poor performance improvement and sometimes even reduction. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed. Novel optimization techniques at various levels of abstraction are proposed, non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) in prediction unit (PU), fractional and bi-directional ME/DE acceleration through SIMD, quantization parameter (QP)-based early termination for coding tree unit (CTU), optimized resource-scheduled wave-front parallel processing for CTU, and workload balanced, cluster-based multiple-view parallel are proposed. The result shows proposed parallel optimization techniques, with insignificant loss to coding efficiency, significantly improves the execution time performance. This , in turn, proves modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications
Livrable D4.2 of the PERSEE project : Représentation et codage 3D - Rapport intermédiaire - Définitions des softs et architecture
51Livrable D4.2 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D4.2 du projet. Son titre : Représentation et codage 3D - Rapport intermédiaire - Définitions des softs et architectur
Contributions in image and video coding
Orientador: Max Henrique Machado CostaTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: A comunidade de codificação de imagens e vÃdeo vem também trabalhando em inovações que vão além das tradicionais técnicas de codificação de imagens e vÃdeo. Este trabalho é um conjunto de contribuições a vários tópicos que têm recebido crescente interesse de pesquisadores na comunidade, nominalmente, codificação escalável, codificação de baixa complexidade para dispositivos móveis, codificação de vÃdeo de múltiplas vistas e codificação adaptativa em tempo real. A primeira contribuição estuda o desempenho de três transformadas 3-D rápidas por blocos em um codificador de vÃdeo de baixa complexidade. O codificador recebeu o nome de Fast Embedded Video Codec (FEVC). Novos métodos de implementação e ordens de varredura são propostos para as transformadas. Os coeficiente 3-D são codificados por planos de bits pelos codificadores de entropia, produzindo um fluxo de bits (bitstream) de saÃda totalmente embutida. Todas as implementações são feitas usando arquitetura com aritmética inteira de 16 bits. Somente adições e deslocamentos de bits são necessários, o que reduz a complexidade computacional. Mesmo com essas restrições, um bom desempenho em termos de taxa de bits versus distorção pôde ser obtido e os tempos de codificação são significativamente menores (em torno de 160 vezes) quando comparados ao padrão H.264/AVC. A segunda contribuição é a otimização de uma recente abordagem proposta para codificação de vÃdeo de múltiplas vistas em aplicações de video-conferência e outras aplicações do tipo "unicast" similares. O cenário alvo nessa abordagem é fornecer vÃdeo com percepção real em 3-D e ponto de vista livre a boas taxas de compressão. Para atingir tal objetivo, pesos são atribuÃdos a cada vista e mapeados em parâmetros de quantização. Neste trabalho, o mapeamento ad-hoc anteriormente proposto entre pesos e parâmetros de quantização é mostrado ser quase-ótimo para uma fonte Gaussiana e um mapeamento ótimo é derivado para fonte tÃpicas de vÃdeo. A terceira contribuição explora várias estratégias para varredura adaptativa dos coeficientes da transformada no padrão JPEG XR. A ordem de varredura original, global e adaptativa do JPEG XR é comparada com os métodos de varredura localizados e hÃbridos propostos neste trabalho. Essas novas ordens não requerem mudanças nem nos outros estágios de codificação e decodificação, nem na definição da bitstream A quarta e última contribuição propõe uma transformada por blocos dependente do sinal. As transformadas hierárquicas usualmente exploram a informação residual entre os nÃveis no estágio da codificação de entropia, mas não no estágio da transformada. A transformada proposta neste trabalho é uma técnica de compactação de energia que também explora as similaridades estruturais entre os nÃveis de resolução. A idéia central da técnica é incluir na transformada hierárquica um número de funções de base adaptativas derivadas da resolução menor do sinal. Um codificador de imagens completo foi desenvolvido para medir o desempenho da nova transformada e os resultados obtidos são discutidos neste trabalhoAbstract: The image and video coding community has often been working on new advances that go beyond traditional image and video architectures. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely, scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low complexity video codec. The codec has received the name Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementation is performed using 16-bit integer arithmetic. Only additions and bit shifts are necessary, thus lowering computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved and the encoding time is significantly smaller (around 160 times) when compared to the H.264/AVC standard. The second contribution is the optimization of a recent approach proposed for multiview video coding in videoconferencing applications or other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint video at good compression rates. To achieve such an objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source and an optimum mapping is derived for a typical video source. The third contribution exploits several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders do not require changes in either the other coding and decoding stages or in the bitstream definition. The fourth and last contribution proposes an hierarchical signal dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that can also exploit these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed in order to measure the performance of the new transform and the obtained results are discussed in this workDoutoradoTelecomunicações e TelemáticaDoutor em Engenharia Elétric
Dense light field coding: a survey
Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems.
Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.info:eu-repo/semantics/publishedVersio
Source coding for transmission of reconstructed dynamic geometry: a rate-distortion-complexity analysis of different approaches
Live 3D reconstruction of a human as a 3D mesh with commodity electronics is becoming a reality. Immersive applications (i.e. cloud gaming, tele-presence) benefit from effective transmission of such content over a bandwidth limited link. In this paper we outline different approaches for compressing live reconstructed mesh geometry based on distributing mesh reconstruction functions between sender and receiver. We evaluate rate-performance-complexity of different configurations. First, we investigate 3D mesh compression methods (i.e. dynamic/static) from MPEG-4. Second, we evaluate the option of using octree based point cloud compression and receiver side surface reconstruction
- …