    Compression and Subjective Quality Assessment of 3D Video

    In recent years, three-dimensional television (3D TV) has been widely regarded as the successor to traditional two-dimensional television (2D TV). With its capability of offering a dynamic and immersive experience, 3D video (3DV) is expected to extend conventional video in several applications in the near future. However, 3D content requires more than a single view to deliver the sensation of depth to viewers, and this inevitably increases the bitrate compared to the corresponding 2D content. This need drives research in the video compression field towards more advanced and more efficient algorithms. Currently, Advanced Video Coding (H.264/AVC), developed by the Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, is the state-of-the-art video coding standard. This codec has been widely adopted in various applications and products such as TV broadcasting, video conferencing, mobile TV, and Blu-ray Disc. One important extension of H.264/AVC, Multiview Video Coding (MVC), addresses multiple-view compression by taking into consideration the inter-view dependency between different views of the same scene. H.264/AVC with its MVC extension (H.264/MVC) can be used for encoding either conventional stereoscopic video, comprising only two views, or multiview video, comprising more than two views. In spite of the high performance of H.264/MVC, a typical multiview video sequence requires a huge amount of storage space, which is proportional to the number of offered views. The number of available views is still limited, and research has been devoted to synthesizing an arbitrary number of views using the multiview video and depth map (MVD) format. This process is mandatory for auto-stereoscopic displays (ASDs), where many views are required at the viewer side and there is no way to transmit such a large number of views with currently available broadcasting technology.
Therefore, to satisfy the growing demand for 3D-related applications, it is mandatory to further reduce the bitstream size by introducing new and more efficient algorithms for compressing multiview video and depth maps. This thesis tackles 3D content compression targeting different formats, i.e., stereoscopic video and depth-enhanced multiview video. The stereoscopic video compression algorithms introduced in this thesis mostly focus on different types of asymmetry between the left and right views. This means reducing the quality of one view relative to the other, aiming to achieve better subjective quality than the symmetric case (the reference) under the same bitrate constraint. The algorithms proposed to optimize depth-enhanced multiview video compression include both texture compression schemes and depth map coding tools. Some of the coding schemes proposed for this format include asymmetric quality between the views. Because objective metrics are not able to accurately estimate the subjective quality of stereoscopic content, subjective quality assessment is used to evaluate the different codecs. Moreover, when asymmetry is introduced, the Human Visual System (HVS) performs a fusion process which is not completely understood. Therefore, another important aspect of this thesis is conducting several subjective tests and reporting the subjective ratings to evaluate the perceived quality of the proposed coded content against the references. Statistical analysis is carried out in the thesis to assess the validity of the subjective ratings and determine the best-performing test cases.

    Distributed Video Coding for Multiview and Video-plus-depth Coding

    Depth-based Multi-View 3D Video Coding

    Error resilience and concealment techniques for high-efficiency video coding

    This thesis investigates the problem of robust coding and error concealment in High Efficiency Video Coding (HEVC). After a review of the current state of the art, a simulation study of error robustness revealed that HEVC has weak protection against network losses, with a significant impact on video quality. Based on this evidence, the first contribution of this work is a new method to reduce the temporal dependencies between motion vectors, improving the decoded video quality without compromising the compression efficiency. The second contribution of this thesis is a two-stage approach for reducing the mismatch of temporal predictions when video streams are received with errors or lost data. At the encoding stage, the reference pictures are dynamically distributed based on a constrained Lagrangian rate-distortion optimization to reduce the number of predictions from a single reference. At the streaming stage, a prioritization algorithm, based on spatial dependencies, selects a reduced set of motion vectors to be transmitted as side information to reduce mismatched motion predictions at the decoder. The problem of error concealment-aware video coding is also investigated to enhance the overall error robustness. A new approach based on scalable coding and optimal error concealment selection is proposed, where the optimal error concealment modes are found by simulating transmission losses, followed by a saliency-weighted optimisation. Moreover, recovery residual information is encoded using a rate-controlled enhancement layer. Both are transmitted to the decoder to be used in case of data loss. Finally, an adaptive error resilience scheme is proposed to dynamically predict the video stream that achieves the highest decoded quality for a particular loss case.
A neural network selects among the various video streams, encoded with different levels of compression efficiency and error protection, based on information from the video signal, the coded stream and the transmission network. Overall, the new robust video coding methods investigated in this thesis yield consistent quality gains in comparison with other existing methods, including the ones implemented in the HEVC reference software. Furthermore, the proposed methods also offer a better trade-off between coding efficiency and error robustness.
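
As a rough illustration of the Lagrangian rate-distortion selection mentioned in this abstract, the sketch below picks a reference picture by minimising J = D + λR while refusing references that already serve too many blocks. It is a minimal sketch, not the thesis's method: the function name `pick_reference`, the `max_share` cap, and the candidate dictionary layout are all hypothetical.

```python
from collections import Counter


def pick_reference(candidates, lam, usage, max_share, total_blocks):
    """Return the reference index minimising J = D + lam * R.

    Hypothetical constraint: a reference already used by at least
    max_share * total_blocks blocks is skipped, which spreads the
    predictions over several reference pictures.
    """
    allowed = [c for c in candidates
               if usage[c["ref"]] < max_share * total_blocks]
    # Fall back to the unconstrained choice if every reference is saturated.
    pool = allowed or candidates
    best = min(pool, key=lambda c: c["distortion"] + lam * c["rate"])
    return best["ref"]


# Illustrative numbers only (not taken from the thesis).
candidates = [
    {"ref": 0, "distortion": 100.0, "rate": 10.0},
    {"ref": 1, "distortion": 110.0, "rate": 8.0},
]
unconstrained = pick_reference(candidates, 2.0, Counter(), 0.5, 10)
constrained = pick_reference(candidates, 2.0, Counter({0: 5}), 0.5, 10)
```

With these made-up values, reference 0 has the lower cost, but once it already covers half of the blocks the cap redirects the choice to reference 1.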

    Semantics-aware content delivery framework for 3D Tele-immersion

    3D Tele-immersion (3DTI) technology allows full-body, multimodal interaction among geographically dispersed users, which opens a variety of possibilities in cyber collaborative applications such as art performance, exergaming, and physical rehabilitation. However, along with its great potential, the resource and quality demands of 3DTI inevitably rise, especially when advanced applications target resource-limited computing environments with stringent scalability demands. Under these circumstances, the tradeoffs between 1) resource requirements, 2) content complexity, and 3) user satisfaction in the delivery of 3DTI services are magnified. In this dissertation, we argue that these tradeoffs of 3DTI systems are actually avoidable when the underlying delivery framework of 3DTI takes semantic information into consideration. We introduce the concept of semantic information into 3DTI, which encompasses information about three factors: environment, activity, and user role in 3DTI applications. With semantic information, 3DTI systems are able to 1) identify the characteristics of their computing environment to allocate computing power and bandwidth to the delivery of prioritized contents, 2) pinpoint and discard the dispensable content in activity capturing according to the properties of the target application, and 3) differentiate contents by their contributions to fulfilling the objectives and expectations of the user's role in the application so that the adaptation module can allocate the resource budget accordingly. With these capabilities we can change the tradeoffs into synergy between resource requirements, content complexity, and user satisfaction. We implement semantics-aware 3DTI systems to verify the performance gain in the three phases of the 3DTI delivery chain: the capturing phase, the dissemination phase, and the receiving phase.
By introducing semantic information to distinct 3DTI systems, the efficiency improvements brought by our semantics-aware content delivery framework are validated under different application requirements, different scalability bottlenecks, and different user and application models. To sum up, in this dissertation we aim to change the tradeoff between requirements, complexity, and satisfaction in 3DTI services by exploiting semantic information about the computing environment, the activity, and the user role in the underlying delivery systems of 3DTI. The devised mechanisms will enhance the efficiency of 3DTI systems serving different purposes and of 3DTI applications with different computation and scalability requirements.

    Algoritmo de estimação de movimento e sua arquitetura de hardware para HEVC

    Doctorate in Electrical Engineering. Video coding has been used in applications like video surveillance, video conferencing, video streaming, video broadcasting and video storage. In a typical video coding standard, many algorithms are combined to compress a video. Of those algorithms, motion estimation is the most complex task. Hence, it is necessary to implement this task in real time by using appropriate VLSI architectures. This thesis proposes a new fast motion estimation algorithm and its real-time implementation. The results show that the proposed algorithm and its motion estimation hardware architecture outperform the state of the art. The proposed architecture operates at a maximum frequency of 241.6 MHz and is able to process 1080p@60Hz video with all possible variable block sizes specified in the HEVC standard, as well as with a motion vector search range of up to ±64 pixels.
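
To put the reported operating point in perspective, the following back-of-the-envelope sketch computes how many clock cycles are available per pixel when a 241.6 MHz design must keep up with 1080p@60Hz. It assumes 1080p means 1920×1080 luma samples per frame; the 64×64 block size in the second figure is an illustrative assumption, not a detail stated in the abstract.

```python
def cycles_per_pixel(clock_hz, width, height, fps):
    """Clock cycles available for each pixel at real-time throughput."""
    return clock_hz / (width * height * fps)


# 1080p taken as 1920x1080 luma samples (assumption).
budget = cycles_per_pixel(241.6e6, 1920, 1080, 60)

# Cycle budget for one 64x64 block (block size chosen for illustration).
block_budget = budget * 64 * 64

print(f"{budget:.2f} cycles per pixel, {block_budget:.0f} cycles per 64x64 block")
```

The budget comes out at just under two cycles per pixel, which is why the search over all block sizes and a ±64 pixel range has to be heavily parallelised in hardware.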

    VLSI architectures design for encoders of High Efficiency Video Coding (HEVC) standard

    The growing popularity of high-resolution video and the continuously increasing demand for high-quality video on mobile devices are creating a stronger need for more efficient video encoders. To meet this need, HEVC, the newest video coding standard, has been developed by a joint team formed by ISO/IEC MPEG and ITU-T VCEG. Its design goal is to achieve a 50% compression gain over its predecessor H.264 with equal or even higher perceptual video quality. Motion estimation (ME), one of the most critical modules in video coding, accounts for roughly 50%-70% of the computational complexity of the video encoder. This high consumption of computational resources limits the performance of encoders, especially for full HD or ultra HD videos, in terms of coding speed, bit-rate and video quality. Thus the major part of this work concentrates on reducing the computational complexity and improving the timing performance of motion estimation algorithms for the HEVC standard. First, a new strategy to calculate the SAD (Sum of Absolute Differences) for motion estimation is designed based on statistics of the pixel data of video sequences. These statistics demonstrate that the size relationship between the sums of two sets of pixels is closely connected with the distribution of the size relationships between individual pixels from the two sets. Taking advantage of this observation, only a small proportion of pixels needs to be involved in the SAD calculation. Simulations show that the amount of computation required in the full search algorithm is reduced by about 58% on average and up to 70% in the best case. Secondly, from the perspective of parallelization, an enhanced TZ search for HEVC is proposed using novel schemes of multiple MVPs (motion vector predictors) and a shared MVP.
Specifically, by resorting to multiple MVPs, the initial search process is performed in parallel at multiple search centers, and the ME processing engines for the PUs within one CU are parallelized based on the MVP sharing scheme at the CU (coding unit) level. Moreover, the SAD module of the ME engine is also implemented in parallel for a PU size of 32×32. Experiments indicate that it achieves an appreciable improvement in the throughput and coding efficiency of the HEVC video encoder. In addition, the other part of this thesis is devoted to VLSI architecture design for finding the first W maximum/minimum values, targeting high speed and low hardware cost. The architecture based on the novel bit-wise AND scheme has only half the area of the best reference solution, and its critical path delay is comparable with other implementations. The FPCG (full parallel comparison grid) architecture, which uses an optimized comparator-based structure, is 3.6 times faster on average and up to 5.2 times faster than the reference architectures. Finally, the architecture using the partial sorting strategy reaches a good balance between timing performance and area, with a speed slightly lower than or comparable to the FPCG architecture and an acceptable hardware cost.
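
For readers unfamiliar with the baseline that this abstract improves upon, here is a minimal sketch of exhaustive (full-search) block matching using the SAD cost. It is the plain textbook baseline, not the thesis's reduced-pixel SAD strategy or its enhanced TZ search; the function names and the tiny synthetic frames are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement in [-search_range, search_range]^2 and
    return (best_cost, dx, dy); the first minimum found wins ties."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + n > h or bx + dx + n > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 8x8 reference frame and a current frame whose content is the
# reference displaced by (dx, dy) = (2, 1); out-of-range samples are 0.
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]
result = full_search(cur, ref, 2, 2, 2, 2)
```

Every candidate costs n² absolute differences, which is the workload that pixel-subsampling and predictor-based searches such as those summarised above aim to cut down.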

    Network reputation-based quality optimization of video delivery in heterogeneous wireless environments

    The mass-market adoption of high-end mobile devices and the increasing amount of video traffic have led mobile operators to adopt various solutions to help them cope with the explosion of mobile broadband data traffic while ensuring high Quality of Service (QoS) levels for their services. Deploying small-cell base stations within the existing macro-cellular networks and offloading traffic from the large macro-cells to the small cells is seen as a promising solution to increase capacity and improve network performance at low cost. Parallel use of diverse technologies is also employed. The result is a heterogeneous network environment (HetNets), part of the next-generation network deployments. In this context, this thesis takes a step forward towards the “Always Best Experience” paradigm, which considers mobile users seamlessly roaming in the HetNets environment. Supporting ubiquitous connectivity and enabling very good quality of rich mobile services anywhere and anytime is highly challenging, mostly due to the heterogeneity of the selection criteria, such as: application requirements (e.g., voice, video, data, etc.); different device types with various capabilities (e.g., smartphones, netbooks, laptops, etc.); multiple overlapping networks using diverse technologies (e.g., Wireless Local Area Networks (IEEE 802.11), Cellular Networks Long Term Evolution (LTE), etc.); and different user preferences. In fact, mobile users face a complex decision when they need to dynamically select the best-value network to connect to in order to get the “Always Best Experience”.
This thesis presents three major contributions to solve the problem described above: 1) The Location-based Network Prediction mechanism in heterogeneous wireless networks (LNP) provides a shortlist of the best available networks to the mobile user based on the user's location, history record and routing plan. 2) The Reputation-oriented Access Network Selection mechanism (RANS) selects the network with the best reputation from the available networks for the mobile user based on the best trade-off between QoS, energy consumption and monetary cost. The network reputation is defined based on previous user-network interaction and the consequent user experience with the network. 3) Network Reputation-based Quality Optimization of Video Delivery in heterogeneous networks (NRQOVD) makes use of a reputation mechanism to enhance the video content quality via multipath delivery or delivery adaptation.
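
The trade-off behind reputation-based network selection can be sketched as a simple weighted score. This is only an illustrative toy under stated assumptions, not the RANS mechanism itself: the additive utility, the exponential-moving-average reputation update, the `alpha` smoothing factor, and all field names and values are hypothetical, with every quantity assumed to be normalised to [0, 1].

```python
def update_reputation(old, experience, alpha=0.3):
    """Blend the stored reputation with the latest user experience
    (exponential moving average; alpha is a hypothetical smoothing factor)."""
    return (1 - alpha) * old + alpha * experience


def select_network(networks, weights):
    """Return the name of the network with the highest weighted score.
    QoS and reputation add to the score; energy and cost subtract from it."""
    def score(n):
        return (weights["qos"] * n["qos"]
                + weights["reputation"] * n["reputation"]
                - weights["energy"] * n["energy"]
                - weights["cost"] * n["cost"])
    return max(networks, key=score)["name"]


# Illustrative values only.
networks = [
    {"name": "wifi", "qos": 0.8, "energy": 0.3, "cost": 0.1, "reputation": 0.9},
    {"name": "lte", "qos": 0.9, "energy": 0.6, "cost": 0.5, "reputation": 0.7},
]
weights = {"qos": 1.0, "reputation": 1.0, "energy": 1.0, "cost": 1.0}
choice = select_network(networks, weights)
new_rep = update_reputation(0.5, 1.0, alpha=0.5)
```

Changing the weights shifts the balance between quality, battery drain and price, which is the kind of user-preference sensitivity the abstract describes.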