186 research outputs found

    Intra Coding Strategy for Video Error Resiliency: Behavioral Analysis

    Get PDF
    One challenge in video transmission is to deal with packet loss. Since the compressed video streams are sensitive to data loss, the error resiliency of the encoded video becomes important. When video data is lost and retransmission is not possible, the missed data should be concealed. But loss concealment causes distortion in the lossy frame which also propagates into the next frames even if their data are received correctly. One promising solution to mitigate this error propagation is intra coding. There are three approaches for intra coding: intra coding of a number of blocks selected randomly or regularly, intra coding of some specific blocks selected by an appropriate cost function, or intra coding of a whole frame. But Intra coding reduces the compression ratio; therefore, there exists a trade-off between bitrate and error resiliency achieved by intra coding. In this paper, we study and show the best strategy for getting the best rate-distortion performance. Considering the error propagation, an objective function is formulated, and with some approximations, this objective function is simplified and solved. The solution demonstrates that periodical I-frame coding is preferred over coding only a number of blocks as intra mode in P-frames. Through examination of various test sequences, it is shown that the best intra frame period depends on the coding bitrate as well as the packet loss rate. We then propose a scheme to estimate this period from curve fitting of the experimental results, and show that our proposed scheme outperforms other methods of intra coding especially for higher loss rates and coding bitrates

    Algorithms and Hardware Co-Design of HEVC Intra Encoders

    Get PDF
    Digital video is becoming extremely important nowadays and its importance has greatly increased in the last two decades. Due to the rapid development of information and communication technologies, the demand for Ultra-High Definition (UHD) video applications is becoming stronger. However, the most prevalent video compression standard H.264/AVC released in 2003 is inefficient when it comes to UHD videos. The increasing desire for superior compression efficiency to H.264/AVC leads to the standardization of High Efficiency Video Coding (HEVC). Compared with the H.264/AVC standard, HEVC offers a double compression ratio at the same level of video quality or substantial improvement of video quality at the same video bitrate. Yet, HE-VC/H.265 possesses superior compression efficiency, its complexity is several times more than H.264/AVC, impeding its high throughput implementation. Currently, most of the researchers have focused merely on algorithm level adaptations of HEVC/H.265 standard to reduce computational intensity without considering the hardware feasibility. Whatā€™s more, the exploration of efficient hardware architecture design is not exhaustive. Only a few research works have been conducted to explore efficient hardware architectures of HEVC/H.265 standard. In this dissertation, we investigate efficient algorithm adaptations and hardware architecture design of HEVC intra encoders. We also explore the deep learning approach in mode prediction. From the algorithm point of view, we propose three efficient hardware-oriented algorithm adaptations, including mode reduction, fast coding unit (CU) cost estimation, and group-based CABAC (context-adaptive binary arithmetic coding) rate estimation. Mode reduction aims to reduce mode candidates of each prediction unit (PU) in the rate-distortion optimization (RDO) process, which is both computation-intensive and time-consuming. Fast CU cost estimation is applied to reduce the complexity in rate-distortion (RD) calculation of each CU. Group-based CABAC rate estimation is proposed to parallelize syntax elements processing to greatly improve rate estimation throughput. From the hardware design perspective, a fully parallel hardware architecture of HEVC intra encoder is developed to sustain UHD video compression at 4K@30fps. The fully parallel architecture introduces four prediction engines (PE) and each PE performs the full cycle of mode prediction, transform, quantization, inverse quantization, inverse transform, reconstruction, rate-distortion estimation independently. PU blocks with different PU sizes will be processed by the different prediction engines (PE) simultaneously. Also, an efficient hardware implementation of a group-based CABAC rate estimator is incorporated into the proposed HEVC intra encoder for accurate and high-throughput rate estimation. To take advantage of the deep learning approach, we also propose a fully connected layer based neural network (FCLNN) mode preselection scheme to reduce the number of RDO modes of luma prediction blocks. All angular prediction modes are classified into 7 prediction groups. Each group contains 3-5 prediction modes that exhibit a similar prediction angle. A rough angle detection algorithm is designed to determine the prediction direction of the current block, then a small scale FCLNN is exploited to refine the mode prediction

    ę·±å±¤å­¦ēæ’恫åŸŗć„ćē”»åƒåœ§ēø®ćØ品č³Ŗč©•ä¾”

    Get PDF
    ę—©å¤§å­¦ä½čؘē•Ŗ号:ꖰ8427ę—©ēزē”°å¤§

    Towards Hybrid-Optimization Video Coding

    Full text link
    Video coding is a mathematical optimization problem of rate and distortion essentially. To solve this complex optimization problem, two popular video coding frameworks have been developed: block-based hybrid video coding and end-to-end learned video coding. If we rethink video coding from the perspective of optimization, we find that the existing two frameworks represent two directions of optimization solutions. Block-based hybrid coding represents the discrete optimization solution because those irrelevant coding modes are discrete in mathematics. It searches for the best one among multiple starting points (i.e. modes). However, the search is not efficient enough. On the other hand, end-to-end learned coding represents the continuous optimization solution because the gradient descent is based on a continuous function. It optimizes a group of model parameters efficiently by the numerical algorithm. However, limited by only one starting point, it is easy to fall into the local optimum. To better solve the optimization problem, we propose to regard video coding as a hybrid of the discrete and continuous optimization problem, and use both search and numerical algorithm to solve it. Our idea is to provide multiple discrete starting points in the global space and optimize the local optimum around each point by numerical algorithm efficiently. Finally, we search for the global optimum among those local optimums. Guided by the hybrid optimization idea, we design a hybrid optimization video coding framework, which is built on continuous deep networks entirely and also contains some discrete modes. We conduct a comprehensive set of experiments. Compared to the continuous optimization framework, our method outperforms pure learned video coding methods. Meanwhile, compared to the discrete optimization framework, our method achieves comparable performance to HEVC reference software HM16.10 in PSNR

    Efficient HEVC-based video adaptation using transcoding

    Get PDF
    In a video transmission system, it is important to take into account the great diversity of the network/end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played by multiple devices like PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints. These diversities of the network and devices capacity lead to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without the change of the coding format, has been well-known as an efficient adaptation solution. However, this approach comes along with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We proposed several techniques to speed-up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy have been proposed. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been solved by using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications

    Analysis of the perceptual quality performance of different HEVC coding tools

    Get PDF
    Each new video encoding standard includes encoding techniques that aim to improve the performance and quality of the previous standards. During the development of these techniques, PSNR was used as the main distortion metric. However, the PSNR metric does not consider the subjectivity of the human visual system, so that the performance of some coding tools is questionable from the perceptual point of view. To further explore this point, we have developed a detailed study about the perceptual sensibility of different HEVC video coding tools. In order to perform this study, we used some popular objective quality assessment metrics to measure the perceptual response of every single coding tool. The conclusion of this work will help to determine the set of HEVC coding tools that provides, in general, the best perceptual response

    Towards visualization and searching :a dual-purpose video coding approach

    Get PDF
    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards the visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this novel dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization, but also provide the decoder valuable feature-related information, extracted at the encoder from the original frames, instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between the visualization and searching performances. Extensive experimental results for the assessment of the proposed dual-purpose video coding solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance regarding the state-of-the-art HEVC standard both in terms of visualization and searching performance.Em modernas aplicaƧƵes de vĆ­deo, o papel do vĆ­deo decodificado Ć© muito mais que simplesmente preencher uma tela para visualizaĆ§Ć£o. Para oferecer aplicaƧƵes mais poderosas por meio de sinais de vĆ­deo,Ć© cada vez mais crĆ­tico nĆ£o apenas considerar a qualidade do conteĆŗdo objetivando sua visualizaĆ§Ć£o, mas tambĆ©m possibilitar meios de realizar busca por conteĆŗdos semelhantes. Requisitos de visualizaĆ§Ć£o e de busca sĆ£o considerados, por exemplo, em modernas aplicaƧƵes de vĆ­deo vigilĆ¢ncia e comunicaƧƵes pessoais. No entanto, as atuais soluƧƵes de codificaĆ§Ć£o de vĆ­deo sĆ£o fortemente voltadas aos requisitos de visualizaĆ§Ć£o. Nesse contexto, o objetivo deste trabalho Ć© propor uma soluĆ§Ć£o de codificaĆ§Ć£o de vĆ­deo de propĆ³sito duplo, objetivando tanto requisitos de visualizaĆ§Ć£o quanto de busca. Para isso, Ć© proposto um arcabouƧo de codificaĆ§Ć£o em que a abordagem usual de codificaĆ§Ć£o de pixels Ć© combinada com uma nova abordagem de codificaĆ§Ć£o baseada em features visuais. Nessa soluĆ§Ć£o, alguns quadros sĆ£o codificados usando um conjunto de pares de keypoints casados, possibilitando nĆ£o apenas visualizaĆ§Ć£o, mas tambĆ©m provendo ao decodificador valiosas informaƧƵes de features visuais, extraĆ­das no codificador a partir do conteĆŗdo original, que sĆ£o instrumentais em aplicaƧƵes de busca. A soluĆ§Ć£o proposta emprega um esquema flexĆ­vel de otimizaĆ§Ć£o Lagrangiana onde o processamento baseado em pixel Ć© combinado com o processamento baseado em features visuais objetivando encontrar um compromisso adequado entre os desempenhos de visualizaĆ§Ć£o e de busca. Os resultados experimentais mostram a flexibilidade da soluĆ§Ć£o proposta em alcanƧar diferentes compromissos de otimizaĆ§Ć£o, nomeadamente desempenho competitivo em relaĆ§Ć£o ao padrĆ£o HEVC tanto em termos de visualizaĆ§Ć£o quanto de busca
    • ā€¦
    corecore