
    Historical information aware unequal error protection of scalable HEVC/H.265 streaming over free space optical channels

    Free space optical (FSO) systems are capable of supporting high data rates between fixed points in the context of flawless video communications. Layered video coding facilitates the creation of different-resolution subset layers for variable-throughput transmission scenarios. In this paper, we propose Historical information Aware Unequal Error Protection (HA-UEP) for the scalable high efficiency video codec (SHVC) used for streaming over FSO channels. Specifically, the objective function (OF) of the current video frame is designed based on historical information of its dependent frames. By optimizing this OF, specific subset layers may be selected in conjunction with carefully selected forward error correction (FEC) coding rates, where the expected video distortion is minimized and the required bitrate is reduced under the constraint of a specific throughput. Our simulation results show that the proposed system outperforms the traditional equal error protection (EEP) scheme by about 4.5 dB of Eb/N0 at a peak signal-to-noise ratio (PSNR) of 33 dB. From a throughput-oriented perspective, HA-UEP is capable of reducing the throughput to about 30% of that of the EEP benchmarker, while achieving an Eb/N0 gain of 4.5 dB.
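
The abstract above reports its gains at a fixed PSNR of 33 dB. As a reminder of what that figure measures, here is a minimal sketch of the standard PSNR formula, 10·log10(peak² / MSE). The function name `psnr` and the default 8-bit peak of 255 are assumptions made for illustration; the abstract does not state which bit depth or averaging convention was used.

```python
import math

def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences.

    `peak` is the maximum sample value (255 assumes 8-bit video, an assumption
    not taken from the abstract).
    """
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

With this definition, an error of one grey level on every 8-bit sample gives roughly 48.13 dB, so a PSNR of 33 dB corresponds to a considerably larger mean squared error.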

    Algorithms and Hardware Co-Design of HEVC Intra Encoders

    The importance of digital video has greatly increased over the last two decades. Due to the rapid development of information and communication technologies, the demand for Ultra-High Definition (UHD) video applications is becoming stronger. However, the most prevalent video compression standard, H.264/AVC, released in 2003, is inefficient when it comes to UHD videos. The increasing desire for compression efficiency superior to that of H.264/AVC led to the standardization of High Efficiency Video Coding (HEVC). Compared with the H.264/AVC standard, HEVC offers double the compression ratio at the same level of video quality, or a substantial improvement in video quality at the same video bitrate. Although HEVC/H.265 possesses superior compression efficiency, its complexity is several times that of H.264/AVC, impeding its high-throughput implementation. Currently, most researchers have focused merely on algorithm-level adaptations of the HEVC/H.265 standard to reduce computational intensity without considering hardware feasibility. Moreover, the exploration of efficient hardware architecture design is not exhaustive: only a few research works have explored efficient hardware architectures for the HEVC/H.265 standard. In this dissertation, we investigate efficient algorithm adaptations and hardware architecture design of HEVC intra encoders. We also explore the deep learning approach in mode prediction. From the algorithm point of view, we propose three efficient hardware-oriented algorithm adaptations, including mode reduction, fast coding unit (CU) cost estimation, and group-based CABAC (context-adaptive binary arithmetic coding) rate estimation. Mode reduction aims to reduce the mode candidates of each prediction unit (PU) in the rate-distortion optimization (RDO) process, which is both computation-intensive and time-consuming. 
Fast CU cost estimation is applied to reduce the complexity in rate-distortion (RD) calculation of each CU. Group-based CABAC rate estimation is proposed to parallelize syntax elements processing to greatly improve rate estimation throughput. From the hardware design perspective, a fully parallel hardware architecture of HEVC intra encoder is developed to sustain UHD video compression at 4K@30fps. The fully parallel architecture introduces four prediction engines (PE) and each PE performs the full cycle of mode prediction, transform, quantization, inverse quantization, inverse transform, reconstruction, rate-distortion estimation independently. PU blocks with different PU sizes will be processed by the different prediction engines (PE) simultaneously. Also, an efficient hardware implementation of a group-based CABAC rate estimator is incorporated into the proposed HEVC intra encoder for accurate and high-throughput rate estimation. To take advantage of the deep learning approach, we also propose a fully connected layer based neural network (FCLNN) mode preselection scheme to reduce the number of RDO modes of luma prediction blocks. All angular prediction modes are classified into 7 prediction groups. Each group contains 3-5 prediction modes that exhibit a similar prediction angle. A rough angle detection algorithm is designed to determine the prediction direction of the current block, then a small scale FCLNN is exploited to refine the mode prediction

    Depth sequence coding with hierarchical partitioning and spatial-domain quantization

    Depth coding in 3D-HEVC deforms object shapes due to block-level edge-approximation and lacks efficient techniques to exploit the statistical redundancy, due to the frame-level clustering tendency in depth data, for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial-domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. The BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed one has achieved significantly lower bitrate at lossless to near-lossless quality range for mono-view coding and rendered superior quality synthetic views from the depth maps, compressed at the same bitrate, and the corresponding texture frames. © 1991-2012 IEEE

    Efficient Coding of Transform Coefficient Levels in Hybrid Video Coding

    All video coding standards of practical importance, such as Advanced Video Coding (AVC), its successor High Efficiency Video Coding (HEVC), and the state-of-the-art Versatile Video Coding (VVC), follow the basic principle of block-based hybrid video coding. In such an architecture, the video pictures are partitioned into blocks. Each block is first predicted by either intra-picture or motion-compensated prediction, and the resulting prediction errors, referred to as residuals, are compressed using transform coding. This thesis deals with the entropy coding of quantization indices for transform coefficients, also referred to as transform coefficient levels, as well as the entropy coding of directly quantized residual samples. The entropy coding of quantization indices is referred to as level coding in this thesis. The presented developments focus on both improving the coding efficiency and reducing the complexity of the level coding for HEVC and VVC. These goals were achieved by modifying the context modeling and the binarization of the level coding. The first development presented in this thesis is a transform coefficient level coding for variable transform block sizes, which was introduced in HEVC. It exploits the fact that non-zero levels are typically concentrated in certain parts of the transform block by partitioning blocks larger than 4×4 samples into 4×4 sub-blocks. Each 4×4 sub-block is then coded similarly to the level coding specified in AVC for 4×4 transform blocks. This sub-block processing improves coding efficiency and has the advantage that the number of required context models is independent of the set of supported transform block sizes. The maximum number of context-coded bins for a transform coefficient level is one indicator for the complexity of the entropy coding. 
An adaptive binarization of absolute transform coefficient levels using Rice codes is presented that reduces the maximum number of context-coded bins from 15 (as used in AVC) to three for HEVC. Based on the developed selection of an appropriate Rice code for each scanning position, this adaptive binarization achieves virtually the same coding efficiency as the binarization specified in AVC for bit-rate operation points typically used in consumer applications. The coding efficiency is improved for high bit-rate operation points, which are used in more advanced and professional applications. In order to further improve the coding efficiency for HEVC and VVC, the statistical dependencies among the transform coefficient levels of a transform block are exploited by a template-based context modeling developed in this thesis. Instead of selecting the context model for a current scanning position primarily based on its location inside a transform block, already coded neighboring locations inside a local template are utilized. To further increase the coding efficiency achieved by the template-based context modeling, the different coding phases of the initially developed level coding are merged into a single coding phase. As a consequence, the template-based context modeling can utilize the absolute levels of the neighboring frequency locations, which provides better conditional probability estimates and further improves coding efficiency. This template-based context modeling with a single coding phase is also suitable for trellis-coded quantization (TCQ), since TCQ is state-driven and derives the next state from the current state and the parity of the current level. TCQ introduces different context model sets for coding the significance flag depending on the current state. Based on statistical analyses, an extension of the state-dependent context modeling of TCQ is presented, which further improves the coding efficiency in VVC. 
After that, a method to reduce the complexity of the level coding at the decoder is presented. This method separates the level coding into a coding phase exclusively consisting of context-coded bins and another one consisting of bypass-coded bins only. For retaining the state-dependent context selection, which significantly contributes to the coding efficiency of TCQ, a dedicated parity flag is introduced and coded with context models in the first coding phase. An adaptive approach is then presented that further reduces the worst-case complexity, effectively lowering the maximum number of context-coded bins per transform coefficient to 1.75 without negatively affecting the coding efficiency. In the last development presented in this thesis, a dedicated level coding for transform skip blocks, which often occur in screen content applications, is introduced for VVC. This dedicated level coding better exploits the statistical properties of directly quantized residual samples for screen content. Various modifications to the level coding improve the coding efficiency for this type of content. Examples for these modifications are a binarization with additional context-coded flags and the coding of the sign information with adaptive context models
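
The adaptive binarization with Rice codes mentioned in this abstract can be illustrated with a plain Golomb-Rice code: a value is split into a quotient, sent in unary, and a k-bit remainder. The sketch below is a textbook Golomb-Rice coder under stated assumptions, not the binarization specified in HEVC or VVC, which additionally limits the prefix length, uses an escape code, and adapts the Rice parameter per scanning position. The function names `rice_encode` and `rice_decode` are hypothetical.

```python
def rice_encode(value, k):
    """Golomb-Rice code of a non-negative integer with Rice parameter k.

    Returns a bit string: the quotient (value >> k) in unary as '1's closed
    by a '0', followed by the k low-order bits of the value.
    """
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    quotient = value >> k
    bits = "1" * quotient + "0"
    if k:
        # Remainder as exactly k bits, zero-padded on the left.
        bits += format(value & ((1 << k) - 1), "0{}b".format(k))
    return bits


def rice_decode(bits, k):
    """Inverse of rice_encode for a bit string holding exactly one codeword."""
    quotient = bits.index("0")  # number of leading '1's
    remainder = int(bits[quotient + 1:quotient + 1 + k], 2) if k else 0
    return (quotient << k) | remainder
```

A larger k shortens the unary prefix for large levels at the cost of a longer fixed suffix, which is why choosing k per scanning position matters for coding efficiency.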

    A comprehensive video codec comparison

    In this paper, we compare the video codecs AV1 (version 1.0.0-2242 from August 2019), HEVC (HM and x265), AVC (x264), the exploration software JEM which is based on HEVC, and the VVC (successor of HEVC) test model VTM (version 4.0 from February 2019) under two fair and balanced configurations: All Intra for the assessment of intra coding and Maximum Coding Efficiency with all codecs being tuned for their best coding efficiency settings. VTM achieves the highest coding efficiency in both configurations, followed by JEM and AV1. The worst coding efficiency is achieved by x264 and x265, even in the placebo preset for highest coding efficiency. AV1 gained a lot in terms of coding efficiency compared to previous versions and now outperforms HM by 24% BD-Rate gains. VTM gains 5% over AV1 in terms of BD-Rates. By reporting separate numbers for JVET and AOM test sequences, it is ensured that no bias in the test sequences exists. When comparing only intra coding tools, it is observed that the complexity increases exponentially for linearly increasing coding efficiency

    Application-Specific Cache and Prefetching for HEVC CABAC Decoding

    Context-based Adaptive Binary Arithmetic Coding (CABAC) is the entropy coding module in the HEVC/H.265 video coding standard. As in its predecessor, H.264/AVC, CABAC is a well-known throughput bottleneck due to its strong data dependencies. Besides other optimizations, the replacement of the context model memory by a smaller cache has been proposed for hardware decoders, resulting in an improved clock frequency. However, the effect of potential cache misses has not been properly evaluated. This work fills the gap by performing an extensive evaluation of different cache configurations. Furthermore, it demonstrates that application-specific context model prefetching can effectively reduce the miss rate and increase the overall performance. The best results are achieved with two cache lines consisting of four or eight context models. The 2 × 8 cache allows a performance improvement of 13.2 percent to 16.7 percent compared to a non-cached decoder due to a 17 percent higher clock frequency and highly effective prefetching. The proposed HEVC/H.265 CABAC decoder allows the decoding of high-quality Full HD videos in real-time using few hardware resources on a low-power FPGA. Funding: EC/H2020/645500/EU/Improving European VoD Creative Industry with High Efficiency Video Delivery/Film26
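
To make the cache configurations in this abstract concrete, the following sketch counts misses for a small context-model cache given a sequence of context indices. Several details are assumptions for illustration only: the function name `count_misses`, the mapping of consecutive context indices to one cache line, and least-recently-used replacement. The abstract does not describe the paper's actual memory layout or replacement policy, and prefetching is not modelled here.

```python
from collections import OrderedDict

def count_misses(accesses, num_lines=2, models_per_line=8):
    """Count cache misses for a sequence of context-model indices.

    Assumed layout: each line holds `models_per_line` consecutive context
    models, so the line tag is index // models_per_line. Assumed policy: LRU.
    """
    cache = OrderedDict()  # line tag -> True, ordered from least to most recent
    misses = 0
    for index in accesses:
        tag = index // models_per_line
        if tag in cache:
            cache.move_to_end(tag)  # hit: mark line as most recently used
        else:
            misses += 1
            if len(cache) >= num_lines:
                cache.popitem(last=False)  # evict least recently used line
            cache[tag] = True
    return misses
```

For the access sequence `[0, 1, 7, 8, 9, 16, 0]` with the default 2 × 8 configuration, this model reports four misses out of seven accesses; a prefetcher of the kind the abstract describes would aim to load such lines before they are needed.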

    Improved Spectrum Usage with Multi-RF Channel Aggregation Technologies for the Next-Generation Terrestrial Broadcasting

    [EN] Next-generation terrestrial broadcasting aims to enhance spectral efficiency to overcome the challenges derived from the spectrum shortage that results from the progressive allocation of frequencies - the so-called Digital Dividend - to satisfy the growing demand for wireless broadband capacity. Advances in both transmission standards and video coding are paramount to enable the progressive roll-out of high video quality services such as HDTV (High Definition Television) or Ultra HDTV. The transition to the second-generation European terrestrial standard DVB-T2 and the introduction of MPEG-4/AVC video coding already enable the transmission of 4-5 HDTV services per RF (Radio Frequency) channel. However, the impossibility of allocating a higher bit rate within the remaining spectrum could jeopardize the evolution of the DTT platforms in favour of other high-capacity systems such as the satellite or cable distribution platforms. The next steps focus on the deployment of the recently released High Efficiency Video Coding (HEVC) standard, which provides more than 50% coding gain with respect to AVC, together with the next-generation terrestrial standards. This could ensure the competitiveness of DTT. This dissertation addresses the use of multi-RF channel aggregation technologies to increase the spectral efficiency of future DTT networks. The core of the thesis comprises two technologies: Time Frequency Slicing (TFS) and Channel Bonding (CB). TFS and CB consist of transmitting the data of a TV service across multiple RF channels instead of using a single channel. CB spreads the data of a service over multiple classical RF channels (RF-Mux). TFS spreads the data by time-slicing (slot-by-slot) across multiple RF channels, which are sequentially recovered at the receiver by frequency hopping. Transmissions using these features can benefit from capacity and coverage gains. 
The capacity gain comes from a more efficient statistical multiplexing (StatMux) for Variable Bit Rate (VBR) services due to a StatMux pool over a higher number of services. Furthermore, CB allows the service data rate to increase with the number of bonded RF channels and also offers advantages when combined with SVC (Scalable Video Coding). The coverage gain comes from the increased RF performance due to the reception of the data of a service from different RF channels rather than a single one that could, eventually, be degraded. Robustness against interference is also improved, since the received signal does not depend on a single, potentially interfered RF channel. TFS was first introduced as an informative annex in DVB-T2 (not normative) and adopted in DVB-NGH (Next Generation Handheld). TFS and CB are proposed for inclusion in ATSC 3.0. However, they have never been implemented. The investigations carried out in this dissertation employ an information-theoretical approach to obtain their upper bounds, physical layer simulations to evaluate the performance in real systems, and the analysis of field measurements that approach realistic conditions of the network deployments. The analyses report coverage gains of about 4-5 dB with 4 RF channels and high capacity gains already with 2 RF channels. This dissertation also focuses on implementation aspects. Channel bonding receivers require one tuner per bonded RF channel. The implementation of TFS with a single tuner demands the fulfilment of several timing requirements. 
However, the use of just two tuners would still allow for good performance with a cost-effective implementation by the reuse of existing chipsets or the sharing of existing architectures with dual tuner operation such as MIMO (Multiple Input Multiple Output).
    Giménez Gandia, JJ. (2015). Improved Spectrum Usage with Multi-RF Channel Aggregation Technologies for the Next-Generation Terrestrial Broadcasting [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/52520

    Efficient algorithms for scalable video coding

    A scalable video bitstream specifically designed for the needs of various client terminals, network conditions, and user demands is much desired in current and future video transmission and storage systems. The scalable extension of the H.264/AVC standard (SVC) has been developed to satisfy the new challenges posed by heterogeneous environments, as it permits a single video stream to be decoded fully or partially with variable quality, resolution, and frame rate in order to adapt to a specific application. This thesis presents novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute Difference (MAD) prediction model. The proposed fast inter-frame and inter-layer mode selection algorithm is based on the empirical observation that a macroblock (MB) with slow movement is more likely to be best matched by one in the same resolution layer. However, for a macroblock with fast movement, motion estimation between layers is required. Simulation results show that the algorithm can reduce the encoding time by up to 40%, with negligible degradation in RD performance. The proposed hierarchical fast mode selection scheme comprises four levels and makes full use of inter-layer, temporal and spatial correlation as well as the texture information of each macroblock. Overall, the new technique demonstrates the same coding performance in terms of picture quality and compression ratio as that of the SVC standard, yet produces a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode selection algorithms, the proposed algorithm achieves a superior computational time reduction under very similar RD performance conditions. 
The existing SVC rate distortion model cannot accurately represent the RD properties of the prediction modes, because it is influenced by the use of inter-layer prediction. A separate RD model for inter-layer prediction coding in the enhancement layer(s) is therefore introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34 dB or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy is maintained to within 0.07% on average. As a MAD prediction error always exists and cannot be avoided, an optimised MAD prediction model for the spatial enhancement layers is proposed that considers the MAD from previous temporal frames and previous spatial frames together, to achieve a more accurate MAD prediction. Simulation results indicate that the proposed MAD prediction model reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation.
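
The Mean Absolute Difference (MAD) that this abstract's prediction model targets can be sketched as follows. The function `mad` computes the usual per-sample mean of absolute differences between an original and a predicted block. The function `blend_mad_prediction` and its `temporal_weight` parameter are purely hypothetical: they only illustrate the idea of considering a temporal and a spatial MAD estimate together, and do not reproduce the thesis's model or the JVT-W043 implementation, whose details the abstract does not give.

```python
def mad(original, predicted):
    """Mean absolute difference between two equally sized 2-D sample blocks."""
    total = 0
    count = 0
    for row_o, row_p in zip(original, predicted):
        for o, p in zip(row_o, row_p):
            total += abs(o - p)
            count += 1
    if count == 0:
        raise ValueError("blocks must not be empty")
    return total / count


def blend_mad_prediction(temporal_mad, spatial_mad, temporal_weight=0.5):
    """Hypothetical predictor: weighted mix of a temporal and a spatial MAD.

    The linear form and the default weight are assumptions for illustration.
    """
    if not 0.0 <= temporal_weight <= 1.0:
        raise ValueError("temporal_weight must lie in [0, 1]")
    return temporal_weight * temporal_mad + (1.0 - temporal_weight) * spatial_mad
```

For example, `mad([[1, 2], [3, 4]], [[1, 1], [5, 4]])` averages the absolute differences 0, 1, 2 and 0 over four samples.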