23 research outputs found

    Fast Depth and Inter Mode Prediction for Quality Scalable High Efficiency Video Coding

    Scalable high efficiency video coding (SHVC) is an extension of high efficiency video coding (HEVC) that introduces multiple layers and inter-layer prediction, and thus significantly increases coding complexity on top of the already complicated HEVC encoder. In inter prediction for quality SHVC, in order to determine the best possible mode at each depth level, a coding tree unit can be recursively split into four depth levels, and the candidate modes at each level include merge mode, inter2Nx2N, inter2NxN, interNx2N, interNxN, inter2NxnU, inter2NxnD, internLx2N and internRx2N, intra modes and inter-layer reference (ILR) mode. This exhaustive search obtains the highest coding efficiency, but also results in very high coding complexity. Therefore, it is crucial to improve coding speed while maintaining coding efficiency. In this research, we propose a new depth level and inter mode prediction algorithm for quality SHVC. First, the depth level candidates are predicted based on inter-layer correlation, spatial correlation and their degree of correlation. Second, for a given depth candidate, we divide mode prediction into square and non-square mode predictions. Third, in the square mode prediction, ILR and merge modes are predicted according to depth correlation, and the search is terminated early depending on whether the residual distribution follows a Gaussian distribution. Moreover, ILR mode, merge mode and inter2Nx2N are terminated early based on significant differences in rate distortion (RD) costs. Fourth, if the early termination condition cannot be satisfied, non-square modes are further predicted based on significant differences in the expected values of residual coefficients. Finally, inter-layer and spatial correlations are combined with the residual distribution to decide whether to terminate depth selection early. Experimental results demonstrate that, on average, the proposed algorithm achieves a time saving of 71.14% with a bit rate increase of 1.27%.
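    The early-termination idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the fixed threshold stands in for the statistical tests the abstract describes, and the names `select_mode`, `rd_cost` and `early_stop_threshold` are hypothetical. It only shows how checking candidate modes in order and stopping once the RD cost is good enough avoids evaluating the remaining modes.

```python
from typing import Callable, List, Sequence, Tuple


def select_mode(
    modes: Sequence[str],
    rd_cost: Callable[[str], float],
    early_stop_threshold: float,
) -> Tuple[str, float, List[str]]:
    """Test candidate modes in order and stop early when the best cost is low enough.

    `early_stop_threshold` is a hypothetical fixed cutoff used for illustration;
    the abstract describes statistical tests instead of a constant.
    """
    best_mode, best_cost = "", float("inf")
    tested: List[str] = []
    for mode in modes:
        cost = rd_cost(mode)  # in a real encoder this is the expensive step
        tested.append(mode)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        if best_cost <= early_stop_threshold:
            break  # remaining modes are skipped
    return best_mode, best_cost, tested


# Made-up RD costs for four of the modes named in the abstract.
example_costs = {"ILR": 120.0, "merge": 80.0, "inter2Nx2N": 95.0, "inter2NxN": 70.0}
order = ["ILR", "merge", "inter2Nx2N", "inter2NxN"]
early = select_mode(order, example_costs.__getitem__, early_stop_threshold=90.0)
full = select_mode(order, example_costs.__getitem__, early_stop_threshold=0.0)
```

    With the made-up costs above, the early-stopping call evaluates only two modes, while the exhaustive call evaluates all four and finds a slightly cheaper mode. This is the speed versus efficiency trade-off that the reported time saving and bit rate increase quantify.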

    Efficient Coding Tree Unit (CTU) Decision Method for Scalable High-Efficiency Video Coding (SHVC) Encoder

    High-efficiency video coding (HEVC or H.265) is the latest video compression standard developed by the joint collaborative team on video coding (JCT-VC), finalized in 2013. HEVC can achieve an average bit rate decrease of 50% in comparison with H.264/AVC while still maintaining video quality. To adapt HEVC for use in heterogeneous access networks, the JCT-VC approved the scalable extension of HEVC (SHVC) in July 2014. SHVC can achieve the highest coding efficiency but requires such high computational complexity that its real-time application is limited. To reduce the encoding complexity of SHVC, in this chapter we employ the temporal-spatial and inter-layer correlations between the base layer (BL) and the enhancement layer (EL) to predict the best quadtree of the coding tree unit (CTU) for quality SHVC. Because a high correlation exists between layers, we utilize the coded information from the CTU quadtree in the BL, including inter-layer intra/residual prediction and inter-layer motion parameter prediction, to predict the CTU quadtree in the EL. We therefore develop an efficient CTU decision method by combining a temporal-spatial searching order algorithm (TSSOA) in the BL and a fast inter-layer searching algorithm (FILSA) in the EL to speed up the encoding process of SHVC. The simulation results show that the proposed CTU decision method can achieve an average time improving ratio (TIR) of about 52–78% and 47–69% for low delay (LD) and random access (RA) configurations, respectively. The proposed method thus efficiently reduces the computational complexity of the SHVC encoder with negligible loss of coding efficiency across various types of video sequences.

    Hybrid strategies for efficient intra prediction in spatial SHVC

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. With multi-layer encoding and inter-layer prediction, Spatial Scalable High Efficiency Video Coding (SSHVC) has extremely high coding complexity. It is crucial to speed up its coding to promote widespread and cost-effective SSHVC applications. Specifically, we first reveal that the average RD cost of Inter-layer Reference (ILR) mode is different from that of Intra mode, but that both follow a Gaussian distribution. Based on this discovery, we apply the classic Gaussian Mixture Model and Expectation Maximization to determine whether ILR mode is the best mode, in which case Intra mode is skipped. Second, when coding units (CUs) in the enhancement layer use Intra mode, this indicates that very simple texture is present. We investigate their Directional Mode (DM) distribution, divide all DMs into three classes, and then develop a different method for each class to progressively predict the best DMs. Third, by jointly considering rate distortion costs, residual coefficients and neighboring CUs, we propose to employ the Conditional Random Fields model to terminate depth selection early. Experimental results demonstrate that the proposed algorithm can significantly improve coding speed with negligible coding efficiency losses.
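    As a rough illustration of the Gaussian Mixture Model and Expectation Maximization step mentioned above, the following sketch fits a two-component one-dimensional mixture to RD cost samples in pure Python. It is an assumption-laden toy, not the paper's procedure: treating component 0 as the ILR population, the sample values, and the function names are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(
    samples: Sequence[float], iterations: int = 50
) -> Tuple[List[float], List[float], List[float]]:
    """Fit weights, means and variances of a two-component mixture with EM."""
    n = len(samples)
    overall_mean = sum(samples) / n
    start_var = max(sum((x - overall_mean) ** 2 for x in samples) / n, 1e-6)
    weights = [0.5, 0.5]
    means = [min(samples), max(samples)]  # component 0 starts at the low end
    variances = [start_var, start_var]
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, samples)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, samples)) / nk
            variances[k] = max(spread, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


def posterior_low_cost(x: float, weights, means, variances) -> float:
    """Posterior probability that x belongs to component 0 (the low-cost one)."""
    p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
    total = sum(p)
    return p[0] / total if total > 0 else 0.5


# Made-up RD costs: one cluster near 100, one near 200.
costs = [95.0, 98.0, 100.0, 102.0, 105.0, 195.0, 198.0, 200.0, 202.0, 205.0]
w, m, v = fit_two_component_gmm(costs)
```

    An encoder using this idea could skip the Intra check for a CU whose observed cost has a high posterior under the component associated with ILR; where to put that cutoff is a design choice the abstract does not specify.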

    Fast Mode Assignment for Quality Scalable Extension of the High Efficiency Video Coding (HEVC) Standard: A Bayesian Approach

    The new compression standard, known as High Efficiency Video Coding (HEVC), aims at significantly improving compression efficiency compared to previous standards. There has been significant interest in developing a scalable version of this standard. As expected, the scalable version of HEVC, called SHVC, increases the complexity of the codec compared to its non-scalable counterpart. In this paper, we propose an adaptive fast mode assignment method based on a Bayesian classifier that reduces SHVC's coding complexity by up to 68.55% while maintaining the overall quality and bit rates.

    Quality of Experience (QoE)-Aware Fast Coding Unit Size Selection for HEVC Intra-prediction

    The exorbitant increase in the computational complexity of modern video coding standards, such as High Efficiency Video Coding (HEVC), is a compelling challenge for resource-constrained consumer electronic devices. For instance, the brute force evaluation of all possible combinations of available coding modes and quadtree-based coding structures in HEVC, performed to determine the optimum set of coding parameters for a given content, demands a substantial amount of computational and energy resources. Thus, the resource requirements for real-time operation of HEVC have become a contributing factor in the Quality of Experience (QoE) of the end users of emerging multimedia and future internet applications. In this context, this paper proposes a content-adaptive Coding Unit (CU) size selection algorithm for HEVC intra-prediction. The proposed algorithm builds content-specific weighted Support Vector Machine (SVM) models in real time during the encoding process to provide an early estimate of CU size for a given content, avoiding the brute force evaluation of all possible coding mode combinations in HEVC. The experimental results demonstrate an average encoding time reduction of 52.38%, with an average Bjøntegaard Delta Bit Rate (BDBR) increase of 1.19% compared to the HM16.1 reference encoder. Furthermore, the perceptual visual quality assessments conducted through the Video Quality Metric (VQM) show that the proposed algorithm has minimal impact on the visual quality of the reconstructed videos compared to state-of-the-art approaches.

    Rinnakkainen toteutus H.265 videokoodaus standardille (Parallel implementation for the H.265 video coding standard)

    The objective of this study was to research the scalability of the parallel features in the new H.265 video compression standard, also known as High Efficiency Video Coding (HEVC). Compared to its predecessor, the H.264 standard, H.265 typically achieves around 50% bitrate reduction for the same subjective video quality. Videos with higher resolution (Full HD and beyond) in particular achieve better compression ratios. The standard also provides better utilization of parallel computing resources. H.265 introduces two novel parallelization features: Tiles and Wavefront Parallel Processing (WPP). In Tiles, each video frame is divided into areas that can be decoded without referencing other areas in the same frame. In WPP, the relations between code blocks in a frame are encoded so that the decoding process can progress through the frame as a front using multiple threads. In this study, the reference implementation of the H.265 decoder was augmented to support both of these parallelization features. The performance of the parallel implementations was measured using three different setups. The measurement results showed that adding CPU cores reduced the total decode time of the video frames up to a certain point. When using the Tiles feature, it was observed that the encoding geometry, i.e. how each frame was divided into individually decodable areas, had a noticeable effect on the decode times with certain thread counts. When using WPP, it was observed that moving to larger thread counts (4-12) sometimes slowed decoding down, mainly because of the time spent on synchronization between threads.
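    The wavefront progression described above can be sketched with a small scheduling model. This is an idealized illustration, not the thesis implementation: it assumes every block takes one time unit, that there is one thread per row, and that a block needs its left neighbor and the block above and to the right to be finished (the commonly described WPP dependency, which the abstract itself does not spell out). The function name is hypothetical.

```python
from typing import List


def wavefront_schedule(rows: int, cols: int) -> List[List[int]]:
    """Return the earliest finish time of each block under an idealized WPP model.

    Assumptions (for illustration only): unit decode time per block, unlimited
    threads, zero synchronization cost, and each block depends on its left
    neighbor and on the block above and to the right.
    """
    finish = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = 0
            if c > 0:
                ready = max(ready, finish[r][c - 1])  # left neighbor
            if r > 0:
                above_right = min(c + 1, cols - 1)  # clamp at the frame edge
                ready = max(ready, finish[r - 1][above_right])
            finish[r][c] = ready + 1
    return finish


schedule = wavefront_schedule(rows=3, cols=4)
total_parallel_steps = schedule[-1][-1]
total_sequential_steps = 3 * 4
```

    In this model each row starts two steps after the row above it, so a 3 by 4 frame finishes in 8 steps instead of 12. The model ignores synchronization cost, which is exactly the overhead the study found to hurt decode times at higher thread counts.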

    Receiver-Driven Video Adaptation

    In the span of a single generation, video technology has made an incredible impact on daily life. Modern use cases for video are wildly diverse, including teleconferencing, live streaming, virtual reality, home entertainment, social networking, surveillance, body cameras, cloud gaming, and autonomous driving. As these applications continue to grow more sophisticated and heterogeneous, a single representation of video data can no longer satisfy all receivers. Instead, the initial encoding must be adapted to each receiver's unique needs. Existing adaptation strategies are fundamentally flawed, however, because they discard the video's initial representation and force the content to be re-encoded from scratch. This process is computationally expensive, does not scale well with the number of videos produced, and throws away important information embedded in the initial encoding. Therefore, a compelling need exists for the development of new strategies that can adapt video content without fully re-encoding it. To better support the unique needs of smart receivers, diverse displays, and advanced applications, general-use video systems should produce and offer receivers a more flexible compressed representation that supports top-down adaptation strategies from an original, compressed-domain ground truth. This dissertation proposes an alternate model for video adaptation that addresses these challenges. The key idea is to treat the initial compressed representation of a video as the ground truth, and allow receivers to drive adaptation by dynamically selecting which subsets of the captured data to receive. In support of this model, three strategies for top-down, receiver-driven adaptation are proposed. First, a novel, content-agnostic entropy coding technique is implemented in which symbols are selectively dropped from an input abstract symbol stream based on their estimated probability distributions to hit a target bit rate. 
Receivers are able to guide the symbol dropping process by supplying the encoder with an appropriate rate controller algorithm that fits their application needs and available bandwidths. Next, a domain-specific adaptation strategy is implemented for H.265/HEVC coded video in which the prediction data from the original source is reused directly in the adapted stream, but the residual data is recomputed as directed by the receiver. By tracking the changes made to the residual, the encoder can compensate for decoder drift to achieve near-optimal rate-distortion performance. Finally, a fully receiver-driven strategy is proposed in which the syntax elements of a pre-coded video are cataloged and exposed directly to clients through an HTTP API. Instead of requesting the entire stream at once, clients identify the exact syntax elements they wish to receive using a carefully designed query language. Although an implementation of this concept is not provided, an initial analysis shows that such a system could save bandwidth and computation when used by certain targeted applications. Doctor of Philosoph

    High-Level Synthesis Based VLSI Architectures for Video Coding

    High Efficiency Video Coding (HEVC) is a state-of-the-art video coding standard. Emerging applications such as free-viewpoint video, 360-degree video, augmented reality and 3D movies require standardized extensions of HEVC. These extensions include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC+ Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications such as view synthesis generation and free-viewpoint video. The depth maps coded and transmitted in 3D-HEVC are used for virtual view synthesis by algorithms such as Depth Image Based Rendering (DIBR). As a first step, we profiled the 3D-HEVC standard and identified its computationally intensive parts for efficient hardware implementation. One of the computationally intensive parts of 3D-HEVC, HEVC and H.264/AVC is the interpolation filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. The Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of digital systems has greatly increased. High-Level Synthesis is a methodology that offers great benefits: late architectural or functional changes can be made without time-consuming rewriting of RTL code, algorithms can be tested and evaluated early in the design cycle, and accurate models can be developed against which the final hardware can be verified.
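    To make the interpolation filtering step concrete, here is a minimal software sketch of half-sample luma interpolation along one row, using the 8-tap HEVC half-sample coefficients (-1, 4, -11, 40, 40, -11, 4, -1), which sum to 64. It is a simplification and not the HLS design from this work: it filters in one dimension only, rounds in a single stage, and pads by repeating edge samples. The function name is hypothetical.

```python
from typing import List, Sequence

# HEVC luma half-sample filter taps; they sum to 64, hence the shift by 6 below.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_pel_row(samples: Sequence[int], bit_depth: int = 8) -> List[int]:
    """Interpolate the half-sample positions between neighboring integer samples.

    Simplified for illustration: one-dimensional, single rounding stage,
    edge samples repeated as padding.
    """
    n = len(samples)
    max_val = (1 << bit_depth) - 1
    out = []
    for i in range(n - 1):  # position halfway between samples i and i + 1
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            idx = min(max(i - 3 + k, 0), n - 1)  # clamp index at the row edges
            acc += tap * samples[idx]
        value = (acc + 32) >> 6  # round and normalize by 64
        out.append(min(max(value, 0), max_val))  # clip to the valid sample range
    return out


flat = half_pel_row([100] * 8)
ramp = half_pel_row(list(range(0, 80, 10)))
```

    The inner multiply-accumulate loop has a fixed trip count of eight, which is the kind of structure HLS tools can unroll and pipeline; how the actual design was partitioned is not described in the abstract.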