18 research outputs found

    Adaptive Quantisation in HEVC for Contouring Artefacts Removal in UHD Content

    Contouring artefacts affect the visual experience of particular types of compressed Ultra High Definition (UHD) sequences characterised by smoothly textured areas and gradual transitions in pixel values. This paper proposes a technique to adjust the quantisation process at the encoder so that contouring artefacts are avoided. The devised method does not require any change at the decoder side and introduces a negligible coding rate increment (up to 3.4% for the same objective quality). This result compares favourably with the average 11.2% bit-rate penalty introduced by a method where the quantisation step is reduced in contour-prone areas.
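
    To make the comparison baseline concrete, here is a minimal Python sketch of the reference approach mentioned above: flag contour-prone (smooth, slowly varying) blocks and reduce the quantisation step there via a negative QP offset. The block size, gradient thresholds, and offset value are illustrative assumptions, not values from the paper.

```python
import numpy as np

def contour_prone_map(luma, block=64, grad_thresh=2.0, smooth_frac=0.6):
    """Flag blocks dominated by small but non-zero gradients, i.e. the
    gradual transitions where banding appears after coarse quantisation.
    Thresholds are illustrative, not taken from the paper."""
    gy, gx = np.gradient(luma.astype(np.float64))
    mag = np.hypot(gx, gy)
    h, w = luma.shape
    flags = np.zeros((h // block, w // block), dtype=bool)
    for by in range(h // block):
        for bx in range(w // block):
            m = mag[by * block:(by + 1) * block, bx * block:(bx + 1) * block]
            gentle = (m > 0.0) & (m < grad_thresh)  # slowly varying pixels
            flags[by, bx] = gentle.mean() > smooth_frac
    return flags

def qp_map(flags, base_qp=32, delta=-4):
    """Reference-style adaptation: lower QP (finer quantisation) only in
    contour-prone blocks, at the cost of extra rate in those areas."""
    return np.where(flags, base_qp + delta, base_qp)
```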

    Video compression algorithms for HEVC and beyond

    Due to the increasing number of new services and devices that allow the creation, distribution and consumption of video content, the amount of video information being transmitted all over the world is constantly growing. Video compression technology is essential to cope with the ever-increasing volume of digital video data being distributed in today's networks, as more efficient video compression techniques allow support for higher volumes of video data under the same memory/bandwidth constraints. This is especially relevant with the introduction of new and more immersive video formats associated with significantly higher amounts of data. In this thesis, novel techniques for improving the efficiency of current and future video coding technologies are investigated. Several aspects that influence the way conventional video coding methods work are considered. In particular, the properties and limitations of the Human Visual System are exploited to tune the performance of video encoders towards better subjective quality. Additionally, it is shown how the visibility of specific types of visual artefacts can be prevented during the video encoding process, in order to avoid subjective quality degradation in the compressed content. Techniques for higher video compression efficiency are also explored, aiming to improve the compression capabilities of state-of-the-art video coding standards. Finally, the application of video coding technologies to practical use cases is considered. Accurate estimation models are devised to control the encoding time and bit rate associated with compressed video signals, in order to meet specific encoding time and transmission time restrictions.

    High efficiency compression for object detection

    Image and video compression has traditionally been tailored to human vision. However, modern applications such as visual analytics and surveillance rely on computers seeing and analyzing images before (or instead of) humans. For these applications, it is important to adapt compression to computer vision. In this paper we present a bit allocation and rate control strategy that is tailored to object detection. Using the initial convolutional layers of a state-of-the-art object detector, we create an importance map that can guide bit allocation to areas that are important for object detection. The proposed method enables bit rate savings of 7% or more compared to default HEVC, at an equivalent object detection rate. (Published in IEEE ICASSP 2018.)
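
    A rough Python sketch of the idea: pool early-layer detector activations into a CTU-level importance map and turn it into delta-QP values. The pooling, normalisation, and offset mapping are assumptions for illustration; the paper's exact construction may differ.

```python
import numpy as np

def ctu_importance(activations, ctu=64):
    """Aggregate early-layer detector activations into a per-CTU
    importance map. `activations` is a (C, H, W) array; the layer choice
    and mean pooling are illustrative assumptions."""
    energy = np.abs(activations).sum(axis=0)          # (H, W) activation energy
    h, w = energy.shape
    ny, nx = h // ctu, w // ctu
    imp = energy[:ny * ctu, :nx * ctu].reshape(ny, ctu, nx, ctu).mean(axis=(1, 3))
    return imp / imp.max()                            # normalise to [0, 1]

def qp_offset_map(imp, max_offset=8):
    """Map importance to CTU-level delta-QP: important CTUs get negative
    offsets (finer quantisation), background gets positive ones. Centring
    on the mean keeps the overall rate roughly unchanged."""
    centred = imp - imp.mean()
    return np.round(-max_offset * centred / (np.abs(centred).max() + 1e-12)).astype(int)
```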

    Fast Algorithms for HEVC Rate-Distortion Optimization

    Doctoral dissertation, Department of Electrical and Computer Engineering, Seoul National University, August 2013 (advisor: Hyuk-Jae Lee). With the development of digital video devices, the demand for high-definition video has grown as well, a trend accelerated by the recent rapid spread of smartphones and tablet PCs. In step with this change, a new video compression standard for high-definition content has been standardised by a joint team of ISO/IEC MPEG and ITU-T VCEG. HEVC, the next-generation video compression standard succeeding H.264/AVC, completed its standardisation process with the Final Draft International Standard (FDIS) issued in January 2013. HEVC targets compressing video at the same quality as H.264/AVC with half the bit rate, and new techniques were introduced to reach this goal. In particular, the complex block structure and the greatly increased number of modes contribute substantially to compression efficiency, which makes Rate-Distortion Optimization (RDO), the process of selecting the best mode, even more important. However, the complex block structure also greatly increases the computational load of RDO. For this reason, unlike in H.264/AVC, reducing the computation of RDO while maintaining compression efficiency has become an important issue in HEVC. This dissertation defines the problem by presenting experimental results on the difference in RD degradation caused by RDO in H.264/AVC and HEVC, and proposes algorithms that reduce RDO computation along three research directions. In the first direction, algorithms are proposed that simplify the chain of operations constituting the RDO process: transform, quantization, inverse quantization, inverse transform, and entropy coding. These algorithms build on prior work for H.264/AVC, improve performance by analysing the limitations of the existing algorithms, and go further by proposing a new method that reduces RDO computation more aggressively. In the second direction, a method is proposed that reduces RDO computation in a way suited to HEVC, based on zero-block detection. Algorithms proposed for H.264/AVC do not properly capture the characteristics of zero blocks in HEVC, so applying them to HEVC with simple modifications does not deliver the expected performance; an efficient zero-block detection algorithm tailored to HEVC is presented to overcome this limitation. In the third direction, a method is proposed that reduces the computation of SSE-based RDO by exploiting SATD-based RDO; an efficient way of using SATD-based RDO is derived from an analysis of the differences between SATD-based and SSE-based RDO together with experimental results. The proposed algorithms were implemented in HM, the HEVC reference software, and the experimental results show that they greatly reduce RDO computation while causing little increase in RD degradation.
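
    As an illustration of one of the three directions, the sketch below shows an SATD computation via the 4x4 Hadamard transform and a zero-block test that skips the costly transform/quantisation/entropy stages of RDO when all coefficients would quantise to zero. The threshold form is a hedged stand-in, not the dissertation's exact condition.

```python
import numpy as np

def hadamard4(block):
    """4x4 Hadamard transform used for SATD."""
    H = np.array([[1,  1,  1,  1],
                  [1,  1, -1, -1],
                  [1, -1, -1,  1],
                  [1, -1,  1, -1]])
    return H @ block @ H.T

def satd(residual):
    """Sum of absolute Hadamard-transformed differences over 4x4 sub-blocks."""
    h, w = residual.shape
    total = 0.0
    for y in range(0, h, 4):
        for x in range(0, w, 4):
            total += np.abs(hadamard4(residual[y:y + 4, x:x + 4])).sum()
    return total / 2.0   # common normalisation

def is_zero_block(residual, qp, lam=0.85):
    """Illustrative zero-block test: if the residual energy falls below a
    QP-dependent threshold, all coefficients would quantise to zero, so the
    transform/quantisation/entropy stages of RDO can be skipped for this
    block. The threshold form is a stand-in, not the thesis's exact bound."""
    qstep = 2.0 ** ((qp - 4) / 6.0)   # HEVC quantisation step size
    return satd(residual) < lam * qstep * residual.size
```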

    Efficient HEVC-based video adaptation using transcoding

    In a video transmission system, it is important to take into account the great diversity of network and end-user constraints. On the one hand, video content is typically streamed over a network characterized by varying bandwidth capacities; in many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played on multiple devices, such as PCs, laptops, and cell phones, and a single representation cannot satisfy all of their constraints. These diversities of network and device capacities lead to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without changing the coding format, is well known as an efficient adaptation solution. However, this approach comes with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced: several techniques are proposed to speed up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy. Moreover, the motion estimation process of the encoder is optimized using decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder are addressed using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications.
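
    A hedged sketch of the machine-learning idea for a transrater: train a small classifier offline on features decoded from the input stream, then use confident predictions to prune the CU-split search during re-encoding. The feature set, model, and thresholds are illustrative assumptions, not the presentation's actual design.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature vector for one CU, built from information a
# transrater gets "for free" by decoding the input stream.
def cu_features(in_depth, in_bits, residual_energy, qp_delta):
    return np.array([in_depth, in_bits, residual_energy, qp_delta], dtype=float)

# Offline: train on (features, was_split) pairs collected from full-RDO
# transcoding runs. Placeholder data stands in for such a training set.
rng = np.random.default_rng(0)
X_train = rng.random((1000, 4))
y_train = (X_train[:, 0] > 0.5).astype(int)
model = DecisionTreeClassifier(max_depth=5).fit(X_train, y_train)

# Online: prune the CU-split search only when the model is confident,
# otherwise fall back to the normal exhaustive RDO.
def split_decision(features, conf=0.9):
    p_split = model.predict_proba(features.reshape(1, -1))[0, 1]
    if p_split >= conf:
        return True    # evaluate only the split sub-CUs
    if p_split <= 1.0 - conf:
        return False   # evaluate only the current (unsplit) CU
    return None        # uncertain: run full RDO on both options
```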

    SSIM-Inspired Quality Assessment, Compression, and Processing for Visual Communications

    Objective Image and Video Quality Assessment (I/VQA) measures predict image/video quality as perceived by human beings, the ultimate consumers of visual data. Existing research in the area is mainly limited to benchmarking and monitoring of visual data. The use of I/VQA measures in the design and optimization of image/video processing algorithms and systems is more desirable, challenging and fruitful but has not been well explored. Among the recently proposed objective I/VQA approaches, the structural similarity (SSIM) index and its variants have emerged as promising measures that show superior performance as compared to the widely used mean squared error (MSE) and are computationally simple compared with other state-of-the-art perceptual quality measures. In addition, SSIM has a number of desirable mathematical properties for optimization tasks. The goal of this research is to break the tradition of using MSE as the optimization criterion for image and video processing algorithms. We tackle several important problems in visual communication applications by exploiting SSIM-inspired design and optimization to achieve significantly better performance. Firstly, the original SSIM is a Full-Reference IQA (FR-IQA) measure that requires access to the original reference image, making it impractical in many visual communication applications. We propose a general purpose Reduced-Reference IQA (RR-IQA) method that can estimate SSIM with high accuracy with the help of a small number of RR features extracted from the original image. Furthermore, we introduce and demonstrate the novel idea of partially repairing an image using RR features. Secondly, image processing algorithms such as image de-noising and image super-resolution are required at various stages of visual communication systems, from image acquisition to image display at the receiver. We incorporate SSIM into the framework of sparse signal representation and non-local means methods and demonstrate improved performance in image de-noising and super-resolution. Thirdly, we incorporate SSIM into the framework of perceptual video compression. We propose an SSIM-based rate-distortion optimization scheme and an SSIM-inspired divisive optimization method that transforms the DCT domain frame residuals to a perceptually uniform space. Both approaches demonstrate the potential to largely improve the rate-distortion performance of state-of-the-art video codecs. Finally, in real-world visual communications, it is a common experience that end-users receive video with significantly time-varying quality due to variations in video content/complexity, codec configuration, and network conditions. How human visual quality of experience (QoE) changes with such time-varying video quality is not yet well understood. We propose a quality adaptation model that is asymmetrically tuned to increasing and decreasing quality. The model improves upon the direct SSIM approach in predicting subjective perceptual experience of time-varying video quality.
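
    For reference, a compact Python implementation of the local SSIM map, together with an SSIM-based RD cost of the general form used in perceptual rate-distortion optimization. The window type and the use of 1 - mean(SSIM) as the distortion term are common choices; the exact weighting and transform-domain formulation in this work may differ.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ssim_map(x, y, L=255, K1=0.01, K2=0.03, win=8):
    """Local SSIM map: SSIM = ((2*mu_x*mu_y + C1)(2*sigma_xy + C2)) /
    ((mu_x^2 + mu_y^2 + C1)(sigma_x^2 + sigma_y^2 + C2)),
    computed here with a uniform window for simplicity."""
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = uniform_filter(x, win), uniform_filter(y, win)
    vx = uniform_filter(x * x, win) - mx * mx      # local variance of x
    vy = uniform_filter(y * y, win) - my * my      # local variance of y
    cxy = uniform_filter(x * y, win) - mx * my     # local covariance
    return ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx * mx + my * my + C1) * (vx + vy + C2))

def rd_cost(ref, rec, rate, lam):
    """SSIM-based RD cost: replace the MSE distortion in J = D + lambda*R
    with the perceptual distortion D = 1 - mean(SSIM), so the optimizer
    trades bits against perceived rather than squared error."""
    return (1.0 - ssim_map(ref, rec).mean()) + lam * rate
```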

    Low complexity in-loop perceptual video coding

    The tradition of broadcast video is today complemented by user-generated content, as portable devices support video coding. Similarly, computing is becoming ubiquitous, with the Internet of Things (IoT) incorporating heterogeneous networks to communicate with personal and/or infrastructure devices. In both cases, the emphasis is on bandwidth and processor efficiency, which means increasing the signalling options in video encoding. Consequently, the assessment of pixel differences applies a uniform cost in order to be processor efficient; in contrast, the Human Visual System (HVS) has non-uniform sensitivity that depends on lighting, edges, and textures. Existing perceptual assessments are natively incompatible with this setting and processor demanding, making perceptual video coding (PVC) unsuitable for these environments. This research enables existing perceptual assessment at the native level using low-complexity techniques, before producing new pixel-based image quality assessments (IQAs). To manage these IQAs, a framework was developed and implemented in the High Efficiency Video Coding (HEVC) encoder. This resulted in bit redistribution, where more bits and smaller partitioning were allocated to perceptually significant regions. Using an HEVC-optimised processor, the timing increase was < +4% for video streaming and < +6% for recording applications, one third of that of an existing low-complexity PVC solution. Future work should be directed towards perceptual quantisation, which offers the potential for perceptual coding gain.
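
    A minimal sketch of a low-complexity, pixel-based perceptual weight in the spirit described above, combining luminance masking and texture (contrast) masking. The constants and masking functions are illustrative assumptions, not the thesis's actual IQAs.

```python
import numpy as np

def perceptual_weight(luma):
    """Low-complexity pixel-based weight: the HVS is less sensitive to
    distortion in very dark/bright regions (luminance masking) and in
    busy textures (contrast masking), so those pixels get lower weight.
    All constants here are illustrative assumptions."""
    y = luma.astype(np.float64)
    # Luminance masking: sensitivity peaks around mid-grey.
    lum = 1.0 - np.abs(y - 128.0) / 128.0
    # Texture masking: local activity as the max absolute difference
    # to the 3x3 neighbourhood.
    pad = np.pad(y, 1, mode='edge')
    act = np.zeros_like(y)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            shifted = pad[1 + dy:1 + dy + y.shape[0], 1 + dx:1 + dx + y.shape[1]]
            act = np.maximum(act, np.abs(shifted - y))
    tex = 1.0 / (1.0 + act / 32.0)
    # High weight = perceptually significant: allocate more bits there.
    return lum * tex
```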

    Feasibility Study of High-Level Synthesis: Implementation of a Real-Time HEVC Intra Encoder on FPGA

    High-Level Synthesis (HLS) is an automated design process that seeks to improve productivity over traditional design methods by increasing design abstraction from register transfer level (RTL) to behavioural level. Various commercial HLS tools have been available on the market since the 1990s, but only recently have they started to gain adoption across industry and academia. The slow adoption rate has mainly stemmed from lower quality of results (QoR) than obtained with conventional hardware description languages (HDLs). However, the latest HLS tool generations have substantially narrowed the QoR gap.
This thesis studies the feasibility of HLS in video codec development. It introduces several HLS implementations for High Efficiency Video Coding (HEVC), which is the key enabling technology for numerous modern media applications. HEVC doubles the coding efficiency over its predecessor, the Advanced Video Coding (AVC) standard, for the same subjective visual quality, but typically at the cost of considerably higher computational complexity. Therefore, real-time HEVC calls for automated design methodologies that can be used to minimize the HW implementation and verification effort. This thesis proposes to use HLS throughout the whole encoder design process, from data-intensive coding tools, like intra prediction and discrete transforms, to more control-oriented tools, such as entropy coding. The C source code of the open-source Kvazaar HEVC encoder serves as a design entry point for the HLS flow, and it is also utilized in design verification. The performance results are gathered on and reported for field-programmable gate array (FPGA) hardware. The main contribution of this thesis is an HEVC intra encoder prototype that is built on a Nokia AirFrame Cloud Server equipped with 2.4 GHz dual 14-core Intel Xeon processors and two Intel Arria 10 GX FPGA Development Kits, which can be connected to the server via peripheral component interconnect express (PCIe) generation 3 or 40 Gigabit Ethernet. The proof-of-concept system achieves real-time 4K coding speed of up to 120 fps, which can be further scaled up by adding practically any number of network-connected FPGA cards. Overcoming the complexity of HEVC and customizing its rich features for a real-time HEVC encoder implementation on hardware is not a trivial task, as hardware development has traditionally turned out to be very time-consuming. This thesis shows that HLS is able to shorten the development time, provide previously unseen design scalability, and still deliver competitive performance and QoR compared with state-of-the-art hardware implementations.

    Local Inverse Tone Curve Learning for High Dynamic Range Image Scalable Compression

    This paper presents a scalable high dynamic range (HDR) image coding scheme in which the base layer is a low dynamic range (LDR) version of the image that may have been generated by an arbitrary Tone Mapping Operator (TMO). No restriction is imposed on the TMO, which can be either global or local, so as to fully respect the artistic intent of the producer. Our method successfully handles the case of complex local TMOs thanks to a block-wise and non-linear approach. A novel template-based Inter Layer Prediction (ILP) is designed in order to perform the inverse tone mapping of a block without the need to transmit any additional parameters to the decoder. This method enables the use of a more accurate inverse tone mapping model than the simple linear regression commonly used for block-wise ILP. In addition, this paper shows that a linear adjustment of the initially predicted block can further improve the overall coding performance by using an efficient encoding scheme for the scaling parameters. Our experiments have shown an average bitrate saving of 47% on the HDR enhancement layer, compared to previous local ILP methods.
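
    A small Python sketch of the template-based ILP idea: fit a non-linear inverse tone curve on already-decoded template pixels (so no curve parameters need to be signalled), predict the HDR block from its LDR co-location, then optionally apply the linear adjustment whose parameters the encoder transmits. The low-order polynomial model is a stand-in for the paper's actual curve model.

```python
import numpy as np

def predict_hdr_block(ldr_block, ldr_template, hdr_template, deg=2):
    """Template-based inter-layer prediction: learn an inverse tone curve
    from neighbouring pixels already decoded in both layers (the template),
    then apply it to the current LDR block. Because encoder and decoder see
    the same template, nothing extra is transmitted for the curve."""
    coeff = np.polyfit(ldr_template.ravel(), hdr_template.ravel(), deg)
    return np.polyval(coeff, ldr_block.astype(np.float64))

def linear_adjust(pred, scale, offset=0.0):
    """Optional linear adjustment of the initial prediction; the encoder
    selects (scale, offset) by RD search and signals them compactly."""
    return scale * pred + offset
```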