21 research outputs found

    PEA265: Perceptual Assessment of Video Compression Artifacts

    The most widely used video encoders share a common hybrid coding framework that includes block-based motion estimation/compensation and block-based transform coding. Despite their high coding efficiency, the encoded videos often exhibit visually annoying artifacts, denoted as Perceivable Encoding Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience (QoE) of end users. To monitor and improve visual QoE, it is crucial to develop subjective and objective measures that can identify and quantify various types of PEAs. In this work, we make the first attempt to build a large-scale subject-labelled database composed of H.265/HEVC compressed videos containing various PEAs. The database, namely the PEA265 database, includes 4 types of spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types of temporal PEAs (i.e. flickering and floating), each containing at least 60,000 image or video patches with positive and negative labels. To objectively identify these PEAs, we train Convolutional Neural Networks (CNNs) using the PEA265 database. It appears that the state-of-the-art ResNeXt is capable of identifying each type of PEA with high accuracy. Furthermore, we define PEA pattern and PEA intensity measures to quantify the PEA levels of compressed video sequences. We believe that the PEA265 database and our findings will benefit the future development of video quality assessment methods and perceptually motivated video encoders. Comment: 10 pages, 15 figures, 4 tables

    Visual Content Characterization Based on Encoding Rate-Distortion Analysis

    Visual content characterization is a fundamentally important but underexploited step in dataset construction, which is essential in solving many image processing and computer vision problems. In the era of machine learning, this has become ever more important: with the explosion of image and video content, scrutinizing all potential content is impossible and source content selection has become increasingly difficult. In particular, in the area of image/video coding and quality assessment, it is highly desirable to characterize and select source content and subsequently construct image/video datasets that demonstrate strong representativeness and diversity of the visual world, such that the visual coding and quality assessment methods developed from and validated using such datasets exhibit strong generalizability. Encoding Rate-Distortion (RD) analysis is essential for many multimedia applications. Examples of applications that explicitly use RD analysis include image encoder RD optimization, video quality assessment (VQA), and Quality of Experience (QoE) optimization of streaming videos. However, encoding RD analysis has not been well investigated in the context of visual content characterization. This thesis focuses on applying encoding RD analysis as a visual source content characterization method with image/video coding and quality assessment applications in mind. We first conduct a subjective video quality evaluation experiment for state-of-the-art video encoder performance analysis and comparison, where our observations reveal severe problems that motivate the need for better source content characterization and selection methods. Then the effectiveness of RD analysis in visual source content characterization is demonstrated through a proposed quality control mechanism for video coding by eigen analysis in the space of General Quality Parameter (GQP) functions. Finally, by combining encoding RD analysis with submodular set function optimization, we propose a novel method for automating the process of representative source content selection, which helps boost the RD performance of visual encoders trained with the selected visual contents.
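As a rough illustration of how RD analysis might be combined with submodular selection, the sketch below greedily maximizes a facility-location objective over RD curves. The objective, the similarity measure, and the names `rd_similarity` and `greedy_select` are assumptions chosen for illustration; the abstract does not say which submodular function or features the thesis uses.

```python
import math

def rd_similarity(curve_a, curve_b):
    """Similarity in (0, 1] between two RD curves sampled at the same bitrates.

    Hypothetical measure: exp(-Euclidean distance) between quality values.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)))
    return math.exp(-dist)

def greedy_select(curves, k):
    """Greedily pick k curves maximizing f(S) = sum_i max_{j in S} sim(i, j).

    f is a facility-location function, which is monotone submodular, so the
    greedy rule carries the classic (1 - 1/e) approximation guarantee.
    """
    n = len(curves)
    sim = [[rd_similarity(curves[i], curves[j]) for j in range(n)] for i in range(n)]
    selected = []
    cover = [0.0] * n  # best similarity of each item to the selected set so far
    for _ in range(min(k, n)):
        best_j, best_gain = None, -1.0
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding candidate j to the current selection.
            gain = sum(max(sim[i][j] - cover[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_j, best_gain = j, gain
        selected.append(best_j)
        cover = [max(cover[i], sim[i][best_j]) for i in range(n)]
    return selected

# Toy RD curves (quality in dB at three bitrates): two near-duplicate pairs and one outlier.
curves = [
    [30.0, 35.0, 40.0],
    [30.1, 35.1, 40.1],
    [20.0, 28.0, 36.0],
    [20.2, 28.1, 36.1],
    [45.0, 46.0, 47.0],
]
picked = greedy_select(curves, 3)
```

With these toy curves the greedy rule picks one representative from each near-duplicate pair plus the outlier, which is the kind of representativeness and diversity the abstract describes.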

    The Optimization of Context-based Binary Arithmetic Coding in AVS2.0

    Thesis (Master's), Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2016. Soo-Ik Chae.
    High Efficiency Video Coding (HEVC) was jointly developed by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) to further improve coding efficiency over the previous-generation standard, H.264/AVC. Similar efforts have been made by the Audio and Video coding Standard (AVS) Workgroup of China, which developed its newest video coding standard (AVS2 or AVS2.0) to improve on the compression performance of the first-generation AVS1 with many novel coding tools. Context-based Binary Arithmetic Coding (CBAC), the entropy coding tool used in AVS2.0, plays a vital role in the overall standard. Like the Context-based Adaptive Binary Arithmetic Coding (CABAC) adopted by HEVC, it uses a multiplier-free method to realize the arithmetic coding procedure, but each of the two develops its own algorithm to deal with the multiplication problem. This work has three parts, aimed at understanding CBAC in AVS2.0 better and exploring further performance improvements. First, we design a comparison scheme to compare CBAC and CABAC on the AVS2.0 platform. The CABAC algorithm from HEVC was transplanted into AVS2.0, taking into account differences in implementation details such as context initialization. The experimental results show that CBAC achieves better coding performance. Second, several ideas for optimizing the CBAC algorithm in AVS2.0 are proposed. To improve coding performance, approximation error compensation and probability estimation optimization are introduced; both tools improve coding efficiency compared with the anchor. To reduce coding time, a rate estimation model is proposed: using rate estimation instead of the real CBAC algorithm for the rate-distortion cost calculation in the Rate-Distortion Optimization (RDO) process significantly reduces coding time, given the inherent computational complexity of CBAC. Last, the implementation details of the binary arithmetic decoder are described. Because Context-based Binary Arithmetic Decoding (CBAD) in AVS2.0 introduces strong data dependence and a heavy computation burden, it is difficult to design a high-throughput CBAD that decodes two or more bins in parallel, so a one-bin binary arithmetic decoder is designed in this work. Even though there is no previous CBAD design for AVS, we compare our design with related works for HEVC, and it achieves compelling experimental results.
    Contents:
    Chapter 1 Introduction: 1.1 Research Background; 1.2 Key Techniques in AVS2.0; 1.3 Research Contents (1.3.1 Performance Comparison of CBAC; 1.3.2 CBAC Performance Improvement; 1.3.3 Implementation of Binary Arithmetic Decoder in CBAC); 1.4 Organization
    Chapter 2 Entropy Coder CBAC in AVS2.0: 2.1 Introduction of Entropy Coding; 2.2 CBAC Overview (2.2.1 Binarization and Generation of Bin String; 2.2.2 Context Modeling and Probability Estimation; 2.2.3 Binary Arithmetic Coding Engine); 2.3 Two-level Scan Coding CBAC in AVS2.0 (2.3.1 Scan order; 2.3.2 First level coding; 2.3.3 Second level coding); 2.4 Summary
    Chapter 3 Performance Comparison in CBAC: 3.1 Differences between CBAC and CABAC; 3.2 Comparison of Two BAC Engines (3.2.1 Statistics and initialization of Context Models; 3.2.2 Adaptive Initialization Probability); 3.3 Experiment Result; 3.4 Conclusion
    Chapter 4 CBAC Performance Improvement: 4.1 Approximation Error Compensation (4.1.1 Error Compensation Table; 4.1.2 Experiment Result); 4.2 Probability Estimation Model Optimization (4.2.1 Probability Estimation; 4.2.2 Probability Estimation Model in CBAC; 4.2.3 The Optimization of Probability Estimation Model in CBAC; 4.2.4 Experiment Result); 4.3 Rate Estimation (4.3.1 Rate Estimation Model; 4.3.2 Experiment Result); 4.4 Conclusion
    Chapter 5 Implementation of Binary Arithmetic Decoder in CBAC: 5.1 Architecture of BAD (5.1.1 Top Architecture of BAD; 5.1.2 Range Update Module; 5.1.3 Offset Update Module; 5.1.4 Bits Read Module; 5.1.5 Context Modeling); 5.2 Complexity of BAD; 5.3 Conclusion
    Chapter 6 Conclusion and Further Work: 6.1 Conclusion; 6.2 Future Works
    Reference
    Appendix: A.1 Co-simulation Environment (A.1.1 Range Update Module (dRangeUpdate.v); A.1.2 Offset Update Module (dOffsetUpdate.v); A.1.3 Bits Read Module (dReadBits.v); A.1.4 Binary Arithmetic Decoding Top Module (BADTop.v); A.1.5 Test Bench)
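The idea of replacing the real arithmetic coder with a rate estimate during RDO can be sketched generically. The code below is a minimal illustration and not the AVS2.0 CBAC algorithm or the thesis's rate estimation model: the exponential probability update, the `shift` parameter, and the names `AdaptiveBinModel`, `estimate_rate`, and `rd_cost` are all assumptions. It charges each bin its ideal cost of -log2(p) bits under an adaptive context model instead of running range and offset updates.

```python
import math

class AdaptiveBinModel:
    """Hypothetical context model: tracks P(bin == 1) with an exponential update."""

    def __init__(self, p_one=0.5, shift=5):
        self.p_one = p_one
        self.rate = 2.0 ** -shift  # adaptation speed, assumed for illustration

    def cost_bits(self, bin_value):
        # Ideal code length of this bin under the current probability estimate.
        p = self.p_one if bin_value == 1 else 1.0 - self.p_one
        return -math.log2(p)

    def update(self, bin_value):
        # Move the estimate toward the observed bin; clamp so costs stay finite.
        self.p_one += self.rate * (bin_value - self.p_one)
        self.p_one = min(max(self.p_one, 1e-6), 1.0 - 1e-6)

def estimate_rate(bins, model=None):
    """Estimated bits for a bin string, without running an arithmetic coder."""
    model = model or AdaptiveBinModel()
    total = 0.0
    for b in bins:
        total += model.cost_bits(b)
        model.update(b)
    return total

def rd_cost(distortion, bits, lam):
    """Lagrangian RD cost J = D + lambda * R used to compare coding modes."""
    return distortion + lam * bits

skewed = estimate_rate([1] * 64)       # highly predictable bins cost well under 1 bit each
balanced = estimate_rate([1, 0] * 32)  # unpredictable bins cost about 1 bit each
```

An encoder could then call `rd_cost` with the estimated bits for each candidate mode; a hardware-friendly variant would typically replace the logarithm with a small lookup table indexed by the quantized probability.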

    Image and Video Coding Techniques for Ultra-low Latency

    The next generation of wireless networks fosters the adoption of latency-critical applications such as XR, connected industry, or autonomous driving. This survey gathers implementation aspects of different image and video coding schemes and discusses their tradeoffs. Standardized video coding technologies such as HEVC or VVC provide a high compression ratio, but their enormous complexity sets the scene for alternative approaches like still image, mezzanine, or texture compression in scenarios with tight resource or latency constraints. Regardless of the coding scheme, we found inter-device memory transfers and the lack of sub-frame coding as limitations of current full-system and software-programmable implementations. (Published version; peer reviewed.)

    MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding

    The rapid advancement of artificial intelligence (AI) technology has led to the prioritization of standardizing the processing, coding, and transmission of video using neural networks. To address this priority area, the Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a suite of standards called MPAI-EEV for "end-to-end optimized neural video coding." The aim of this AI-based video standard project is to reduce the number of bits required to represent high-fidelity video data by utilizing data-trained neural coding technologies. This approach is not constrained by how data coding has traditionally been applied in the context of a hybrid framework. This paper presents an overview of recent and ongoing standardization efforts in this area and highlights the key technologies and design philosophy of EEV. It also compares and reports on some primary efforts, such as the coding efficiency of the reference model. Additionally, it discusses emerging activities, such as learned Unmanned-Aerial-Vehicle (UAV) video coding, which are currently planned, under development, or in the exploration phase. With a focus on UAV video signals, this paper addresses the current status of these preliminary efforts. It also indicates development timelines, summarizes the main technical details, and provides pointers to further references. The exploration experiment shows that the EEV model performs better than the state-of-the-art video coding standard H.266/VVC in terms of perceptual evaluation metric