5 research outputs found

    A Deeply Pipelined CABAC Decoder for HEVC Supporting Level 6.2 High-tier Applications

    High Efficiency Video Coding (HEVC) is the latest video coding standard that specifies video resolutions up to 8K Ultra-HD (UHD) at 120 fps to support the next decade of video applications. This results in high-throughput requirements for the context adaptive binary arithmetic coding (CABAC) entropy decoder, which was already a well-known bottleneck in H.264/AVC. To address the throughput challenges, several modifications were made to CABAC during the standardization of HEVC. This work leverages these improvements in the design of a high-throughput HEVC CABAC decoder. It also supports the high-level parallel processing tools introduced by HEVC, including tile and wavefront parallel processing. The proposed design uses a deeply pipelined architecture to achieve a high clock rate. Additional techniques such as state prefetch logic, latch-based context memory, and separate finite state machines are applied to minimize stall cycles, while multi-bypass-bin decoding is used to further increase the throughput. The design is implemented in an IBM 45 nm SOI process. After place-and-route, its operating frequency reaches 1.6 GHz. The corresponding throughputs reach up to 1696 and 2314 Mbin/s under common and theoretical worst-case test conditions, respectively. The results show that the design is sufficient to decode in real time high-tier video bitstreams at level 6.2 (8K UHD at 120 fps), or main-tier bitstreams at level 5.1 (4K UHD at 60 fps) for applications requiring sub-frame latency, such as video conferencing.
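The throughput and clock figures quoted in this abstract can be related by simple arithmetic. The sketch below only divides the reported Mbin/s by the reported 1.6 GHz clock to get an average number of bins per cycle; the function name `bins_per_cycle` is a hypothetical helper, and the result is a back-of-the-envelope reading of the abstract, not a figure stated by the authors.

```python
def bins_per_cycle(throughput_mbin_per_s, clock_ghz):
    """Average bins decoded per clock cycle (hypothetical helper)."""
    return (throughput_mbin_per_s * 1e6) / (clock_ghz * 1e9)

# Figures taken from the abstract: 1696 and 2314 Mbin/s at 1.6 GHz.
common = bins_per_cycle(1696, 1.6)
worst_case = bins_per_cycle(2314, 1.6)
print(f"common: {common:.2f} bins/cycle, worst case: {worst_case:.2f} bins/cycle")
```

An average above one bin per cycle is consistent with the abstract's mention of multi-bypass-bin decoding, since more than one bin must be resolved in some cycles to exceed that rate.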

    Application-Specific Cache and Prefetching for HEVC CABAC Decoding

    Context-based Adaptive Binary Arithmetic Coding (CABAC) is the entropy coding module in the HEVC/H.265 video coding standard. As in its predecessor, H.264/AVC, CABAC is a well-known throughput bottleneck due to its strong data dependencies. Besides other optimizations, the replacement of the context model memory by a smaller cache has been proposed for hardware decoders, resulting in an improved clock frequency. However, the effect of potential cache misses has not been properly evaluated. This work fills the gap by performing an extensive evaluation of different cache configurations. Furthermore, it demonstrates that application-specific context model prefetching can effectively reduce the miss rate and increase the overall performance. The best results are achieved with two cache lines consisting of four or eight context models. The 2 × 8 cache allows a performance improvement of 13.2 percent to 16.7 percent compared to a non-cached decoder due to a 17 percent higher clock frequency and highly effective prefetching. The proposed HEVC/H.265 CABAC decoder allows the decoding of high-quality Full HD videos in real-time using few hardware resources on a low-power FPGA.
    EC/H2020/645500/EU/Improving European VoD Creative Industry with High Efficiency Video Delivery/Film26
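To make the cache idea in this abstract concrete, here is a minimal software sketch of a context model cache with a configurable number of lines and models per line. Everything in it is an assumption for illustration: the class name `ContextModelCache`, the fully associative organization, the LRU replacement, and the mapping of a context index to a line by integer division. The paper's actual hardware organization and prefetch policy may differ.

```python
from collections import OrderedDict

class ContextModelCache:
    """Toy cache of context-model lines with LRU replacement (illustrative only)."""

    def __init__(self, num_lines=2, models_per_line=8):
        self.num_lines = num_lines
        self.models_per_line = models_per_line
        self.lines = OrderedDict()  # line tag -> None, ordered oldest to newest
        self.hits = 0
        self.misses = 0

    def _touch(self, tag):
        # Refresh a resident line, or load it and evict the oldest line if full.
        if tag in self.lines:
            self.lines.move_to_end(tag)
            return
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)
        self.lines[tag] = None

    def prefetch(self, ctx_index):
        # Load the line ahead of use; not counted as a hit or a miss.
        self._touch(ctx_index // self.models_per_line)

    def access(self, ctx_index):
        # Count a hit or miss for this context model, then update LRU order.
        tag = ctx_index // self.models_per_line
        if tag in self.lines:
            self.hits += 1
        else:
            self.misses += 1
        self._touch(tag)

# A made-up trace of context indices that alternates between three lines.
trace = [0, 1, 9, 2, 17, 3, 9]
cache = ContextModelCache(num_lines=2, models_per_line=8)
for ctx in trace:
    cache.access(ctx)
print(cache.hits, cache.misses)
```

In this toy model, calling `prefetch` for a context index before it is accessed turns what would have been a miss into a hit, which is the effect the abstract attributes to application-specific prefetching.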

    Optimizing HEVC CABAC decoding with a context model cache and application-specific prefetching

    Context-based Adaptive Binary Arithmetic Coding (CABAC) is the entropy coding module in the most recent JCT-VC video coding standard, HEVC/H.265. As in its predecessor, H.264/AVC, CABAC is a well-known throughput bottleneck due to its strong data dependencies. Besides other optimizations, the replacement of the context model memory by a smaller cache has been proposed, resulting in an improved clock frequency. However, the effect of potential cache misses has not been properly evaluated. Our work fills this gap and performs an extensive evaluation of different cache configurations. Furthermore, it is demonstrated that application-specific context model prefetching can effectively reduce the miss rate and make it negligible. The best overall performance results were achieved with caches of two and four lines, where each cache line consists of four context models. Four cache lines allow a speed-up of 10% to 12% for all video configurations, while two cache lines improve the throughput by 9% to 15% for high-bitrate videos and by 1% to 4% for low-bitrate videos.
    EC/H2020/645500/EU/Improving European VoD Creative Industry with High Efficiency Video Delivery/Film26

    The Optimization of Context-based Binary Arithmetic Coding in AVS2.0

    Master's thesis, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2016. Chae Soo-Ik.

    High Efficiency Video Coding (HEVC) was jointly developed by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) to further improve coding efficiency over the previous-generation standard, H.264/AVC. Similar efforts have been made by the Audio and Video coding Standard (AVS) Workgroup of China, which developed its newest video coding standard (AVS2, or AVS2.0) to improve on the compression performance of the first-generation AVS1 with many novel coding tools. Context-based Binary Arithmetic Coding (CBAC), the entropy coding tool used in AVS2.0, plays a vital role in the overall coding standard. Like the Context-based Adaptive Binary Arithmetic Coding (CABAC) adopted by HEVC, it uses a multiplier-free method to carry out the arithmetic coding procedure. However, each of the two develops its own specific algorithm to deal with the multiplication problem.

    This work covers three aspects, with the aim of understanding CBAC in AVS2.0 better and exploring further performance improvements. First, we design a comparison scheme to compare CBAC and CABAC on the AVS2.0 platform. The CABAC algorithm from HEVC was ported into AVS2.0, taking into account differing implementation details such as context initialization. The experimental results show that CBAC achieves better coding performance. Second, several ideas for optimizing the CBAC algorithm in AVS2.0 are proposed. To improve coding performance, approximation error compensation and probability estimation optimization are introduced; both tools improve coding efficiency compared with the anchor. To reduce coding time, a rate estimation model is proposed: using rate estimation instead of the real CBAC algorithm for the rate-distortion cost calculation in the Rate-Distortion Optimization (RDO) process significantly reduces coding time, because CBAC is inherently computationally complex. Lastly, the implementation details of the binary arithmetic decoder are described. Since Context-based Binary Arithmetic Decoding (CBAD) in AVS2.0 introduces strong data dependencies and a heavy computational burden, it is difficult to design a high-throughput CBAD that decodes two or more bins in parallel. A one-bin binary arithmetic decoder is therefore designed in this work. Although there is no previous CBAD design for AVS, we compare our design with related works for HEVC, and it achieves compelling experimental results.

    Contents:
    Chapter 1 Introduction
        1.1 Research Background
        1.2 Key Techniques in AVS2.0
        1.3 Research Contents
            1.3.1 Performance Comparison of CBAC
            1.3.2 CBAC Performance Improvement
            1.3.3 Implementation of Binary Arithmetic Decoder in CBAC
        1.4 Organization
    Chapter 2 Entropy Coder CBAC in AVS2.0
        2.1 Introduction of Entropy Coding
        2.2 CBAC Overview
            2.2.1 Binarization and Generation of Bin String
            2.2.2 Context Modeling and Probability Estimation
            2.2.3 Binary Arithmetic Coding Engine
        2.3 Two-level Scan Coding CBAC in AVS2.0
            2.3.1 Scan order
            2.3.2 First level coding
            2.3.3 Second level coding
        2.4 Summary
    Chapter 3 Performance Comparison in CBAC
        3.1 Differences between CBAC and CABAC
        3.2 Comparison of Two BAC Engines
            3.2.1 Statistics and initialization of Context Models
            3.2.2 Adaptive Initialization Probability
        3.3 Experiment Result
        3.4 Conclusion
    Chapter 4 CBAC Performance Improvement
        4.1 Approximation Error Compensation
            4.1.1 Error Compensation Table
            4.1.2 Experiment Result
        4.2 Probability Estimation Model Optimization
            4.2.1 Probability Estimation
            4.2.2 Probability Estimation Model in CBAC
            4.2.3 The Optimization of Probability Estimation Model in CBAC
            4.2.4 Experiment Result
        4.3 Rate Estimation
            4.3.1 Rate Estimation Model
            4.3.2 Experiment Result
        4.4 Conclusion
    Chapter 5 Implementation of Binary Arithmetic Decoder in CBAC
        5.1 Architecture of BAD
            5.1.1 Top Architecture of BAD
            5.1.2 Range Update Module
            5.1.3 Offset Update Module
            5.1.4 Bits Read Module
            5.1.5 Context Modeling
        5.2 Complexity of BAD
        5.3 Conclusion
    Chapter 6 Conclusion and Further Work
        6.1 Conclusion
        6.2 Future Works
    Reference
    Appendix
        A.1 Co-simulation Environment
            A.1.1 Range Update Module (dRangeUpdate.v)
            A.1.2 Offset Update Module (dOffsetUpdate.v)
            A.1.3 Bits Read Module (dReadBits.v)
            A.1.4 Binary Arithmetic Decoding Top Module (BADTop.v)
            A.1.5 Test Bench
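The rate estimation idea mentioned in this thesis abstract (estimating bit cost for rate-distortion optimization without running the real arithmetic coder) can be illustrated with the generic entropy-based estimate, in which a bin of probability p costs about -log2(p) bits. The sketch below is that textbook approximation only; the function `estimated_bits`, the fixed per-context probabilities, and the sample data are all hypothetical and are not the model proposed in the thesis.

```python
import math

def estimated_bits(bins, p_one):
    """Approximate coded size as the sum of -log2(P(bin)) over all bins.

    bins  -- sequence of (context_id, bin_value) pairs, bin_value in {0, 1}
    p_one -- dict mapping context_id to an assumed fixed P(bin == 1)
    """
    total = 0.0
    for ctx, value in bins:
        p = p_one[ctx] if value == 1 else 1.0 - p_one[ctx]
        total += -math.log2(p)
    return total

# Hypothetical example: context 0 is equiprobable, context 1 is skewed toward 0.
probabilities = {0: 0.5, 1: 0.25}
sample = [(0, 1), (0, 0), (1, 1)]
print(estimated_bits(sample, probabilities))
```

A real encoder would typically replace the `math.log2` call with a precomputed table indexed by the context's probability state, so the estimate costs one lookup and one addition per bin.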

    A Deeply Pipelined CABAC Decoder for HEVC Supporting Level 6.2 High-Tier Applications

    No full text