
    Architecture exploration and VLSI design of multi-symbol arithmetic encoders for the AV1 coding format

    To reduce the impact of video on global Internet capacity, companies rely upon video coding standards and formats, also known as codecs, to reduce the sizes of videos before transmitting or storing them. AV1, a promising state-of-the-art and royalty-free video coding format first released in 2018, aims to reduce the sizes of videos by applying novel techniques to boost its compression results. Among its core components, AV1 comprises an entropy coding block, which is responsible for losslessly encoding the symbols generated by the other core modules (e.g., intra prediction, motion compensation, etc.). The arithmetic encoder, which is part of the entropy encoder, is a bottleneck because it is difficult to parallelize, and it relies upon two primary operations: the CDF Operation and the Boolean Operation, where CDF stands for Cumulative Distribution Function. This thesis proposes a baseline VLSI design, named AE-AV1, as the first AV1 arithmetic encoder in the literature, capable of reaching ultra-high performance (i.e., processing 8K@120fps videos in real time). Moreover, additional versions of this architecture are proposed as AE-AV1-LP and AE-AV1-MB, which are, respectively, a low-power version and a novel design applying a Multi-Boolean technique also introduced in this thesis. All the proposed designs were synthesized using the Cadenceℱ RC tool and the ST 65nm PDK. As AV1 is well known for being an open-source alternative in the video coding industry, the AE-AV1 architecture was also synthesized from Verilog to GDSII layout using a fully open-source ASIC flow (i.e., the OpenROAD tool, the OpenLane flow, and the ASAP7 and SkyWater 130nm PDKs). The architectures reach frequencies of 581 MHz, 563 MHz, and 590 MHz for the AE-AV1, AE-AV1-LP, and AE-AV1-MB 2-bool versions, respectively. In terms of throughput, all of the introduced architectures sustain 8K@120fps real-time video processing, with rates of 1.032 Gbit/s, 0.999 Gbit/s, and 1.117 Gbit/s, respectively.
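    The CDF Operation at the core of such an encoder can be pictured as a range-coder interval update driven by a cumulative distribution table, with the Boolean Operation as its two-symbol special case. The following C sketch illustrates that update under simplifying assumptions (the names range_enc and encode_symbol are ours, and the real AV1 entropy coder prescribes specific rounding, renormalization, and adaptive CDF update rules that are omitted here):

```c
#include <stdint.h>

#define PROB_BITS  15
#define PROB_TOTAL (1u << PROB_BITS)  /* CDF values scaled to 32768 */

/* Hypothetical state of a multi-symbol arithmetic (range) encoder. */
typedef struct {
    uint32_t low;    /* lower bound of the current coding interval */
    uint32_t range;  /* width of the current coding interval */
} range_enc;

/* Encode symbol s given cdf[], where cdf[k] is the cumulative frequency
 * of symbols 0..k and cdf[nsyms-1] == PROB_TOTAL. Renormalization and
 * carry propagation are omitted for brevity. */
static void encode_symbol(range_enc *rc, const uint16_t *cdf, int s)
{
    uint32_t fl = (s > 0) ? cdf[s - 1] : 0;  /* sub-interval start */
    uint32_t fh = cdf[s];                    /* sub-interval end   */

    /* Narrow [low, low + range) to the sub-interval assigned to s. */
    rc->low  += (uint32_t)(((uint64_t)rc->range * fl) >> PROB_BITS);
    rc->range = (uint32_t)(((uint64_t)rc->range * (fh - fl)) >> PROB_BITS);
    /* A real encoder would now renormalize rc->range and emit bits. */
}
```

    The serial dependency is visible in the sketch: each symbol needs the low/range values produced by the previous one, which is why the encoder resists parallelization and why batching several Boolean (two-symbol) updates per cycle, as in the Multi-Boolean technique, is attractive.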

    Perceptual video quality assessment: the journey continues!

    Perceptual Video Quality Assessment (VQA) is one of the most fundamental and challenging problems in the field of Video Engineering. Along with video compression, it has become one of the two dominant theoretical and algorithmic technologies in television streaming and social media. Over the last two decades, the volume of video traffic over the internet has grown exponentially, powered by rapid advancements in cloud services, faster video compression technologies, and increased access to high-speed, low-latency wireless internet connectivity. This has given rise to issues related to delivering extraordinary volumes of picture and video data to an increasingly sophisticated and demanding global audience. Consequently, developing algorithms to measure the quality of pictures and videos as perceived by humans has become increasingly critical, since these algorithms can be used to perceptually optimize trade-offs between quality and bandwidth consumption. VQA models have evolved from algorithms developed for generic 2D videos to specialized algorithms explicitly designed for on-demand video streaming, user-generated content (UGC), virtual and augmented reality (VR and AR), cloud gaming, high dynamic range (HDR), and high frame rate (HFR) scenarios. Along the way, we describe the advancement in algorithm design, beginning with traditional hand-crafted, feature-based methods and finishing with the deep-learning models powering today's accurate VQA algorithms. We also discuss the evolution of subjective video quality databases containing videos and human-annotated quality scores, which are the necessary tools to create, test, compare, and benchmark VQA algorithms. Finally, we discuss emerging trends in VQA algorithm design and offer general perspectives on the evolution of Video Quality Assessment in the foreseeable future.
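    Many of the earlier hand-crafted VQA models follow a simple pattern: compute a per-frame quality measurement against a reference, then temporally pool the frame scores into one video-level score. Below is a minimal C sketch of that pattern, using plain PSNR and mean pooling purely for illustration (function names are ours; real VQA models use far richer perceptual features and pooling strategies):

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Per-frame PSNR between a reference and a distorted frame of n pixels. */
static double frame_psnr(const uint8_t *ref, const uint8_t *dis, size_t n)
{
    double mse = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)ref[i] - (double)dis[i];
        mse += d * d;
    }
    mse /= (double)n;
    return (mse == 0.0) ? INFINITY : 10.0 * log10(255.0 * 255.0 / mse);
}

/* Mean temporal pooling of per-frame scores into one video-level score. */
static double pool_mean(const double *scores, size_t frames)
{
    double sum = 0.0;
    for (size_t i = 0; i < frames; i++)
        sum += scores[i];
    return sum / (double)frames;
}
```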

    Design and Implementation of IDCT/IDST-Specific Accelerators for HEVC Standard on Heterogeneous Accelerator-Rich Platform

    High Efficiency Video Coding (HEVC) is important for image processing, as it reduces bandwidth and increases video quality. There are different methods that can be used to implement HEVC. This thesis focuses on the design and implementation of application-specific accelerators for the IDCT/IDST algorithms of the HEVC standard. These algorithms are inherently parallel tasks, which makes them suitable for execution on heterogeneous multicore platforms using accelerators, which are required for power-efficient processing. In this study, Coarse-Grained Reconfigurable Arrays (CGRAs) are used as the template for an accelerator. The CGRA plays a major role in a Heterogeneous Accelerator-Rich Platform (HARP), as it is capable of accelerating non-parallel loops with low loop counts. This thesis explores various IDCT and IDST algorithms with different designs and templates, arriving at a unique final architecture. The final design computes a 4-point IDST together with a 4/8-point IDCT. In addition, different dimensions of the CGRA template are used in order to obtain different types of accelerators. Several CGRAs are combined in a successive arrangement with Reduced Instruction Set Computer (RISC) cores over a Network-on-Chip (NoC). The aim is to study the performance of the accelerator for the IDCT and the IDST, evaluated through the data movement over the NoC along with a comparison of accelerator performance in clock cycles, in order to calculate the efficiency of the system. The results show that the 4-point IDST and IDCT can be computed in 56 clock cycles, and the 8-point IDCT in 64 cycles. Another important factor considered in the study is power and energy consumption. The dynamic power dissipated in routing the data reached 4.03 mW, while the energy consumption was 1.76 ΌJ for the 4-point system (IDCT and IDST) and 3.06 ΌJ for the 8-point system (IDCT). Processing Elements (PEs) implementing the transform algorithms were operated at 200 MHz. Finally, these results show that 1080p video at 30 frames per second can be processed using an FPGA.
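    To make concrete why these transforms map well to small accelerators, the 4-point inverse DST that HEVC applies to 4x4 intra luma residuals reduces to a small integer matrix pass. Here is a minimal C sketch of one 1-D pass using the DST-VII basis from the standard (the function name is ours; a full 2-D inverse applies this pass to columns and then rows, with right-shifts of 7 and then 12 for 8-bit video, plus clipping that is omitted here):

```c
/* HEVC 4-point DST-VII basis (forward matrix rows, from the standard). */
static const int g_dst4[4][4] = {
    { 29,  55,  74,  84 },
    { 74,  74,   0, -74 },
    { 84, -29, -74,  55 },
    { 55, -84,  74, -29 },
};

/* One 1-D inverse pass: dst[n] = (sum_k src[k] * g_dst4[k][n] + rnd) >> shift.
 * Multiplying by the transposed matrix inverts the scaled orthogonal
 * forward transform. */
static void idst4_1d(const int src[4], int dst[4], int shift)
{
    const int rnd = 1 << (shift - 1);
    for (int n = 0; n < 4; n++) {
        int acc = 0;
        for (int k = 0; k < 4; k++)
            acc += src[k] * g_dst4[k][n];
        dst[n] = (acc + rnd) >> shift;
    }
}
```

    The fixed multiply-accumulate structure, with no data-dependent control flow, is what makes such kernels natural candidates for CGRA processing elements.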

    Quality of Experience in Immersive Video Technologies

    Over the last decades, several technological revolutions have impacted the television industry, such as the shifts from black & white to color and from standard to high definition. Nevertheless, considerable further improvements can still be achieved to provide a better multimedia experience, for example with ultra-high definition, high dynamic range & wide color gamut, or 3D. These so-called immersive technologies aim at providing better, more realistic, and emotionally stronger experiences. To measure quality of experience (QoE), subjective evaluation is the ultimate means, since it relies on a pool of human subjects. However, reliable and meaningful results can only be obtained if experiments are properly designed and conducted following a strict methodology. In this thesis, we build a rigorous framework for the subjective evaluation of new types of image and video content. We propose different procedures and analysis tools for measuring QoE in immersive technologies. As immersive technologies capture more information than conventional technologies, they have the ability to provide more details, enhanced depth perception, as well as better color, contrast, and brightness. To measure the impact of immersive technologies on the viewers' QoE, we apply the proposed framework for designing experiments and analyzing the collected subjects' ratings. We also analyze eye movements to study human visual attention during immersive content playback. Since immersive content carries more information than conventional content, efficient compression algorithms are needed for storage and transmission over existing infrastructures. To determine the bandwidth required for high-quality transmission of immersive content, we use the proposed framework to conduct meticulous evaluations of recent image and video codecs in the context of immersive technologies. Subjective evaluation is time-consuming, expensive, and not always feasible. Consequently, researchers have developed objective metrics to automatically predict quality. To measure the performance of objective metrics in assessing immersive content quality, we perform several in-depth benchmarks of state-of-the-art and commonly used objective metrics. For this aim, we use ground-truth quality scores collected under our subjective evaluation framework. To improve QoE, we propose different systems, in particular for stereoscopic and autostereoscopic 3D displays. The proposed systems can help reduce the artifacts generated at the visualization stage, which impact picture quality, depth quality, and visual comfort. To demonstrate the effectiveness of these systems, we use the proposed framework to measure viewers' preference between these systems and the standard 2D & 3D modes. In summary, this thesis tackles the problems of measuring, predicting, and improving QoE in immersive technologies. To address these problems, we build a rigorous framework and apply it through several in-depth investigations. We place essential concepts of multimedia QoE under this framework. These concepts are not only of a fundamental nature but have also shown their impact in very practical applications. In particular, the JPEG, MPEG, and VCEG standardization bodies have adopted these concepts to select technologies proposed for standardization and to validate the resulting standards in terms of compression efficiency.
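    The basic statistics behind such subjective experiments are worth making concrete: each stimulus is rated by a pool of subjects, and the ratings are condensed into a mean opinion score (MOS) with a confidence interval. A minimal C sketch in the spirit of ITU-R BT.500-style analysis follows (function names are ours; real analyses add subject screening, outlier rejection, and typically a Student-t rather than a normal critical value for small pools):

```c
#include <math.h>
#include <stddef.h>

/* Mean Opinion Score: average of all subjects' ratings for one stimulus. */
static double mos(const double *ratings, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += ratings[i];
    return sum / (double)n;
}

/* Half-width of the 95% confidence interval around the MOS,
 * using the normal approximation (requires n >= 2). */
static double ci95(const double *ratings, size_t n)
{
    if (n < 2)
        return 0.0;
    double m = mos(ratings, n), var = 0.0;
    for (size_t i = 0; i < n; i++)
        var += (ratings[i] - m) * (ratings[i] - m);
    var /= (double)(n - 1);              /* sample variance */
    return 1.96 * sqrt(var / (double)n); /* normal critical value */
}
```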

    Bitstream-based video quality modeling and analysis of HTTP-based adaptive streaming

    The proliferation of affordable capture technology and increased internet bandwidth allow high-quality videos (resolutions > 1080p, framerates ≄ 60fps) to be streamed online. HTTP-based adaptive streaming is the preferred method for streaming videos, adjusting video quality to the available bandwidth. Although adaptive streaming reduces the occurrences of video playout being stopped (called "stalling") due to narrow network bandwidth, the automatic adaptation has an impact on the quality perceived by the user, which results in the need to systematically assess the perceived quality. Such an evaluation is usually done on a short-term basis (a few seconds) and on an overall-session basis (up to several minutes). In this thesis, both of these aspects are assessed using subjective and instrumental methods. The subjective assessment of short-term video quality consists of a series of lab-based video quality tests that have resulted in publicly available datasets. The overall integral quality was subjectively assessed in lab tests with human viewers mimicking a real-life viewing scenario. In addition to the lab tests, an out-of-the-lab test method was investigated for both short-term video quality and overall session quality assessment, to explore alternative approaches to subjective quality assessment. The instrumental method of quality evaluation was addressed in terms of bitstream-based and hybrid pixel-based video quality models developed as part of this thesis. For this, a family of models, namely AVQBits, was conceived using the results of the lab tests as ground truth. Based on the available input information, four different instances of AVQBits are presented: a Mode 3, a Mode 1, a Mode 0, and a Hybrid Mode 0 model. The model instances have been evaluated, and they perform better than or on par with other state-of-the-art models. These models have further been applied to 360° and gaming videos, HFR content, and images. Also, a long-term integration (1-5 min) model based on the ITU-T P.1203.3 model is presented. In this work, the different instances of AVQBits, with their per-1-second quality scores, are employed as the video quality component of the proposed long-term integration model. All AVQBits variants, as well as the long-term integration module and the subjective test data, have been made publicly available for further research.
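    To make the long-term integration idea concrete, here is an illustrative C sketch that pools per-1-second quality scores into a session score. This is not the ITU-T P.1203.3 algorithm (which also models stalling and recency effects); it simply blends the session mean with the mean of the worst decile, reflecting the common finding that poor segments dominate remembered quality. All names and weights are ours:

```c
#include <stddef.h>
#include <stdlib.h>

/* Ascending comparator for qsort over doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Pool per-1-second scores into one session score (sorts scores in place).
 * Illustrative only; the 0.5/0.5 weights are arbitrary. */
static double session_score(double *per_sec, size_t n)
{
    double mean = 0.0;
    for (size_t i = 0; i < n; i++)
        mean += per_sec[i];
    mean /= (double)n;

    /* Mean of the worst 10% of seconds (at least one second). */
    qsort(per_sec, n, sizeof *per_sec, cmp_double);
    size_t k = (n / 10 > 0) ? n / 10 : 1;
    double worst = 0.0;
    for (size_t i = 0; i < k; i++)
        worst += per_sec[i];
    worst /= (double)k;

    return 0.5 * mean + 0.5 * worst;
}
```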