29 research outputs found

    Efficient HEVC-based video adaptation using transcoding

    Get PDF
    In a video transmission system, it is important to take into account the great diversity of the network/end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played by multiple devices like PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints. These diversities of the network and devices capacity lead to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without the change of the coding format, has been well-known as an efficient adaptation solution. However, this approach comes along with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We proposed several techniques to speed-up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy have been proposed. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been solved by using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications

    Error resilience and concealment techniques for high-efficiency video coding

    Get PDF
    This thesis investigates the problem of robust coding and error concealment in High Efficiency Video Coding (HEVC). After a review of the current state of the art, a simulation study about error robustness, revealed that the HEVC has weak protection against network losses with significant impact on video quality degradation. Based on this evidence, the first contribution of this work is a new method to reduce the temporal dependencies between motion vectors, by improving the decoded video quality without compromising the compression efficiency. The second contribution of this thesis is a two-stage approach for reducing the mismatch of temporal predictions in case of video streams received with errors or lost data. At the encoding stage, the reference pictures are dynamically distributed based on a constrained Lagrangian rate-distortion optimization to reduce the number of predictions from a single reference. At the streaming stage, a prioritization algorithm, based on spatial dependencies, selects a reduced set of motion vectors to be transmitted, as side information, to reduce mismatched motion predictions at the decoder. The problem of error concealment-aware video coding is also investigated to enhance the overall error robustness. A new approach based on scalable coding and optimally error concealment selection is proposed, where the optimal error concealment modes are found by simulating transmission losses, followed by a saliency-weighted optimisation. Moreover, recovery residual information is encoded using a rate-controlled enhancement layer. Both are transmitted to the decoder to be used in case of data loss. Finally, an adaptive error resilience scheme is proposed to dynamically predict the video stream that achieves the highest decoded quality for a particular loss case. A neural network selects among the various video streams, encoded with different levels of compression efficiency and error protection, based on information from the video signal, the coded stream and the transmission network. Overall, the new robust video coding methods investigated in this thesis yield consistent quality gains in comparison with other existing methods and also the ones implemented in the HEVC reference software. Furthermore, the trade-off between coding efficiency and error robustness is also better in the proposed methods

    Perceptual Video Quality Assessment and Enhancement

    Get PDF
    With the rapid development of network visual communication technologies, digital video has become ubiquitous and indispensable in our everyday lives. Video acquisition, communication, and processing systems introduce various types of distortions, which may have major impact on perceived video quality by human observers. Effective and efficient objective video quality assessment (VQA) methods that can predict perceptual video quality are highly desirable in modern visual communication systems for performance evaluation, quality control and resource allocation purposes. Moreover, perceptual VQA measures may also be employed to optimize a wide variety of video processing algorithms and systems for best perceptual quality. This thesis exploits several novel ideas in the areas of video quality assessment and enhancement. Firstly, by considering a video signal as a 3D volume image, we propose a 3D structural similarity (SSIM) based full-reference (FR) VQA approach, which also incorporates local information content and local distortion-based pooling methods. Secondly, a reduced-reference (RR) VQA scheme is developed by tracing the evolvement of local phase structures over time in the complex wavelet domain. Furthermore, we propose a quality-aware video system which combines spatial and temporal quality measures with a robust video watermarking technique, such that RR-VQA can be performed without transmitting RR features via an ancillary lossless channel. Finally, a novel strategy for enhancing video denoising algorithms, namely poly-view fusion, is developed by examining a video sequence as a 3D volume image from multiple (front, side, top) views. This leads to significant and consistent gain in terms of both peak signal-to-noise ratio (PSNR) and SSIM performance, especially at high noise levels

    Feasibility Study of High-Level Synthesis : Implementation of a Real-Time HEVC Intra Encoder on FPGA

    Get PDF
    High-Level Synthesis (HLS) on automatisoitu suunnitteluprosessi, joka pyrkii parantamaan tuottavuutta perinteisiin suunnittelumenetelmiin verrattuna, nostamalla suunnittelun abstraktiota rekisterisiirtotasolta (RTL) käyttäytymistasolle. Erilaisia kaupallisia HLS-työkaluja on ollut markkinoilla aina 1990-luvulta lähtien, mutta vasta äskettäin ne ovat alkaneet saada hyväksyntää teollisuudessa sekä akateemisessa maailmassa. Hidas käyttöönottoaste on johtunut pääasiassa huonommasta tulosten laadusta (QoR) kuin mitä on ollut mahdollista tavanomaisilla laitteistokuvauskielillä (HDL). Uusimmat HLS-työkalusukupolvet ovat kuitenkin kaventaneet QoR-aukkoa huomattavasti. Tämä väitöskirja tutkii HLS:n soveltuvuutta videokoodekkien kehittämiseen. Se esittelee useita HLS-toteutuksia High Efficiency Video Coding (HEVC) -koodaukselle, joka on keskeinen mahdollistava tekniikka lukuisille nykyaikaisille mediasovelluksille. HEVC kaksinkertaistaa koodaustehokkuuden edeltäjäänsä Advanced Video Coding (AVC) -standardiin verrattuna, saavuttaen silti saman subjektiivisen visuaalisen laadun. Tämä tyypillisesti saavutetaan huomattavalla laskennallisella lisäkustannuksella. Siksi reaaliaikainen HEVC vaatii automatisoituja suunnittelumenetelmiä, joita voidaan käyttää rautatoteutus- (HW ) ja varmennustyön minimoimiseen. Tässä väitöskirjassa ehdotetaan HLS:n käyttöä koko enkooderin suunnitteluprosessissa. Dataintensiivisistä koodaustyökaluista, kuten intra-ennustus ja diskreetit muunnokset, myös enemmän kontrollia vaativiin kokonaisuuksiin, kuten entropiakoodaukseen. Avoimen lähdekoodin Kvazaar HEVC -enkooderin C-lähdekoodia hyödynnetään tässä työssä referenssinä HLS-suunnittelulle sekä toteutuksen varmentamisessa. Suorituskykytulokset saadaan ja raportoidaan ohjelmoitavalla porttimatriisilla (FPGA). Tämän väitöskirjan tärkein tuotos on HEVC intra enkooderin prototyyppi. Prototyyppi koostuu Nokia AirFrame Cloud Server palvelimesta, varustettuna kahdella 2.4 GHz:n 14-ytiminen Intel Xeon prosessorilla, sekä kahdesta Intel Arria 10 GX FPGA kiihdytinkortista, jotka voidaan kytkeä serveriin käyttäen joko peripheral component interconnect express (PCIe) liitäntää tai 40 gigabitin Ethernettiä. Prototyyppijärjestelmä saavuttaa reaaliaikaisen 4K enkoodausnopeuden, jopa 120 kuvaa sekunnissa. Lisäksi järjestelmän suorituskykyä on helppo skaalata paremmaksi lisäämällä järjestelmään käytännössä minkä tahansa määrän verkkoon kytkettäviä FPGA-kortteja. Monimutkaisen HEVC:n tehokas mallinnus ja sen monipuolisten ominaisuuksien mukauttaminen reaaliaikaiselle HW HEVC enkooderille ei ole triviaali tehtävä, koska HW-toteutukset ovat perinteisesti erittäin aikaa vieviä. Tämä väitöskirja osoittaa, että HLS:n avulla pystytään nopeuttamaan kehitysaikaa, tarjoamaan ennen näkemätöntä suunnittelun skaalautuvuutta, ja silti osoittamaan kilpailukykyisiä QoR-arvoja ja absoluuttista suorituskykyä verrattuna olemassa oleviin toteutuksiin.High-Level Synthesis (HLS) is an automated design process that seeks to improve productivity over traditional design methods by increasing design abstraction from register transfer level (RTL) to behavioural level. Various commercial HLS tools have been available on the market since the 1990s, but only recently they have started to gain adoption across industry and academia. The slow adoption rate has mainly stemmed from lower quality of results (QoR) than obtained with conventional hardware description languages (HDLs). However, the latest HLS tool generations have substantially narrowed the QoR gap. This thesis studies the feasibility of HLS in video codec development. It introduces several HLS implementations for High Efficiency Video Coding (HEVC) , that is the key enabling technology for numerous modern media applications. HEVC doubles the coding efficiency over its predecessor Advanced Video Coding (AVC) standard for the same subjective visual quality, but typically at the cost of considerably higher computational complexity. Therefore, real-time HEVC calls for automated design methodologies that can be used to minimize the HW implementation and verification effort. This thesis proposes to use HLS throughout the whole encoder design process. From data-intensive coding tools, like intra prediction and discrete transforms, to more control-oriented tools, such as entropy coding. The C source code of the open-source Kvazaar HEVC encoder serves as a design entry point for the HLS flow, and it is also utilized in design verification. The performance results are gathered with and reported for field programmable gate array (FPGA) . The main contribution of this thesis is an HEVC intra encoder prototype that is built on a Nokia AirFrame Cloud Server equipped with 2.4 GHz dual 14-core Intel Xeon processors and two Intel Arria 10 GX FPGA Development Kits, that can be connected to the server via peripheral component interconnect express (PCIe) generation 3 or 40 Gigabit Ethernet. The proof-of-concept system achieves real-time. 4K coding speed up to 120 fps, which can be further scaled up by adding practically any number of network-connected FPGA cards. Overcoming the complexity of HEVC and customizing its rich features for a real-time HEVC encoder implementation on hardware is not a trivial task, as hardware development has traditionally turned out to be very time-consuming. This thesis shows that HLS is able to boost the development time, provide previously unseen design scalability, and still result in competitive performance and QoR over state-of-the-art hardware implementations

    JTIT

    Get PDF
    kwartalni

    Towards Computational Efficiency of Next Generation Multimedia Systems

    Get PDF
    To address throughput demands of complex applications (like Multimedia), a next-generation system designer needs to co-design and co-optimize the hardware and software layers. Hardware/software knobs must be tuned in synergy to increase the throughput efficiency. This thesis provides such algorithmic and architectural solutions, while considering the new technology challenges (power-cap and memory aging). The goal is to maximize the throughput efficiency, under timing- and hardware-constraints

    Applications in Electronics Pervading Industry, Environment and Society

    Get PDF
    This book features the manuscripts accepted for the Special Issue “Applications in Electronics Pervading Industry, Environment and Society—Sensing Systems and Pervasive Intelligence” of the MDPI journal Sensors. Most of the papers come from a selection of the best papers of the 2019 edition of the “Applications in Electronics Pervading Industry, Environment and Society” (APPLEPIES) Conference, which was held in November 2019. All these papers have been significantly enhanced with novel experimental results. The papers give an overview of the trends in research and development activities concerning the pervasive application of electronics in industry, the environment, and society. The focus of these papers is on cyber physical systems (CPS), with research proposals for new sensor acquisition and ADC (analog to digital converter) methods, high-speed communication systems, cybersecurity, big data management, and data processing including emerging machine learning techniques. Physical implementation aspects are discussed as well as the trade-off found between functional performance and hardware/system costs

    Using Radio Frequency and Motion Sensing to Improve Camera Sensor Systems

    Get PDF
    Camera-based sensor systems have advanced significantly in recent years. This advancement is a combination of camera CMOS (complementary metal-oxide-semiconductor) hardware technology improvement and new computer vision (CV) algorithms that can better process the rich information captured. As the world becoming more connected and digitized through increased deployment of various sensors, cameras have become a cost-effective solution with the advantages of small sensor size, intuitive sensing results, rich visual information, and neural network-friendly. The increased deployment and advantages of camera-based sensor systems have fueled applications such as surveillance, object detection, person re-identification, scene reconstruction, visual tracking, pose estimation, and localization. However, camera-based sensor systems have fundamental limitations such as extreme power consumption, privacy-intrusive, and inability to see-through obstacles and other non-ideal visual conditions such as darkness, smoke, and fog. In this dissertation, we aim to improve the capability and performance of camera-based sensor systems by utilizing additional sensing modalities such as commodity WiFi and mmWave (millimeter wave) radios, and ultra-low-power and low-cost sensors such as inertial measurement units (IMU). In particular, we set out to study three problems: (1) power and storage consumption of continuous-vision wearable cameras, (2) human presence detection, localization, and re-identification in both indoor and outdoor spaces, and (3) augmenting the sensing capability of camera-based systems in non-ideal situations. We propose to use an ultra-low-power, low-cost IMU sensor, along with readily available camera information, to solve the first problem. WiFi devices will be utilized in the second problem, where our goal is to reduce the hardware deployment cost and leverage existing WiFi infrastructure as much as possible. Finally, we will use a low-cost, off-the-shelf mmWave radar to extend the sensing capability of a camera in non-ideal visual sensing situations.Doctor of Philosoph

    Schémas de tatouage d'images, schémas de tatouage conjoint à la compression, et schémas de dissimulation de données

    Get PDF
    In this manuscript we address data-hiding in images and videos. Specifically we address robust watermarking for images, robust watermarking jointly with compression, and finally non robust data-hiding.The first part of the manuscript deals with high-rate robust watermarking. After having briefly recalled the concept of informed watermarking, we study the two major watermarking families : trellis-based watermarking and quantized-based watermarking. We propose, firstly to reduce the computational complexity of the trellis-based watermarking, with a rotation based embedding, and secondly to introduce a trellis-based quantization in a watermarking system based on quantization.The second part of the manuscript addresses the problem of watermarking jointly with a JPEG2000 compression step or an H.264 compression step. The quantization step and the watermarking step are achieved simultaneously, so that these two steps do not fight against each other. Watermarking in JPEG2000 is achieved by using the trellis quantization from the part 2 of the standard. Watermarking in H.264 is performed on the fly, after the quantization stage, choosing the best prediction through the process of rate-distortion optimization. We also propose to integrate a Tardos code to build an application for traitors tracing.The last part of the manuscript describes the different mechanisms of color hiding in a grayscale image. We propose two approaches based on hiding a color palette in its index image. The first approach relies on the optimization of an energetic function to get a decomposition of the color image allowing an easy embedding. The second approach consists in quickly obtaining a color palette of larger size and then in embedding it in a reversible way.Dans ce manuscrit nous abordons l’insertion de données dans les images et les vidéos. Plus particulièrement nous traitons du tatouage robuste dans les images, du tatouage robuste conjointement à la compression et enfin de l’insertion de données (non robuste).La première partie du manuscrit traite du tatouage robuste à haute capacité. Après avoir brièvement rappelé le concept de tatouage informé, nous étudions les deux principales familles de tatouage : le tatouage basé treillis et le tatouage basé quantification. Nous proposons d’une part de réduire la complexité calculatoire du tatouage basé treillis par une approche d’insertion par rotation, ainsi que d’autre part d’introduire une approche par quantification basée treillis au seind’un système de tatouage basé quantification.La deuxième partie du manuscrit aborde la problématique de tatouage conjointement à la phase de compression par JPEG2000 ou par H.264. L’idée consiste à faire en même temps l’étape de quantification et l’étape de tatouage, de sorte que ces deux étapes ne « luttent pas » l’une contre l’autre. Le tatouage au sein de JPEG2000 est effectué en détournant l’utilisation de la quantification basée treillis de la partie 2 du standard. Le tatouage au sein de H.264 est effectué à la volée, après la phase de quantification, en choisissant la meilleure prédiction via le processus d’optimisation débit-distorsion. Nous proposons également d’intégrer un code de Tardos pour construire une application de traçage de traîtres.La dernière partie du manuscrit décrit les différents mécanismes de dissimulation d’une information couleur au sein d’une image en niveaux de gris. Nous proposons deux approches reposant sur la dissimulation d’une palette couleur dans son image d’index. La première approche consiste à modéliser le problème puis à l’optimiser afin d’avoir une bonne décomposition de l’image couleur ainsi qu’une insertion aisée. La seconde approche consiste à obtenir, de manière rapide et sûre, une palette de plus grande dimension puis à l’insérer de manière réversible
    corecore