7 research outputs found

    CABAC accelerator architectures for video compression in future multimedia : a survey

    The demands for high quality, real-time performance and multi-format video support in consumer multimedia products are ever increasing. In particular, future multimedia systems require efficient video coding algorithms and corresponding adaptive high-performance computational platforms. The H.264/AVC video coding algorithms provide high enough compression efficiency to be used in these systems, and multimedia processors are able to provide the required adaptability, but the complexity of the algorithms demands more efficient computing platforms. Heterogeneous (re-)configurable systems composed of multimedia processors and hardware accelerators constitute the main part of such platforms. In this paper, we survey the hardware accelerator architectures for Context-based Adaptive Binary Arithmetic Coding (CABAC) of the Main and High profiles of H.264/AVC. The purpose of the survey is to deliver critical insight into the proposed solutions, and thereby facilitate further research on accelerator architectures, architecture development methods and supporting EDA tools. The architectures are analyzed, classified and compared based on the core hardware acceleration concepts, algorithmic characteristics, video resolution support and performance parameters, and some promising design directions are discussed. The comparative analysis shows that the parallel pipeline accelerator architecture seems to be the most promising

    Survey of advanced CABAC accelerator architectures for future multimedia.

    Future high-quality multimedia systems require efficient video coding algorithms and corresponding adaptive high-performance computational platforms. In this paper, we survey the hardware accelerator architectures for Context-based Adaptive Binary Arithmetic Coding (CABAC) of H.264/AVC. The purpose of the survey is to deliver critical insight into the proposed solutions, and thereby facilitate further research on accelerator architectures, architecture development methods and supporting EDA tools. The architectures are analyzed, classified and compared based on the core hardware acceleration concepts, algorithmic characteristics, video resolution support and performance parameters, and some promising design directions are discussed.

    Motion estimation and CABAC VLSI co-processors for real-time high-quality H.264/AVC video coding

    Real-time, high-quality video coding is attracting wide interest in the research and industrial communities for a range of applications. H.264/AVC, a recent standard for high-performance video coding, can be successfully exploited in several scenarios, including digital video broadcasting, high-definition TV and DVD-based systems, which must sustain bit rates of up to tens of Mbit/s. To that end, this paper proposes optimized architectures for the most critical H.264/AVC tasks: motion estimation and context-adaptive binary arithmetic coding. Post-synthesis results on sub-micron CMOS standard-cell technologies show that the proposed architectures can process 720 × 480 video sequences in real time at 30 frames/s and deliver more than 50 Mbit/s. The achieved circuit complexity and power consumption budgets are suitable for integration in complex VLSI multimedia systems based either on an AHB bus-centric on-chip communication system or on novel Network-on-Chip (NoC) infrastructures for MPSoC (Multi-Processor System on Chip)

    Feasibility Study of High-Level Synthesis : Implementation of a Real-Time HEVC Intra Encoder on FPGA

    High-Level Synthesis (HLS) is an automated design process that seeks to improve productivity over traditional design methods by raising design abstraction from register transfer level (RTL) to behavioural level. Various commercial HLS tools have been available on the market since the 1990s, but only recently have they started to gain adoption across industry and academia. The slow adoption rate has mainly stemmed from lower quality of results (QoR) than that obtained with conventional hardware description languages (HDLs). However, the latest HLS tool generations have substantially narrowed the QoR gap. This thesis studies the feasibility of HLS in video codec development. It introduces several HLS implementations for High Efficiency Video Coding (HEVC), which is the key enabling technology for numerous modern media applications. HEVC doubles the coding efficiency over its predecessor, the Advanced Video Coding (AVC) standard, for the same subjective visual quality, but typically at the cost of considerably higher computational complexity. Therefore, real-time HEVC calls for automated design methodologies that can be used to minimize the HW implementation and verification effort. This thesis proposes to use HLS throughout the whole encoder design process, from data-intensive coding tools, like intra prediction and discrete transforms, to more control-oriented tools, such as entropy coding. The C source code of the open-source Kvazaar HEVC encoder serves as a design entry point for the HLS flow, and it is also utilized in design verification. The performance results are gathered on and reported for a field-programmable gate array (FPGA). The main contribution of this thesis is an HEVC intra encoder prototype that is built on a Nokia AirFrame Cloud Server equipped with dual 2.4 GHz 14-core Intel Xeon processors and two Intel Arria 10 GX FPGA Development Kits, which can be connected to the server via peripheral component interconnect express (PCIe) generation 3 or 40 Gigabit Ethernet. The proof-of-concept system achieves real-time 4K coding speed of up to 120 fps, which can be further scaled up by adding practically any number of network-connected FPGA cards. Overcoming the complexity of HEVC and customizing its rich features for a real-time HEVC encoder implementation on hardware is not a trivial task, as hardware development has traditionally turned out to be very time-consuming. This thesis shows that HLS is able to shorten development time, provide previously unseen design scalability, and still achieve competitive performance and QoR compared with state-of-the-art hardware implementations.

    Parallelism and the software-hardware interface in embedded systems

    This thesis by publications addresses issues in the architecture and microarchitecture of next-generation, high-performance streaming Systems-on-Chip by quantifying the most important forms of parallelism in current and emerging embedded system workloads. The work consists of three major research tracks, relating to data-level parallelism, thread-level parallelism and the software-hardware interface, which together reflect the research interests of the author as they have formed over the last nine years. Published works confirm that parallelism at the data level is widely accepted as the most important performance lever for the efficient execution of embedded media and telecom applications and has been exploited via a number of approaches, the most efficient being vector/SIMD architectures. A further, complementary and substantial form of parallelism exists at the thread level, but this has not been researched to the same extent in the context of embedded workloads. For the efficient execution of such applications, exploitation of both forms of parallelism is of paramount importance. This calls for a new architectural approach to the software-hardware interface, as its rigidity, manifested in all desktop-based and the majority of embedded CPUs, directly affects the performance of vectorized, threaded codes. The author advocates a holistic, mature approach where parallelism is extracted via automatic means while, at the same time, the traditionally rigid hardware-software interface is optimized to match the temporal and spatial behaviour of the embedded workload. This ultimate goal calls for the precise study of these forms of parallelism for a number of applications executing on theoretical models such as instruction set simulators and parallel RAM machines, as well as the development of highly parametric microarchitectural frameworks to encapsulate that functionality. EThOS - Electronic Theses Online Service (GB, United Kingdom)

    System-on-Chip design of a high performance low power full hardware cabac encoder in H.264/AVC

    Ph.D., Doctor of Philosophy

    Architectures for Adaptive Low-Power Embedded Multimedia Systems

    This Ph.D. thesis describes novel hardware/software architectures for adaptive low-power embedded multimedia systems. Novel techniques for run-time adaptive energy management are proposed, such that hardware and software adapt together to react to unpredictable scenarios. A complete power-aware H.264 video encoder was developed. Comparison with the state of the art demonstrates significant energy savings while meeting the performance constraint and keeping the video quality degradation unnoticeable.