10 research outputs found

Strategy of microscopic parallelism for Bitplane Image Coding

Author: Aulí-Llinàs Francesc
Blanes C. Ian
Enfedaque Pablo
Moure Juan
Sanchez Silva Victor
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2015

In recent years, a new type of processor that relies heavily on the Single Instruction, Multiple Data (SIMD) architectural principle has emerged. The main idea behind SIMD computing is to apply a flow of instructions to multiple pieces of data in parallel and synchronously. This permits the execution of thousands of operations in parallel, achieving higher computational performance than with traditional Multiple Instruction, Multiple Data (MIMD) architectures. The level of parallelism required in SIMD computing can only be achieved in image coding systems via microscopic parallel strategies that code multiple coefficients in parallel. Until now, the only way to achieve microscopic parallelism in bitplane coding engines was by executing multiple coding passes in parallel. Such a strategy is poorly suited to SIMD computing because each thread executes different instructions. This paper introduces the first bitplane coding engine devised for the fine-grained parallelism required in SIMD computing. Its main insight is to allow parallel coefficient processing within a coding pass. Experimental tests show coding performance similar to that of JPEG2000.

Warwick Research Archives Portal Repository

H.264/AVC inter prediction on accelerator-based multi-core systems

Author: Claver José M.
Fernández Escribano Gerardo
Martínez José Luis
Rodríguez Sánchez Rafael
Sánchez José L.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The AVC video coding standard adopts variable block sizes for inter frame coding to increase compression efficiency, among other new features. As a consequence of this, an AVC encoder has to employ a complex mode decision technique that requires high computational complexity. Several techniques aimed at accelerating the inter prediction process have been proposed in the literature in recent years. Recently, with the emergence of many-core processors or accelerators, a new way of supporting inter frame prediction has presented itself. In this paper, we present a step forward in the implementation of an AVC inter prediction algorithm in a graphics processing unit, using Compute Unified Device Architecture. The results show a negligible drop in rate distortion with a time reduction, on average, of over 98.8 % compared with full search and fast full search, and of over 80 % compared with UMHexagonS search

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Bitplane image coding with parallel coefficient processing

Author: Auli-Llinas Francesc
Enfedaque Pablo
Moure Juan C.
Sanchez Silva Victor
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

Image coding systems have traditionally been tailored for multiple instruction, multiple data (MIMD) computing. In general, they partition the (transformed) image into codeblocks that can be coded in the cores of MIMD-based processors. Each core executes a sequential flow of instructions to process the coefficients in the codeblock, independently and asynchronously from the other cores. Bitplane coding is a common strategy to code such data. Most of its mechanisms require sequential processing of the coefficients. Recent years have seen the rise of processing accelerators with enhanced computational performance and power efficiency whose architecture is mainly based on the single instruction, multiple data (SIMD) principle. SIMD computing refers to the execution of the same instruction on multiple data in a lockstep, synchronous way. Unfortunately, current bitplane coding strategies cannot fully profit from such processors because their coding tasks are inherently sequential. This paper presents bitplane image coding with parallel coefficient (BPC-PaCo) processing, a coding method that can process many coefficients within a codeblock in parallel and synchronously. To this end, the scanning order, the context formation, the probability model, and the arithmetic coder of the coding engine have been reformulated. The experimental results suggest that the penalty in coding performance of BPC-PaCo with respect to the traditional strategies is almost negligible.
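As a rough illustration of what "bitplane" means in the abstract above, the following minimal Python sketch splits the magnitudes of a small codeblock into bitplanes, most significant first. It is not BPC-PaCo: the scanning order, context formation, probability model and arithmetic coder are all omitted, and the function name `bitplanes` and the sample values are hypothetical. It only shows that extracting one bitplane applies the same shift-and-mask to every coefficient, which is the kind of uniform step that suits lockstep SIMD execution.

```python
def bitplanes(coeffs, num_planes):
    """Return the bitplanes of the coefficient magnitudes, most significant first."""
    planes = []
    for p in range(num_planes - 1, -1, -1):
        # The same shift-and-mask is applied to every coefficient, so each
        # coefficient could be handled by its own SIMD lane in lockstep.
        planes.append([(abs(c) >> p) & 1 for c in coeffs])
    return planes


# Hypothetical codeblock of four quantized coefficients (signs are ignored here).
codeblock = [5, -3, 0, 7]
planes = bitplanes(codeblock, 3)
```

Here `planes[0]` holds the most significant bit of each magnitude and `planes[2]` the least significant one.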

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Warwick Research Archives Portal Repository

Diposit Digital de Documents de la UAB

FPGA-based DVCPRO HD Decoder Implementation Using Impulse C

Author: Cichon Slawomir
Gorgon Marek
Publication venue: 'AGH University of Science and Technology Press'
Publication date: 01/01/2013

AGH (Akademia Górniczo-Hutnicza) University of Science and Technology: Journals

Computer Science Journal (AGH University of Science and Technology, Krakow)

Biblioteka Nauki - repozytorium artykułów

Directory of Open Access Journals

Parallel H.264/AVC Fast Rate-Distortion Optimized Motion Estimation using Graphics Processing Unit and Dedicated Hardware

Author: A. Ahmed
E. Magli
G. Masera
M. Martina
M.U. Shahid
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Heterogeneous systems on a single chip composed of a CPU, a Graphical Processing Unit (GPU), and a Field Programmable Gate Array (FPGA) are expected to emerge in the near future. In this context, the System on Chip (SoC) can be dynamically adapted to employ different architectures for the execution of data-intensive applications. Motion estimation is one such task that can be accelerated using an FPGA and a GPU for a high-performance H.264/AVC encoder implementation. In most work on parallel implementations of motion estimation, the bit rate cost of motion vectors is generally ignored. In contrast, this paper presents a fast rate-distortion optimized parallel motion estimation algorithm implemented on GPU using OpenCL and on FPGA/ASIC using VHDL. The predicted motion vectors are estimated from temporally preceding motion vectors and used for evaluating the bit rate cost of the motion vectors simultaneously. The experimental results show that the proposed scheme achieves significant speedup on GPU and FPGA, and has comparable rate-distortion performance with respect to a sequential fast motion estimation algorithm.
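To make the idea of including the motion vector bit rate in the search concrete, here is a small sequential Python sketch of a rate-distortion cost of the form J = SAD + lambda * R. It is not the paper's algorithm or its OpenCL/VHDL implementation. The rate model (signed Exp-Golomb code length of the difference from a predicted vector), the exhaustive search, the function names and the toy frames are all assumptions chosen for illustration, and the predicted vector is simply passed in rather than derived from temporally preceding vectors.

```python
def se_bits(v):
    """Bit length of a signed Exp-Golomb code for v (an assumed rate model)."""
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def sad(cur, ref, x, y, dx, dy, n):
    """Sum of absolute differences between an n x n block and its displaced match."""
    return sum(
        abs(cur[y + j][x + i] - ref[y + dy + j][x + dx + i])
        for j in range(n)
        for i in range(n)
    )


def best_mv(cur, ref, x, y, n, search, pred_mv, lam):
    """Exhaustive search minimising J = SAD + lam * R; returns (cost, (dx, dy))."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block falls outside the frame.
            if not (0 <= y + dy and y + dy + n <= len(ref)
                    and 0 <= x + dx and x + dx + n <= len(ref[0])):
                continue
            # Rate term: bits needed to code the difference from the prediction.
            rate = se_bits(dx - pred_mv[0]) + se_bits(dy - pred_mv[1])
            cost = sad(cur, ref, x, y, dx, dy, n) + lam * rate
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Toy frames: the current frame is the reference shifted left by one pixel.
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = [[ref[r][c + 1] if c < 7 else 0 for c in range(8)] for r in range(8)]

low = best_mv(cur, ref, 2, 2, 2, 2, (0, 0), 1)    # small lambda
high = best_mv(cur, ref, 2, 2, 2, 2, (0, 0), 10)  # large lambda
```

With the small multiplier the search returns the exact match `(1, 0)`. With the larger one it prefers the cheaper-to-code zero vector even though its SAD is higher, which is the trade-off that a search ignoring the rate term cannot express.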

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Strategy of Microscopic Parallelism for Bitplane Image Coding

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Multi-prediction particle filter for efficient parallelized implementation

Author: AC Sankaranarayanan
B Ristic
BB Manjunath
CH Chao
E Lindholm
F Evennou
L Miao
M Bolić
MD Bisceglie
MD Hill
MS Arulampalam
N Gordon
NM Cheung
O Cappé
R Shams
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Bitplane Image Coding With Parallel Coefficient Processing

Author: Francesc Auli-Llinas
Juan C. Moure
Pablo Enfedaque
Victor Sanchez
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Video Coding on Multicore Graphics Processors

Author: Au Oscar C.
Cheung Ngai-Man
Fan Xiaopeng
Kung Man-Cheung
Publication date: 01/01/2010

In this article, we investigate using multi-core graphics processing units (GPUs) for video encoding and decoding. After an overview of video coding and GPUs, we review some previous work on structuring video coding modules so that the massively parallel processing capability of GPUs can be harnessed. We also review previous work on partitioning the video decoding flow between the central processing unit (CPU) and GPU. After that, we discuss in detail a GPU-based fast motion estimation to illustrate some design considerations in using GPUs for video coding, and the tradeoff between speedup and rate-distortion performance. Our results highlight the importance of exposing as much data parallelism as possible when designing algorithms for GPUs. © 2006 IEEE

Hong Kong University of Science and Technology Institutional Repository

Video Coding on Multicore Graphics Processors

Author: Man-Cheung Kung
Ngai-Man Cheung
Oscar Au
Xiaopeng Fan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref