
397 research outputs found

Implementation of JPEG compression and motion estimation on FPGA hardware

Author: Gopalakrishnan Ramakrishna
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/2008

A hardware implementation of JPEG allows for real-time compression in data-intensive applications, such as high-speed scanning, medical imaging and satellite image transmission. Implementation options include dedicated DSP or media processors, FPGA boards, and ASICs. Factors that affect platform selection include cost, speed, memory, size, power consumption, and ease of reconfiguration. The proposed hardware solution is based on a Very High Speed Integrated Circuit Hardware Description Language (VHDL) implementation of the codec, with preferred realization on an FPGA board due to speed, cost and flexibility factors. VHDL is commonly used to model hardware implementations from a top-down perspective. The VHDL code may be simulated to correct mistakes and subsequently synthesized into hardware using a synthesis tool, such as the Xilinx ISE suite. The same VHDL code may be synthesized into a number of different hardware architectures based on the constraints given. For example, speed was the major constraint when synthesizing the pipeline of JPEG encoding and decoding, while chip area and power consumption were the primary constraints when synthesizing the on-die memory because of its large area. Thus, there is a trade-off between area and speed in logic synthesis.

University of Nevada, Las Vegas Repository

Enhancement of Digital Photo Frame Capabilities With Dedicated Hardware

Author: Cheedella Phani Teja
Paulose Leo
Publication date: 14/05/2012

Photo frames have come a long way since the typical ones that needed to have a photo printed and stuck on them. Today, in this digital era, we have a new concept, the digital photo frame, a modern representation of the conventional photo frame. A digital photo frame is basically a picture frame that displays photos without the need to print them. They are available in a variety of sizes and configurations. A typical frame varies in size from 7 inches to 20 inches. There are also key-chain-sized frames available. These frames also support a variety of formats like .jpeg, .tiff, .bmp and so on. Most of the frames provide an option to run the photos sequentially or randomly as a slideshow with an adjustable time interval. There are also several ways to load photos onto the frame. It can be done directly via the memory card of the camera, or various memory devices like USB drives, SD cards, MMC cards and so on can be used. Nowadays even Bluetooth technology is being used. Another option that is becoming quite popular is for users to take their photos directly from the Internet, from sites like Flickr or Picasa, or from their e-mail. These frames also generally come with built-in speakers and remote controls. Our initial objective was to decide which features could be added to the digital photo frame that we design. For this purpose we conducted simulation exercises in MATLAB to prove their feasibility. This simulation exercise was divided into two parts. The first part was to perform compression and decompression, and the second part dealt with the various enhancements that can be added to the frame. For compression and decompression we considered the JPEG standard. JPEG (Joint Photographic Experts Group) is an ISO/ITU standard for compressing still images. The JPEG format is very popular due to its variable compression range. A few limitations of JPEG include the fact that it is lossy and not great for displaying text. Common extensions for it include *.jpg, *.jff, *.m-jpeg, *.mpeg. The enhancement features that we tested for feasibility include Mean Filter, Median Filter, Image Sharpening, Negative Image Extraction, Logarithmic Transformations, Power Law Correction (Gamma Correction), Contrast Stretching, Grey Level Slicing, Bit Plane Slicing, and Laplace Filtering. We then proceeded to the hardware implementation of the above features. We implemented only a handful of features owing to the complexity of the design and lack of time. We first implemented the compression and decompression algorithm. The two enhancement features we implemented were the Laplace Filter and the Median Filter. For our implementation we used the VIRTEX 2 FPGA Board.
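As a rough illustration of the two enhancement features named in the abstract, the following is a minimal software sketch of a 3x3 median filter and a 4-neighbour Laplace filter. It is not the authors' MATLAB or FPGA design; the function names, the list-of-rows image representation, the choice of kernel, and the border handling are all assumptions made for this example.

```python
def median_filter_3x3(img):
    """Replace each interior pixel with the median of its 3x3 neighbourhood.

    `img` is a list of equal-length rows of integers (an assumed format).
    Border pixels are copied unchanged, which is one of several possible
    border policies.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(
                img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


def laplace_filter_3x3(img):
    """Apply the 4-neighbour Laplacian kernel [[0,1,0],[1,-4,1],[0,1,0]].

    Border pixels are set to 0 (an assumption). The result may be negative
    and is not clipped to a display range.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (
                img[y - 1][x] + img[y + 1][x]
                + img[y][x - 1] + img[y][x + 1]
                - 4 * img[y][x]
            )
    return out
```

On an image with a single bright outlier pixel, the median filter removes the outlier while the Laplace filter produces a strong response at it, which is why the former is typically used for noise removal and the latter for edge detection or sharpening.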

ethesis@nitr

Gbit/second lossless data compression hardware

Author: Jose L. Nunez-Yanez
Publication date: 01/01/2001

This thesis investigates how to improve the performance of lossless data compression hardware as a tool to reduce the cost per bit stored in a computer system or transmitted over a communication network. Lossless data compression allows the exact reconstruction of the original data after decompression. Its deployment in some high-bandwidth applications has been hampered due to performance limitations in the compressing hardware that needs to match the performance of the original system to avoid becoming a bottleneck. Advancing the area of lossless data compression hardware, hence, offers a valid motivation with the potential of doubling the performance of the system that incorporates it with minimum investment. This work starts by presenting an analysis of current compression methods with the objective of identifying the factors that limit performance and also the factors that increase it. [Continues.

Loughborough University Institutional Repository

VHDL modeling and synthesis of the JPEG-XR inverse transform

Author: Frandina Peter
Publication venue: RIT Scholar Works
Publication date: 01/08/2009

This work presents a pipelined VHDL implementation of the inverse lapped biorthogonal transform used in the decompression process of the soon-to-be-released JPEG-XR still image standard format. This inverse transform involves integer-only calculations using lifting operations and Kronecker products. Divisions and multiplications by small integer coefficients are implemented using a shift-and-add technique, resulting in a multiplier-less implementation with 736 instances of addition. When targeted to an Altera Stratix II FPGA with a 50 MHz system clock, this design is capable of completing the inverse transform of an 8400 x 6600 pixel image in less than 70 ms.
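The shift-and-add technique mentioned in the abstract can be sketched in software as follows. This is a generic illustration, not the thesis's VHDL and not the actual JPEG-XR coefficients: the function name and the example constants are hypothetical. The idea is that multiplying by a small constant reduces to one shifted copy of the input per set bit of the constant, so no hardware multiplier is needed.

```python
def mul_by_const_shift_add(x, c):
    """Multiply integer x by a non-negative integer constant c
    using only shifts and additions.

    Each set bit k of c contributes (x << k) to the result, so the
    number of additions grows with the number of set bits in c.
    """
    if c < 0:
        raise ValueError("constant must be non-negative in this sketch")
    acc = 0
    shift = 0
    while c:
        if c & 1:
            acc += x << shift  # add the shifted copy for this set bit
        c >>= 1
        shift += 1
    return acc
```

For example, multiplying by 3 becomes `(x << 1) + x`, a single addition. Division by a power of two can likewise be replaced by a right shift, with rounding behaviour that depends on the sign convention chosen.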

RIT Scholar Works

Gigabyte per second streaming lossless data compression hardware based on a configurable variable-geometry CAM dictionary

Author: Chouliaras VA
Nunez-Yanez JL
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 10/01/2006

Explore Bristol Research

VLSI design and implementation of adaptive two-dimensional multilayer neural network architecture for image compression and decompression

Author: Raj P. Cyril Prasanna
Publication date: 01/01/2010

Coventry University Pure Portal

NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

Author: Aimar Alessandro
Calabrese Enrico
Corradi Federico
Delbruck Tobi
Linares-Barranco Alejandro
Liu Shih-Chii
Lungu Iulia-Alexandra
Milde Moritz B.
Mostafa Hesham
Rios-Navarro Antonio
Tapiador-Morales Ricardo
Publication date: 01/01/2017

Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though Graphical Processing Units (GPUs) are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture called NullHop that implements SOA CNNs useful for low-power and low-latency application scenarios. NullHop exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across kernel sizes ranging from 1x1 to 7x7. NullHop can process up to 128 input and 128 output feature maps per layer in a single pass. We implemented the proposed architecture on a Xilinx Zynq FPGA platform and present results showing how our implementation reduces external memory transfers and compute time in five different CNNs ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. Post-synthesis simulations using Mentor Modelsim in a 28nm process with a clock frequency of 500 MHz show that the VGG19 network achieves over 450 GOp/s. By exploiting sparsity, NullHop achieves an efficiency of 368%, maintains over 98% utilization of the MAC units, and achieves a power efficiency of over 3TOp/s/W in a core area of 6.3 mm^2. As further proof of NullHop's usability, we interfaced its FPGA implementation with a neuromorphic event camera for real-time interactive demonstrations.
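To illustrate in general terms how activation sparsity can cut both storage and compute, the sketch below stores a feature-map row as a bit mask plus a list of non-zero values and skips multiply-accumulate (MAC) operations for zero activations. This is a simplified software analogy under assumed names and data layout; it is not presented as NullHop's actual encoding or datapath.

```python
def encode_sparse(activations):
    """Split activations into a 0/1 mask and the list of non-zero values."""
    mask = [1 if a != 0 else 0 for a in activations]
    values = [a for a in activations if a != 0]
    return mask, values


def decode_sparse(mask, values):
    """Rebuild the dense activation list from the mask and non-zero values."""
    it = iter(values)
    return [next(it) if bit else 0 for bit in mask]


def sparse_dot(mask, values, weights):
    """Dot product that performs a MAC only where the mask bit is set.

    Returns (result, number_of_macs_performed).
    """
    it = iter(values)
    total = 0
    macs = 0
    for bit, w in zip(mask, weights):
        if bit:
            total += next(it) * w
            macs += 1
    return total, macs
```

With activations `[0, 3, 0, 0, 2, 0]`, only two of the six positions require a MAC, and only two values plus a six-bit mask need to be stored. Zeros of this kind are common after ReLU layers, which is what makes the approach attractive.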

arXiv.org e-Print Archive

ZORA

Western Sydney ResearchDirect

idUS. Depósito de Investigación Universidad de Sevilla