397 research outputs found

    Implementation of JPEG compression and motion estimation on FPGA hardware

    Full text link
    A hardware implementation of JPEG allows for real-time compression in data intensivve applications, such as high speed scanning, medical imaging and satellite image transmission. Implementation options include dedicated DSP or media processors, FPGA boards, and ASICs. Factors that affect the choice of platform selection involve cost, speed, memory, size, power consumption, and case of reconfiguration. The proposed hardware solution is based on a Very high speed integrated circuit Hardware Description Language (VHDL) implememtation of the codec with prefered realization using an FPGA board due to speed, cost and flexibility factors; The VHDL language is commonly used to model hardware impletations from a top down perspective. The VHDL code may be simulated to correct mistakes and subsequently synthesized into hardware using a synthesis tool, such as the xilinx ise suite. The same VHDL code may be synthesized into a number of sifferent hardware architetcures based on constraints given. For example speed was the major constraint when synthesizing the pipeline of jpeg encoding and decoding, while chip area and power consumption were primary constraints when synthesizing the on-die memory because of large area. Thus, there is a trade off between area and speed in logic synthesis

    Enhancement of Digital Photo Frame Capabilities With Dedicated Hardware

    Get PDF
    Photo frames have come a long way since the typical ones that needed to have a photo printed and stuck on them. Today in this digital era we have a new concept, named digital photo frame, a modern representation of the conventional photo frame. A digital photo frame is basically a picture frame that displays photos without the need to print them. They are available in a variety of sizes and with varied configurations. A typical frame varies in size from 7 inches to 20 inches. There are also key chain sized frames available. These frames also support a variety of formats like .jpeg, .tiff, .bmp and so on. Most of the frames provide an option to run the photos in a sequential or random manner as a slideshow with an adjustable time interval. The mode of input of the photos to the frame is also multi-fold. It can be done directly via the memory card of the camera, or else various memory devices like USB drives, SD Cards, MMC Cards and so on can be used. Nowadays even Bluetooth technology is being used. Another option that is becoming quite popular is that, users can take their photos directly from the Internet from sites like Flickr, Picassa or from their e-mail. Also these frames generally come with built in speakers and with remote controls. Our initial objective was to decide on which all features can be added to the Digital Photo Frame that we design. For this purpose we conducted simulation exercises in MATLAB so as to prove its feasibility. This simulation exercise was divided into two parts. The first part was to perform compression and decompression and the second half dealt with the various enhancements that can be added to the frame. For our compression and decompression we considered the JPEG standard. Joint Photographic Experts Group - an ISO/ITU standard for compressing still images. The JPEG format is very popular due to its variable compression range. A few limitations of JPEG include the fact that it is lossy and also not great for displaying text. The common extension for it include *.jpg, *.jff, *.m-jpeg,*.mpeg The various enhancement features that we tested for feasibility include Mean Filter, Median Filter, Image Sharpening, Negative Image Extraction, Logarithmic Transformations, Power Law Correction (Gamma Correction), Contrast Stretching, Grey Level Slicing, Bit Plane Slicing, Laplace Filtering. We then proceeded onto the hardware implementation of the above said features. We only implemented a handful of features owing to the complexity of design and lack of time. We first implemented the Compression and Decompression algorithm. The two enhancement features we implemented were Laplace Filter and Median Filter. For our implementation we used the VIRTEX 2 FPGA Board

    Gbit/second lossless data compression hardware

    Get PDF
    This thesis investigates how to improve the performance of lossless data compression hardware as a tool to reduce the cost per bit stored in a computer system or transmitted over a communication network. Lossless data compression allows the exact reconstruction of the original data after decompression. Its deployment in some high-bandwidth applications has been hampered due to performance limitations in the compressing hardware that needs to match the performance of the original system to avoid becoming a bottleneck. Advancing the area of lossless data compression hardware, hence, offers a valid motivation with the potential of doubling the performance of the system that incorporates it with minimum investment. This work starts by presenting an analysis of current compression methods with the objective of identifying the factors that limit performance and also the factors that increase it. [Continues.

    VHDL modeling and synthesis of the JPEG-XR inverse transform

    Get PDF
    This work presents a pipelined VHDL implementation of the inverse lapped biorthogonal transform used in the decompression process of the soon to be released JPEG-XR still image standard format. This inverse transform involves integer only calculations using lifting operations and Kronecker products. Divisions and multiplications by small integer coefficients are implemented using a bit shift and add technique resulting in a multiplier-less implementation with 736 instances of addition. When targeted to an Altera Stratix II FPGA with a 50 MHz system clock, this design is capable of completing the inverse transform of an 8400 x 6600 pixel image in less than 70 ms

    NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps

    Get PDF
    Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though Graphical Processing Units (GPUs) are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture called NullHop that implements SOA CNNs useful for low-power and low-latency application scenarios. NullHop exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across kernel sizes ranging from 1x1 to 7x7. NullHop can process up to 128 input and 128 output feature maps per layer in a single pass. We implemented the proposed architecture on a Xilinx Zynq FPGA platform and present results showing how our implementation reduces external memory transfers and compute time in five different CNNs ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. Post-synthesis simulations using Mentor Modelsim in a 28nm process with a clock frequency of 500 MHz show that the VGG19 network achieves over 450 GOp/s. By exploiting sparsity, NullHop achieves an efficiency of 368%, maintains over 98% utilization of the MAC units, and achieves a power efficiency of over 3TOp/s/W in a core area of 6.3mm2^2. As further proof of NullHop's usability, we interfaced its FPGA implementation with a neuromorphic event camera for real time interactive demonstrations
    corecore