149 research outputs found
Perceptual Zero-Tree Coding with Efficient Optimization for Embedded Platforms
This study proposes a block-edge-based perceptual zero-tree coding (PZTC) method, which is implemented with efficientoptimization on the embedded platform. PZTC combines two novel compression concepts for coding efficiency and quality:block-edge detection (BED) and the low-complexity and low-memory entropy coder (LLEC). The proposed PZTC wasimplemented as a fixed-point version and optimized on the DSP-based platform based on both the presented platformindependentand platform-dependent optimization technologies. For platform-dependent optimization, this study examinesthe fixed-point PZTC and analyzes the complexity to optimize PZTC toward achieving an optimal coding efficiency.Furthermore, hardware-based platform-dependent optimizations are presented to reduce the memory size. Theperformance, such as compression quality and efficiency, is validated by experimental results
Analysis of runtime re-configuration systems
In recent years Programmable Logic Devices (PLD) and in particular Field Programmable Gate Arrays (FPGAs) have seen a tremendous increase in sales and applications in the area of embedded systems. The main advantage of FPGAs is the flexibility that they offer a designer in reconfiguring the hardware. The flexibility achieved through re-configuration of FPGAs usually incurs an overhead of extra execution time, data memory and also power dissipation; FPGAs provide an ideal template for run-time reconfigurable (RTR) designs. Only recently have RTR enabling design tools that bypass the traditional synthesis and bitstream generation process for FPGAs become available, JBits is one of them. With run-time reconfiguration of FPGAs, we can perform partial reconfiguration, which allows reconfiguration of a part of an FPGA while the other part is executing some functional computation. The partial reconfiguration of a function can be performed earlier than the time when the function is really needed. Such configuration pre-fetch can hide the reconfiguration overhead more effectively; This thesis will implement a reconfigurable system and study the effect of runtime reconfiguration using VERILOG and a new Java based tool JBITS. This work will provide pointers to high level synthesis tools targeting runtime re-configuration
Throughput-Distortion Computation Of Generic Matrix Multiplication: Toward A Computation Channel For Digital Signal Processing Systems
The generic matrix multiply (GEMM) function is the core element of
high-performance linear algebra libraries used in many
computationally-demanding digital signal processing (DSP) systems. We propose
an acceleration technique for GEMM based on dynamically adjusting the
imprecision (distortion) of computation. Our technique employs adaptive scalar
companding and rounding to input matrix blocks followed by two forms of packing
in floating-point that allow for concurrent calculation of multiple results.
Since the adaptive companding process controls the increase of concurrency (via
packing), the increase in processing throughput (and the corresponding increase
in distortion) depends on the input data statistics. To demonstrate this, we
derive the optimal throughput-distortion control framework for GEMM for the
broad class of zero-mean, independent identically distributed, input sources.
Our approach converts matrix multiplication in programmable processors into a
computation channel: when increasing the processing throughput, the output
noise (error) increases due to (i) coarser quantization and (ii) computational
errors caused by exceeding the machine-precision limitations. We show that,
under certain distortion in the GEMM computation, the proposed framework can
significantly surpass 100% of the peak performance of a given processor. The
practical benefits of our proposal are shown in a face recognition system and a
multi-layer perceptron system trained for metadata learning from a large music
feature database.Comment: IEEE Transactions on Signal Processing (vol. 60, 2012
An Efficient Power Estimation Methodology for Complex RISC Processor-based Platforms
International audienceIn this contribution, we propose an efficient power estima- tion methodology for complex RISC processor-based plat- forms. In this methodology, the Functional Level Power Analysis (FLPA) is used to set up generic power models for the different parts of the system. Then, a simulation framework based on virtual platform is developed to evalu- ate accurately the activities used in the related power mod- els. The combination of the two parts above leads to a het- erogeneous power estimation that gives a better trade-off be- tween accuracy and speed. The usefulness and effectiveness of our proposed methodology is validated through ARM9 and ARM CortexA8 processor designed respectively around the OMAP5912 and OMAP3530 boards. This efficiency and the accuracy of our proposed methodology is evaluated by using a variety of basic programs to complete media bench- marks. Estimated power values are compared to real board measurements for the both ARM940T and ARM CortexA8 architectures. Our obtained power estimation results pro- vide less than 3% of error for ARM940T processor, 3.5% for ARM CortexA8 processor-based system and 1x faster compared to the state-of-the-art power estimation tools
Recommended from our members
Hardware and Software Codesign of a JPEG2000 Watermarking Encoder
Analog technology has been around for a long time. The use of analog technology is necessary since we live in an analog world. However, the transmission and storage of analog technology is more complicated and in many cases less efficient than digital technology. Digital technology, on the other hand, provides fast means to be transmitted and stored. Digital technology continues to grow and it is more widely used than ever before. However, with the advent of new technology that can reproduce digital documents or images with unprecedented accuracy, it poses a risk to the intellectual rights of many artists and also on personal security. One way to protect intellectual rights of digital works is by embedding watermarks in them. The watermarks can be visible or invisible depending on the application and the final objective of the intellectual work. This thesis deals with watermarking images in the discrete wavelet transform domain. The watermarking process was done using the JPEG2000 compression standard as a platform. The hardware implementation was achieved using the ALTERA DSP Builder and SIMULINK software to program the DE2 ALTERA FPGA board. The JPEG2000 color transform and the wavelet transformation blocks were implemented using the hardware-in-the-loop (HIL) configuration
Hardware Implementation of DWT for Image compression using SPIHT Algorithm
Abstract-In this paper,a DWT-based image processing system is developed on Xilinx Spartan3 Field Programmable Gate Array (FPGA) device using embedded development kit (EDK) tools from Xilinx. Two different hardware architectures of two dimensional (2-D) DWT have been implemented as a coprocessor in an embedded system. One is direct implementation of 2-D DWT by cascading two 1-D DWT. Another is 2-D DWT implementation with control and architecture optimization. In addition, the hardware cost of these two architectures is compared for benchmark images
- âŠ