154 research outputs found

    Hardware acceleration architectures for MPEG-Based mobile video platforms: a brief overview

    Get PDF
    This paper presents a brief overview of past and current hardware acceleration (HwA) approaches that have been proposed for the most computationally intensive compression tools of the MPEG-4 standard. These approaches are classified based on their historical evolution and architectural approach. An analysis of both evolutionary and functional classifications is carried out in order to speculate on the possible trends of the HwA architectures to be employed in mobile video platforms

    CORDIC algorithm and it’s applications in DSP

    Get PDF
    OBJECTIVE: The digital signal processing landscape has long been dominated by the microprocessors with enhancements such as single cycle multiply-accumulate instructions and special addressing modes. While these processors are low cost and offer extreme flexibility, they are often not fast enough for truly demanding DSP tasks. The advent of reconfigurable logic computers permits the higher speeds of dedicated hardware solutions at costs that are competitive with the traditional software approach. Unfortunately algorithms optimized for these microprocessors based systems do not map well into hardware. While hardware efficient solutions often exist, the dominance of the software systems has kept these solutions out of the spotlight. Among these hardware- efficient algorithms is a class of iterative solutions for trigonometric and other transcendental functions that use only shifts and adds to perform. The trigonometric functions are based on vector rotations, while other functions such as square root are implemented using an incremental expression of the desired function. The trigonometric algorithm is called CORDIC an acronym for Coordinate Rotation Digital Computer. The incremental functions are performed with a very simple extension to the hardware architecture and while not CORDIC in the strict sense, are often included because of the close similarity. The CORDIC algorithms generally produce one additional bit of accuracy for each iteration. DESCRIPTION: A detailed study on various modes of CORDIC algorithm is done. First of all a study is made how the CORDIC algorithm is derived from the general vector equation. Then a study is done regarding the various modes of the CORDIC algorithm and how it can be used to find the sine, cosine, tan and logarithm functions, its use in conversion of coordinate systems. An attempt is made to carry out a rigorous study of its use in DSP oriented applications AND how it has revolutionized the DSP scenario. Finally simulations are carried out using MATLAB to support the purpose of our study. RESULTS The results clearly bring out the advantage of using CORDIC algorithm. First of all the sine and cosine of any angle could be found out easily. Similar is the case of logarithm and hyperbolic functions. The simulation results prove the fact that the hardware complexity gets reduced by using the CORDIC algorithm. A large no of plots were obtained for different 7 functions. Finally the implementation in DCT was carried out and the results obtained were in line with those of the theoretical values. CONCLUSION The CORDIC algorithms presented in this paper are well known in the research and super computing circles. Here the basic CORDIC algorithm and a partial list of potential applications of potential applications of a CORDIC based processor array to digital signal processing is presented. The CORDIC based DCT architecture for low power design has been proposed. The proposed multiplierless CORDIC based DCT architecture produces high throughput and is easy to implementing VLSI. The proposed architecture reduced the input data range for the CORDIC processor by split and the no of compensation iterations in CORDIC based DCT computation by utilizing that most images have similar neighboring pixels. The project also shows that a tool is available for use in FPGA based computing machines, which are the likely basis for the next generation DSP systems

    Design of 2D discrete cosine transform using CORDIC architectures in VHDL

    Get PDF
    The Discrete Cosine Transform is one of the most widely transform techniques in digital signal processing. In addition, this is also most computationally intensive transforms which require many multiplications and additions. Real time data processing necessitates the use of special purpose hardware which involves hardware efficiency as well as high throughput. Many DCT algorithms were proposed in order to achieve high speed DCT. Those architectures which involves multipliers, for example Chen’s algorithm has less regular architecture due to complex routing and requires large silicon area. On the other hand, the DCT architecture based on distributed arithmetic (DA) which is also a multiplier less architecture has the inherent disadvantage of less throughputs because of the ROM access time and the need of accumulator. Also this DA algorithm requires large silicon area if it requires large ROM size. Systolic array architecture for the real-time DCT computation may have the large number of gates and clock skew problem. The other ways of implementation of DCT which involves in multiplierless, thus power efficient and which results in regular architecture and less complicated routing, consequently less area, simultaneously lead to high throughput. So for that purpose CORDIC seems to be a best solution. CORDIC offers a unified iterative formulation to efficiently evaluate the rotation operation. This thesis presents the implementation of 2D Discrete Cosine Transform (DCT) using the Angle Recoded (AR) Cordic algorithm, the new scaling less CORDIC algorithm and the conventional Chen’s algorithm which is multiplier dependant algorithm. The 2D DCT is implemented by exploiting the Separability property of 2D Discrete Cosine Transform. Here first one dimensional DCT is designed first and later a transpose buffer which consists of 64 memory elements, fully pipelined is designed. Later all these blocks are joined with the help of a controller element which is a mealy type FSM which produces some status signals also. The three resulting architectures are all well synthesized in Xilinx 9.1ise, simulated in Modelsim 5.6f and the power is calculated in Xilinx Xpower. Results prove that AR Cordic algorithm is better than Chen’s algorithm, even the new scaling less CORDIC algorithm

    Energy-efficient acceleration of MPEG-4 compression tools

    Get PDF
    We propose novel hardware accelerator architectures for the most computationally demanding algorithms of the MPEG-4 video compression standard-motion estimation, binary motion estimation (for shape coding), and the forward/inverse discrete cosine transforms (incorporating shape adaptive modes). These accelerators have been designed using general low-energy design philosophies at the algorithmic/architectural abstraction levels. The themes of these philosophies are avoiding waste and trading area/performance for power and energy gains. Each core has been synthesised targeting TSMC 0.09 μm TCBN90LP technology, and the experimental results presented in this paper show that the proposed cores improve upon the prior art

    Parametrized Architecture for Hough Transform Recursive Evaluation

    Get PDF
    Paper submitted to International Workshop on Spectral Methods and Multirate Signal Processing (SMMSP), Barcelona, España, 2003.The Hough Transform (HT) is a useful technique in image segmentation, concretely for geometrical primitive detection. A Convolution-Based Recursive Method (CBRM) is presented for function evaluation. In this generic approach, calculations are carried out by an unique parametric formula which provides all function points by successive iterations. The case of combined trigonometric functions involved in the calculation of the HT is analyzed under this scope. An architecture for reconfigurable FPGA-based hardware, using Distributed Arithmetic (DA) implements the design. The CBRM implementation provides improvements such as memory and hardware resources saving, as well as a good balance between speed and error-dependable precision

    Hough Transform recursive evaluation using Distributed Arithmetic

    Get PDF
    Paper submitted to the IFIP International Conference on Very Large Scale Integration (VLSI-SOC), Darmstadt, Germany, 2003.The Hough Transform (HT) is a useful technique in image segmentation, concretely for geometrical primitive detection. A Convolution-Based Recursive Method (CBRM) is presented for generic function evaluation. In this approach, calculations are carried out by a unique parametric formula which provides all function points by successive iteration. The case of combined trigonometric functions involved in the calculation of the HT is analyzed under this scope. An architecture for reconfigurable FPGA-based hardware, using Distributed Arithmetic (DA) implements the design. It provides memory and hardware resource saving as well as speed improvements according to the experiments carried out with the HT
    corecore