5 research outputs found

    A low-power asynchronous VLSI FIR filter

    Get PDF
    An asynchronous FIR filter, based on a Single Bit-Plane architecture with a data-dependent, dynamic-logic implementation, is presented. Its energy consumption and sample computation delay are shown to correlate approximately linearly with the total number of ones in its coeflcient-set. The proposed architecture has the property that coefficients in a Sign-Magnitude representation can be handled at negligible overhead which, for typical filter coefficient-sets, is shown to offer significant benefits to both energy consumption and throughput. Transistor level simulations show energy consumption to be lower than in previously reported designs

    Conjoined Processor: A Fault Tolerant High Performance Microprocessor

    Get PDF
    Reliability has become a serious concern as systems embrace nanometer technologies. Current reliability enhancement techniques cause slowdown in processor operation. In this work, we propose a novel approach that organizes redundancy in a special way to provide high degree of fault tolerance. Our approach improves performance by reliably adapting the system clock frequency during run time, based on the current running application and environmental conditions. The organization of redundancy in the proposed conjoined processor supports overclocking, provides concurrent error detection and recovery capability for soft errors, timing errors, intermittent faults and detects silicon defects. The fast recovery process requires no checkpointing and takes three cycles. Post-layout timing annotated gate level simulations of a conjoined two stage arithmetic pipeline shows that our approach achieves near 100% fault coverage, and a performance improvement of 21%. A five stage in-order conjoined pipeline processor was designed and implemented to verify correctness of the proposed architecture

    Power-Aware Design Methodologies for FPGA-Based Implementation of Video Processing Systems

    Get PDF
    The increasing capacity and capabilities of FPGA devices in recent years provide an attractive option for performance-hungry applications in the image and video processing domain. FPGA devices are often used as implementation platforms for image and video processing algorithms for real-time applications due to their programmable structure that can exploit inherent spatial and temporal parallelism. While performance and area remain as two main design criteria, power consumption has become an important design goal especially for mobile devices. Reduction in power consumption can be achieved by reducing the supply voltage, capacitances, clock frequency and switching activities in a circuit. Switching activities can be reduced by architectural optimization of the processing cores such as adders, multipliers, multiply and accumulators (MACS), etc. This dissertation research focuses on reducing the switching activities in digital circuits by considering data dependencies in bit level, word level and block level neighborhoods in a video frame. The bit level data neighborhood dependency consideration for power reduction is illustrated in the design of pipelined array, Booth and log-based multipliers. For an array multiplier, operands of the multipliers are partitioned into higher and lower parts so that the probability of the higher order parts being zero or one increases. The gating technique for the pipelined approach deactivates part(s) of the multiplier when the above special values are detected. For the Booth multiplier, the partitioning and gating technique is integrated into the Booth recoding scheme. In addition, a delay correction strategy is developed for the Booth multiplier to reduce the switching activities of the sign extension part in the partial products. A novel architecture design for the computation of log and inverse-log functions for the reduction of power consumption in arithmetic circuits is also presented. This also utilizes the proposed partitioning and gating technique for further dynamic power reduction by reducing the switching activities. The word level and block level data dependencies for reducing the dynamic power consumption are illustrated by presenting the design of a 2-D convolution architecture. Here the similarities of the neighboring pixels in window-based operations of image and video processing algorithms are considered for reduced switching activities. A partitioning and detection mechanism is developed to deactivate the parallel architecture for window-based operations if higher order parts of the pixel values are the same. A neighborhood dependent approach (NDA) is incorporated with different window buffering schemes. Consideration of the symmetry property in filter kernels is also applied with the NDA method for further reduction of switching activities. The proposed design methodologies are implemented and evaluated in a FPGA environment. It is observed that the dynamic power consumption in FPGA-based circuit implementations is significantly reduced in bit level, data level and block level architectures when compared to state-of-the-art design techniques. A specific application for the design of a real-time video processing system incorporating the proposed design methodologies for low power consumption is also presented. An image enhancement application is considered and the proposed partitioning and gating, and NDA methods are utilized in the design of the enhancement system. Experimental results show that the proposed multi-level power aware methodology achieves considerable power reduction. Research work is progressing In utilizing the data dependencies in subsequent frames in a video stream for the reduction of circuit switching activities and thereby the dynamic power consumption

    Low Power and High Speed Multiplication Design Through Mixed Number Representations

    No full text
    A low power multiplication algorithm and its VLSI architecture using a mixed number representation is proposed. The reduced switching activity and low power dissipation are achieved through the Sign-Magnitude (SM) notation for the multiplicand and through a novel design of the Redundant Binary (RB) adder and Booth decoder. The high speed operation is achieved through the Carry-Propagation-Free (CPF} accumulation of the Partial Products (PP) by using the RB notation. Analysis showed that the switching activity in the PP generation process can be reduced on average by 90%. Compared to the same type of multipliers [I, 2, 31, the proposed design dissipates much less power and is 18 % faster on average.

    ASIC implementations of the Viterbi Algorithm

    Get PDF
    corecore