19,106 research outputs found

    Dynamic Power Management for Neuromorphic Many-Core Systems

    Full text link
    This work presents a dynamic power management architecture for neuromorphic many core systems such as SpiNNaker. A fast dynamic voltage and frequency scaling (DVFS) technique is presented which allows the processing elements (PE) to change their supply voltage and clock frequency individually and autonomously within less than 100 ns. This is employed by the neuromorphic simulation software flow, which defines the performance level (PL) of the PE based on the actual workload within each simulation cycle. A test chip in 28 nm SLP CMOS technology has been implemented. It includes 4 PEs which can be scaled from 0.7 V to 1.0 V with frequencies from 125 MHz to 500 MHz at three distinct PLs. By measurement of three neuromorphic benchmarks it is shown that the total PE power consumption can be reduced by 75%, with 80% baseline power reduction and a 50% reduction of energy per neuron and synapse computation, all while maintaining temporary peak system performance to achieve biological real-time operation of the system. A numerical model of this power management model is derived which allows DVFS architecture exploration for neuromorphics. The proposed technique is to be used for the second generation SpiNNaker neuromorphic many core system

    Energy-efficient acceleration of MPEG-4 compression tools

    Get PDF
    We propose novel hardware accelerator architectures for the most computationally demanding algorithms of the MPEG-4 video compression standard-motion estimation, binary motion estimation (for shape coding), and the forward/inverse discrete cosine transforms (incorporating shape adaptive modes). These accelerators have been designed using general low-energy design philosophies at the algorithmic/architectural abstraction levels. The themes of these philosophies are avoiding waste and trading area/performance for power and energy gains. Each core has been synthesised targeting TSMC 0.09 μm TCBN90LP technology, and the experimental results presented in this paper show that the proposed cores improve upon the prior art

    Low Power Architectures for MPEG-4 AVC/H.264 Video Compression

    Get PDF

    Energy Aware Design and Analysis for Synchronous and Asynchronous Circuits

    Get PDF
    Power dissipation has become a major concern for IC designers. Various low power design techniques have been developed for synchronous circuits. Asynchronous circuits, however. have gained more interests recently due to their benefits in lower noise, easy timing control, etc. But few publications on energy reduction techniques for asynchronous logic are available. Power awareness indicates the ability of the system power to scale with changing conditions and quality requirements. Scalability is an important figure-of-merit since it allows the end user to implement operational policy. just like the user of mobile multimedia equipment needs to select between better quality and longer battery operation time. This dissertation discusses power/energy optimization and performs analysis on both synchronous and asynchronous logic. The major contributions of this dissertation include: 1 ) A 2-Dimensional Pipeline Gating technique for synchronous pipelined circuits to improve their power awareness has been proposed. This technique gates the corresponding clock lines connected to registers in both vertical direction (the data flow direction) and horizontal direction (registers within each pipeline stage) based on current input precision. 2) Two energy reduction techniques, Signal Bypassing & Insertion and Zero Insertion. have been developed for NCL circuits. Both techniques use Nulls to replace redundant Data 0\u27s based on current input precision in order to reduce the switching activity while Signal Bypassing & Insertion is for non-pipelined NCI, circuits and Zero Insertion is for pipelined counterparts. A dynamic active-bit detection scheme is also developed as an expansion. 3) Two energy estimation techniques, Equivalent Inverter Modeling based on Input Mapping in transistor-level and Switching Activity Modeling in gate-level, have been proposed. The former one is for CMOS gates with feedbacks and the latter one is for NCL circuits

    Implementing video compression algorithms on reconfigurable devices

    Get PDF
    The increasing density offered by Field Programmable Gate Arrays(FPGA), coupled with their short design cycle, has made them a popular choice for implementing a wide range of algorithms and complete systems. In this thesis the implementation of video compression algorithms on FPGAs is studied. Two areas are specifically focused on; the integration of a video encoder into a complete system and the power consumption of FPGA based video encoders. Two FPGA based video compression systems are described, one which targets surveillance applications and one which targets video conferencing applications. The FPGA video surveillance system makes use of a novel memory format to improve the efficiency with which input video sequences can be loaded over the system bus. The power consumption of a FPGA video encoder is analyzed. The results indicating that the motion estimation encoder stage requires the most power consumption. An algorithm, which reuses the intra prediction results generated during the encoding process, is then proposed to reduce the power consumed on an FPGA video encoder’s external memory bus. Finally, the power reduction algorithm is implemented within an FPGA video encoder. Results are given showing that, in addition to reducing power on the external memory bus, the algorithm also reduces power in the motion estimation stage of a FPGA based video encoder

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Get PDF
    The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

    Static noise margin analysis for CMOS logic cells in near-threshold

    Get PDF
    The advancement of semiconductor technology enabled the fabrication of devices with faster switching activity and chips with higher integration density. However, these advances are facing new impediments related to energy and power dissipation. Besides, the increasing demand for portable devices leads the circuit design paradigm to prioritize energy efficiency instead of performance. Altogether, this scenario motivates engineers towards reducing the supply voltage to the near and subthreshold regime to increase the lifespan of battery-powered devices. Even though operating in these regime offer interesting energy-frequency trade-offs, it brings challenges concerning noise tolerance. As the supply voltage reduces, the available noise margins decrease, and circuits become more prone to functional failures. In addition, near and subthreshold circuits are more susceptible to manufacturing variability, hence further aggravating noise issues. Other issues, such as wire minimization and gate fan-out, also contribute to the relevance of evaluating the noise margin of circuits early in the design Accordingly, this work investigates how to improve the static noise margin of digital synchronous circuits that will operate at the near/subthreshold regime. This investigation produces a set of three original contributions. The first is an automated tool to estimate the static noise margin of CMOS combinational cells. The second contribution is a realistic static noise margin estimation methodology that considers process-voltage-temperature variations. Results show that the proposed methodology allows to reduce up to 70% of the static noise margin pessimism. Finally, the third contribution is the noise-aware cell design methodology and the inclusion of a noise evaluation of complex circuits during the logic synthesis. The resulting library achieved higher static noise margin (up to 24%) and less spread among different cells (up to 62%).Os avanços na tecnologia de semicondutores possibilitou que se fabricasse dispositivos com atividade de chaveamento mais rápida e com maior capacidade de integração de transistores. Estes avanços, todavia, impuseram novos empecilhos relacionados com a dissipação de potência e energia. Além disso, a crescente demanda por dispositivos portáteis levaram à uma mudança no paradigma de projeto de circuitos para que se priorize energia ao invés de desempenho. Este cenário motivou à reduzir a tensão de alimentação com qual os dispositivos operam para um regime próximo ou abaixo da tensão de limiar, com o objetivo de aumentar sua duração de bateria. Apesar desta abordagem balancear características de performance e energia, ela traz novos desafios com relação a tolerância à ruído. Ao reduzirmos a tensão de alimentação, também reduz-se a margem de ruído disponível e, assim, os circuitos tornam-se mais suscetíveis à falhas funcionais. Somado à este efeito, circuitos com tensões de alimentação nestes regimes são mais sensíveis à variações do processo de fabricação, logo agravando problemas com ruído. Existem também outros aspectos, tais como a miniaturização das interconexões e a relação de fan-out de uma célula digital, que incentivam a avaliação de ruído nas fases iniciais do projeto de circuitos integrados Por estes motivos, este trabalho investiga como aprimorar a margem de ruído estática de circuitos síncronos digitais que irão operar em tensões no regime de tensão próximo ou abaixo do limiar. Esta investigação produz um conjunto de três contribuições originais. A primeira é uma ferramenta capaz de avaliar automaticamente a margem de ruído estática de células CMOS combinacionais. A segunda contribuição é uma metodologia realista para estimar a margem de ruído estática considerando variações de processo, tensão e temperatura. Os resultados obtidos mostram que a metodologia proposta permitiu reduzir até 70% do pessimismo das margens de ruído estática, Por último, a terceira contribuição é um fluxo de projeto de células combinacionais digitais considerando ruído, e uma abordagem para avaliar a margem de ruído estática de circuitos complexos durante a etapa de síntese lógica. A biblioteca de células resultante deste fluxo obteve maior margem de ruído (até 24%) e menor variação entre diferentes células (até 62%)
    corecore