Search CORE

1,141 research outputs found

Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

Author: Gujarathi Hemal S
McDonald-Maier Klaus D
Qadri Muhammad Yasir
Publication venue: 'Academy Publisher'
Publication date: 01/01/2009
Field of study

The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

University of Essex Research Repository

CiteSeerX

Crossref

Photonics design tool for advanced CMOS nodes

Author: Alloatti Luca
Popovic Milos
Ram Rajeev Jagga
Stojanovic Vladimir
Wade Mark
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/04/2015
Field of study

Recently, the authors have demonstrated large-scale integrated systems with several million transistors and hundreds of photonic elements. Yielding such large-scale integrated systems requires a design-for-manufacture rigour that is embodied in the 10 000 to 50 000 design rules that these designs must comply within advanced complementary metal-oxide semiconductor manufacturing. Here, the authors present a photonic design automation tool which allows automatic generation of layouts without design-rule violations. This tool is written in SKILL, the native language of the mainstream electric design automation software, Cadence. This allows seamless integration of photonic and electronic design in a single environment. The tool leverages intuitive photonic layer definitions, allowing the designer to focus on the physical properties rather than on technology-dependent details. For the first time the authors present an algorithm for removal of design-rule violations from photonic layouts based on Manhattan discretisation, Boolean and sizing operations. This algorithm is not limited to the implementation in SKILL, and can in principle be implemented in any scripting language. Connectivity is achieved with software-defined waveguide ports and low-level procedures that enable auto-routing of waveguide connections.Comment: 5 pages, 10 figure

arXiv.org e-Print Archive

DSpace@MIT

The Chameleon Architecture for Streaming DSP Applications

Author: Burgwal Marcel D. van de
Heysters Paul M.
Hölzenspies Philip K.F.
Kokkeler André B.J.
Smit Gerard J.M.
Wolkotte Pascal T.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2007
Field of study

We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm

^2

in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

University of Twente Research Information

Implementation of arithmetic primitives using truly deep submicron technology (TDST)

Author: Eshraghian Sholeh
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2004
Field of study

The invention of the transistor in 1947 at Bell Laboratories revolutionised the electronics industry and created a powerful platform for emergence of new industries. The quest to increase the number of devices per chip over the last four decades has resulted in rapid transition from Small-Scale-Integration (SSI) and Large-Scale-lntegration (LSI), through to the Very-Large-Scale-Integration (VLSI) technologies, incorporating approximately 10 to 100 million devices per chip. The next phase in this evolution is the Ultra-Large-Scale-Integration (ULSI) aiming to realise new application domains currently not accessible to CMOS technology. Although technology is continuously evolving to produce smaller systems with minimised power dissipation, the IC industry is facing major challenges due to constraints on power density (W/cm2) and high dynamic (operating) and static (standby) power dissipation. Mobile multimedia communication and optical based technologies have rapidly become a significant area of research and development challenging a variety of technological fronts. The future emergence or 4G (4th Generation) wireless communications networks is further driving this development, requiring increasing levels of media rich content. The processing requirements for capture, conversion, compression, decompression, enhancement and display of higher quality multimedia, place heavy demands on current ULSI systems. This is also apparent for mobile applications and intelligent optical networks where silicon chip area and power dissipation become primary considerations. In addition to the requirements for very low power, compact size and real-time processing, the rapidly evolving nature of telecommunication networks means that flexible soft programmable systems capable of adaptation to support a number of different standards and/or roles become highly desirable. In order to fully realise the capabilities promised by the 4G and supporting intelligent networks, new enabling technologies arc needed to facilitate the next generation of personal communications devices. Most of the current solutions to meet these challenges are based on various implementations of conventional architectures. For decades, silicon has been the main platform of computing, however it is slow, bulky, runs too hot, and is too expensive. Thus, new approaches to architectures, driving multimedia and future telecommunications systems, are needed in order to extend the life cycle of silicon technology. The emergence of Truly Deep Submicron Technology (TDST) and related 3-D interconnection technologies have provided potential alternatives from conventional architectures to 3-D system solutions, through integration of IDST, Vertical Software Mapping and Intelligent Interconnect Technology (IIT). The concept of Soft-Chip Technology (SCT) entails integration of Soft• Processing Circuits with Soft-Configurable Circuits . This concept can effectively manipulate hardware primitives through vertical integration of control and data. Thus the notion of 3-D Soft-Chip emerges as a new design algorithm for content-rich multimedia, telecommunication and intelligent networking system applications. 3•D architectures (design algorithms used suitable for 3-D soft-chip technology), are driven by three factors. The first is development of new device technology (TDST) that can support new architectures with complexities of 100M to 1000M devices. The second is development of advanced wafer bonding techniques such as Indium bump and the more futuristic optical interconnects for 3-D soft-chip mapping. The third is related to improving the performance of silicon CMOS systems as devices continue to scale down in dimensions. One of the fundamental building blocks of any computer system is the arithmetic component. Optimum performance of the system is determined by the efficiency of each individual component, as well as the network as a whole entity. Development of configurable arithmetic primitives is the fundamental focus in 3-D architecture design where functionality can be implemented through soft configurable hardware elements. Therefore the ability to improve the performance capability of a system is of crucial importance for a successful design. Important factors that predict the efficiency of such arithmetic components are: • The propagation delay of the circuit, caused by the gate, diffusion and wire capacitances within !he circuit, minimised through transistor sizing. and • Power dissipation, which is generally based on node transition activity. [2] Although optimum performance of 3-D soft-chip systems is primarily established by the choice of basic primitives such as adders and multipliers, the interconnecting network also has significant degree of influence on !he efficiency of the system. 3-D superposition of devices can decrease interconnect delays by up to 60% compared to a similar planar architecture. This research is based on development and implementation of configurable arithmetic primitives, suitable to the 3-D architecture, and has these foci: • To develop a variety of arithmetic components such as adders and multipliers with particular emphasis on minimum area and compatible with 3-D soft-chip design paradigm. • To explore implementation of configurable distributed primitives for arithmetic processing. This entails optimisation of basic primitives, and using them as part of array processing. In this research the detailed designs of configurable arithmetic primitives are implemented using TDST O.l3µm (130nm) technology, utilising CAD software such as Mentor Graphics and Cadence in Custom design mode, carrying through design, simulation and verification steps

Research Online @ ECU

Modeling of Thermally Aware Carbon Nanotube and Graphene Based Post CMOS VLSI Interconnect

Author: Mohsin K M
Publication venue: LSU Digital Commons
Publication date: 06/11/2017
Field of study

This work studies various emerging reduced dimensional materials for very large-scale integration (VLSI) interconnects. The prime motivation of this work is to find an alternative to the existing Cu-based interconnect for post-CMOS technology nodes with an emphasis on thermal stability. Starting from the material modeling, this work includes material characterization, exploration of electronic properties, vibrational properties and to analyze performance as a VLSI interconnect. Using state of the art density functional theories (DFT) one-dimensional and two-dimensional materials were designed for exploring their electronic structures, transport properties and their circuit behaviors. Primarily carbon nanotube (CNT), graphene and graphene/copper based interconnects were studied in this work. Being reduced dimensional materials the charge carriers in CNT(1-D) and in graphene (2-D) are quantum mechanically confined as a result of this free electron approximation fails to explain their electronic properties. For same reason Drude theory of metals fails to explain electronic transport phenomena. In this work Landauer transport theories using non-equilibrium Green function (NEGF) formalism was used for carrier transport calculation. For phonon transport studies, phenomenological Fourier’s heat diffusion equation was used for longer interconnects. Semi-classical BTE and Landauer transport for phonons were used in cases of ballistic phonon transport regime. After obtaining self-consistent electronic and thermal transport coefficients, an equivalent circuit model is proposed to analyze interconnects’ electrical performances. For material studies, CNTs of different variants were analyzed and compared with existing copper based interconnects and were found to be auspicious contenders with integrational challenges. Although, Cu based interconnect is still outperforming other emerging materials in terms of the energy-delay product (1.72 fJ-ps), considering the electromigration resistance graphene Cu hybrid interconnect proposed in this dissertation performs better. Ten times more electromigration resistance is achievable with the cost of only 30% increase in energy-delay product. This unique property of this proposed interconnect also outperforms other studied alternative materials such as multiwalled CNT, single walled CNT and their bundles

Louisiana State University

Multilayer Spiking Neural Network for Audio Samples Classification Using SpiNNaker

Author: Cerezuela Escudero Elena
Domínguez Morales Juan Pedro
Domínguez Morales Manuel Jesús
Gutiérrez Galán Daniel
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Ríos Navarro José Antonio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

Audio classification has always been an interesting subject of research inside the neuromorphic engineering field. Tools like Nengo or Brian, and hardware platforms like the SpiNNaker board are rapidly increasing in popularity in the neuromorphic community due to the ease of modelling spiking neural networks with them. In this manuscript a multilayer spiking neural network for audio samples classification using SpiNNaker is presented. The network consists of different leaky integrate-and-fire neuron layers. The connections between them are trained using novel firing rate based algorithms and tested using sets of pure tones with frequencies that range from 130.813 to 1396.91 Hz. The hit rate percentage values are obtained after adding a random noise signal to the original pure tone signal. The results show very good classification results (above 85 % hit rate) for each class when the Signal-to-noise ratio is above 3 decibels, validating the robustness of the network configuration and the training step.Ministerio de Economía y Competitividad TEC2012-37868-C04-02Junta de Andalucía P12-TIC-130

idUS. Depósito de Investigación Universidad de Sevilla