Search CORE

117 research outputs found

Power-Efficient and Low-Latency Memory Access for CMP Systems with Heterogeneous Scratchpad On-Chip Memory

Author: Chen Zhi
Publication venue: UKnowledge
Publication date: 01/01/2013
Field of study

The gradually widening speed disparity of between CPU and memory has become an overwhelming bottleneck for the development of Chip Multiprocessor (CMP) systems. In addition, increasing penalties caused by frequent on-chip memory accesses have raised critical challenges in delivering high memory access performance with tight power and latency budgets. To overcome the daunting memory wall and energy wall issues, this thesis focuses on proposing a new heterogeneous scratchpad memory architecture which is configured from SRAM, MRAM, and Z-RAM. Based on this architecture, we propose two algorithms, a dynamic programming and a genetic algorithm, to perform data allocation to different memory units, therefore reducing memory access cost in terms of power consumption and latency. Extensive and intensive experiments are performed to show the merits of the heterogeneous scratchpad architecture over the traditional pure memory system and the effectiveness of the proposed algorithms

University of Kentucky

Piecewise-polynomial associated transform macromodeling algorithm for fast nonlinear circuit simulation

Author: Fong N
Wong N
Zhang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

We present a piecewise-polynomial based associated transform algorithm (PWPAT) for macromodeling nonlinear circuits in system-level circuit design. The generated reduced model can provide both global and local accuracies with the most compact dimension. Numerical examples compare it with existing algorithms and verify its superior accuracy in higher order harmonics simulation over traditional Trajectory Piecewise-Linear (TPWL) approach. © 2013 IEEE.published_or_final_versio

HKU Scholars Hub

Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

Author: Gujarathi Hemal S
McDonald-Maier Klaus D
Qadri Muhammad Yasir
Publication venue: 'Academy Publisher'
Publication date: 01/01/2009
Field of study

The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

University of Essex Research Repository

CiteSeerX

Crossref

Recommended from our members

Scheduling and Fluid Routing for Flow-Based Microfluidic Laboratories-on-a-Chip

Author: Brisk Philip
Madsen Jan
McDaniel Jeffrey
Minhass Wajid Hassan
Pop Paul
Raagaard Michael Lander
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Microfluidic laboratories-on-a-chip (LoCs) are replacing the conventional biochemical analyzers and are able to integrate the necessary functions for biochemical analysis on-chip. There are several types of LoCs, each having its advantages and limitations. In this paper we are interested in flow-based LoCs, in which a continuous flow of liquid is manipulated using integrated microvalves. By combining several microvalves, more complex units, such as micropumps, switches, mixers, and multiplexers, can be built. We consider that the architecture of the LoC is given, and we are interested in synthesizing an implementation, consisting of the binding of operations in the application to the functional units of the architecture, the scheduling of operations and the routing and scheduling of the fluid flows, such that the application completion time is minimized. To solve this problem, we propose a list scheduling-based application mapping (LSAM) framework and evaluate it by using real-life as well as synthetic benchmarks. When biochemical applications contain fluids that may adsorb on the substrate on which they are transported, the solution is to use rinsing operations for contamination avoidance. Hence, we also propose a rinsing heuristic, which has been integrated in the LSAM framework

eScholarship - University of California

Online Research Database In Technology

Extending the performance of hybrid NoCs beyond the limitations of network heterogeneity

Author: Agyeman
Agyeman
Dally
Duato
Hu
Jerger
OpokuAgyeman
Rose
Siozios
Xie
Publication venue: 'MDPI AG'
Publication date: 01/04/2017
Field of study

To meet the performance and scalability demands of the fast-paced technological growth towards exascale and Big-Data processing with the performance bottleneck of conventional metal based interconnects (wireline), alternative interconnect fabrics such as inhomogeneous three-dimensional integrated Network-on-Chip (3D NoC) and hybrid wired-wireless Network-on-Chip (WiNoC) have emanated as a cost-effective solution for emerging System-on-Chip (SoC) design. However, these interconnects trade-off optimized performance for cost by restricting the number of area and power hungry 3D routers and wireless nodes. Moreover, the non-uniform distributed traffic in chip multiprocessor (CMP) demands an on-chip communication infrastructure which can avoid congestion under high traffic conditions while possessing minimal pipeline delay at low-load conditions. To this end, in this paper, we propose a low-latency adaptive router with a low-complexity single-cycle bypassing mechanism to alleviate the performance degradation due to the slow 2D routers in such emerging hybrid NoCs. The proposed router transmits a flit using dimension-ordered routing (DoR) in the bypass datapath at low-loads. When the output port required for intra-dimension bypassing is not available, the packet is routed adaptively to avoid congestion. The router also has a simplified virtual channel allocation (VA) scheme that yields a non-speculative low-latency pipeline. By combining the low-complexity bypassing technique with adaptive routing, the proposed router is able balance the traffic in hybrid NoCs to achieve low-latency communication under various traffic loads. Simulation shows that, the proposed router can reduce applications’ execution time by an average of 16.9% compared to low-latency routers such as SWIFT. By reducing the latency between 2D routers (or wired nodes) and 3D routers (or wireless nodes) the proposed router can improve performance efficiency in terms of average packet delay by an average of 45% (or 50%) in 3D NoCs (or WiNoCs)

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

University of Northampton's Research Explorer

UCL Discovery

NECTAR

A multiobjective reconfiguration-aware scheduler for FPGA-based heterogeneous architectures

Author: Cattaneo Riccardo
Deiana Enrico A.
Rabozzi Marco
Santambrogio Marco D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Design methodology and productivity improvement in high speed VLSI circuits

Author: Hossain KM Mozammel
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2017
Field of study

2017 Spring.Includes bibliographical references.To view the abstract, please see the full text of the document

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Review of Display Technologies Focusing on Power Consumption

Author: González Alonso Ignacio
Rodríguez Fernández María
Zalama Casanova Eduardo
Publication venue: 'MDPI AG'
Publication date: 01/01/2015
Field of study

Producción CientíficaThis paper provides an overview of the main manufacturing technologies of displays, focusing on those with low and ultra-low levels of power consumption, which make them suitable for current societal needs. Considering the typified value obtained from the manufacturer’s specifications, four technologies—Liquid Crystal Displays, electronic paper, Organic Light-Emitting Display and Electroluminescent Displays—were selected in a first iteration. For each of them, several features, including size and brightness, were assessed in order to ascertain possible proportional relationships with the rate of consumption. To normalize the comparison between different display types, relative units such as the surface power density and the display frontal intensity efficiency were proposed. Organic light-emitting display had the best results in terms of power density for small display sizes. For larger sizes, it performs less satisfactorily than Liquid Crystal Displays in terms of energy efficiency.Junta de Castilla y León (Programa de apoyo a proyectos de investigación-Ref. VA036U14)Junta de Castilla y León (programa de apoyo a proyectos de investigación - Ref. VA013A12-2)Ministerio de Economía, Industria y Competitividad (Grant DPI2014-56500-R

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Documental de la Universidad de Valladolid

Repositorio Institucional de la Universidad de Oviedo

Directory of Open Access Journals

The Challenges of Hardware Synthesis from C-like Languages

Author: Edwards Stephen A.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2004
Field of study

The relentless increase in the complexity of integrated circuits we can fabricate imposes a continuing need for ways to describe complex hardware succinctly. Because of their ubiquity and flexibility, many have proposed to use the C and C++ languages as specification languages for digital hardware. Yet, tools based on this idea have seen little commercial interest. In this paper, I argue that C/C++ is a poor choice for specifying hardware for synthesis and suggest a set of criteria that the next successful hardware description language should have

CiteSeerX

Columbia University Academic Commons

A Multi-layer Fpga Framework Supporting Autonomous Runtime Partial Reconfiguration

Author: Tan Heng
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2007
Field of study

Partial reconfiguration is a unique capability provided by several Field Programmable Gate Array (FPGA) vendors recently, which involves altering part of the programmed design within an SRAM-based FPGA at run-time. In this dissertation, a Multilayer Runtime Reconfiguration Architecture (MRRA) is developed, evaluated, and refined for Autonomous Runtime Partial Reconfiguration of FPGA devices. Under the proposed MRRA paradigm, FPGA configurations can be manipulated at runtime using on-chip resources. Operations are partitioned into Logic, Translation, and Reconfiguration layers along with a standardized set of Application Programming Interfaces (APIs). At each level, resource details are encapsulated and managed for efficiency and portability during operation. An MRRA mapping theory is developed to link the general logic function and area allocation information to the device related physical configuration level data by using mathematical data structure and physical constraints. In certain scenarios, configuration bit stream data can be read and modified directly for fast operations, relying on the use of similar logic functions and common interconnection resources for communication. A corresponding logic control flow is also developed to make the entire process autonomous. Several prototype MRRA systems are developed on a Xilinx Virtex II Pro platform. The Virtex II Pro on-chip PowerPC core and block RAM are employed to manage control operations while multiple physical interfaces establish and supplement autonomous reconfiguration capabilities. Area, speed and power optimization techniques are developed based on the developed Xilinx prototype. Evaluations and analysis of these prototype and techniques are performed on a number of benchmark and hashing algorithm case studies. The results indicate that based on a variety of test benches, up to 70% reduction in the resource utilization, up to 50% improvement in power consumption, and up to 10 times increase in run-time performance are achieved using the developed architecture and approaches compared with Xilinx baseline reconfiguration flow. Finally, a Genetic Algorithm (GA) for a FPGA fault tolerance case study is evaluated as a ultimate high-level application running on this architecture. It demonstrated that this is a hardware and software infrastructure that enables an FPGA to dynamically reconfigure itself efficiently under the control of a soft microprocessor core that is instantiated within the FPGA fabric. Such a system contributes to the observed benefits of intelligent control, fast reconfiguration, and low overhead

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)