Power-Efficient and Low-Latency Memory Access for CMP Systems with Heterogeneous Scratchpad On-Chip Memory

Author: Chen Zhi
Publication venue: UKnowledge
Publication date: 01/01/2013

The gradually widening speed disparity between CPU and memory has become an overwhelming bottleneck for the development of Chip Multiprocessor (CMP) systems. In addition, increasing penalties caused by frequent on-chip memory accesses have raised critical challenges in delivering high memory access performance within tight power and latency budgets. To overcome the daunting memory wall and energy wall issues, this thesis proposes a new heterogeneous scratchpad memory architecture configured from SRAM, MRAM, and Z-RAM. Based on this architecture, we propose two algorithms, a dynamic programming algorithm and a genetic algorithm, that allocate data to the different memory units, thereby reducing memory access cost in terms of power consumption and latency. Extensive experiments show the merits of the heterogeneous scratchpad architecture over the traditional pure memory system and the effectiveness of the proposed algorithms.
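As a rough illustration of the dynamic-programming allocation idea in this abstract, the sketch below assigns each data item to one memory unit so that the total access cost is minimal under capacity limits. It is not the thesis's algorithm: the function name `allocate`, the single per-access cost per memory, and all numbers are hypothetical, and the abstract does not give real SRAM, MRAM, or Z-RAM costs.

```python
from functools import lru_cache

def allocate(items, memories):
    """Minimum-cost assignment of data items to memory units.

    items:    list of (size, access_count)
    memories: list of (name, capacity, cost_per_access)  # illustrative costs
    Returns (total_cost, tuple of memory indices), or None if nothing fits.
    """
    n = len(items)

    @lru_cache(maxsize=None)
    def best(i, free):
        # State: next item index and the remaining capacity of every memory.
        if i == n:
            return 0, ()
        size, accesses = items[i]
        result = None
        for m, (_, _, cost) in enumerate(memories):
            if free[m] < size:
                continue  # item does not fit in this memory
            left = free[:m] + (free[m] - size,) + free[m + 1:]
            sub = best(i + 1, left)
            if sub is None:
                continue  # remaining items cannot be placed
            total = accesses * cost + sub[0]
            if result is None or total < result[0]:
                result = (total, (m,) + sub[1])
        return result

    return best(0, tuple(cap for _, cap, _ in memories))

# Hypothetical example: fast and small, slower and larger, and a big fallback.
memories = [("SRAM", 2, 1), ("MRAM", 4, 3), ("MAIN", 100, 10)]
items = [(2, 50), (2, 40), (2, 5)]
cost, placement = allocate(items, memories)  # (185, (0, 1, 1))
```

The state space grows with the product of the capacities, so this form only suits small scratchpads; that growth may be one reason a heuristic such as a genetic algorithm is attractive for larger instances.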

University of Kentucky

Extending the performance of hybrid NoCs beyond the limitations of network heterogeneity

Author: Agyeman
Agyeman
Dally
Duato
Hu
Jerger
OpokuAgyeman
Rose
Siozios
Xie
Publication venue: 'MDPI AG'
Publication date: 01/04/2017

To meet the performance and scalability demands of the fast-paced technological growth towards exascale and Big-Data processing, and given the performance bottleneck of conventional metal-based (wireline) interconnects, alternative interconnect fabrics such as inhomogeneous three-dimensional integrated Network-on-Chip (3D NoC) and hybrid wired-wireless Network-on-Chip (WiNoC) have emerged as a cost-effective solution for emerging System-on-Chip (SoC) design. However, these interconnects trade off optimized performance for cost by restricting the number of area- and power-hungry 3D routers and wireless nodes. Moreover, the non-uniformly distributed traffic in chip multiprocessors (CMPs) demands an on-chip communication infrastructure that avoids congestion under high traffic conditions while keeping pipeline delay minimal at low load. To this end, we propose a low-latency adaptive router with a low-complexity single-cycle bypassing mechanism to alleviate the performance degradation caused by the slow 2D routers in such emerging hybrid NoCs. The proposed router transmits a flit using dimension-ordered routing (DoR) in the bypass datapath at low loads. When the output port required for intra-dimension bypassing is not available, the packet is routed adaptively to avoid congestion. The router also has a simplified virtual channel allocation (VA) scheme that yields a non-speculative low-latency pipeline. By combining the low-complexity bypassing technique with adaptive routing, the proposed router is able to balance the traffic in hybrid NoCs and achieve low-latency communication under various traffic loads. Simulation shows that the proposed router can reduce applications’ execution time by an average of 16.9% compared to low-latency routers such as SWIFT.
By reducing the latency between 2D routers (or wired nodes) and 3D routers (or wireless nodes), the proposed router can improve performance efficiency in terms of average packet delay by an average of 45% (or 50%) in 3D NoCs (or WiNoCs).
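The routing decision described in this abstract (take the dimension-ordered port through the bypass path when it is free, otherwise route adaptively) can be sketched for a plain 2D mesh as follows. This is a simplified illustration, not the paper's router: the port names, the `busy` set, the `credits` map, and the "most free credits" selection rule are all assumptions, and virtual channel allocation and deadlock avoidance are left out.

```python
def productive_ports(cur, dst):
    """Minimal-path output ports on a 2D mesh, X dimension first (DoR order)."""
    (x, y), (dx, dy) = cur, dst
    ports = []
    if dx > x:
        ports.append("E")
    elif dx < x:
        ports.append("W")
    if dy > y:
        ports.append("N")
    elif dy < y:
        ports.append("S")
    return ports

def route(cur, dst, busy, credits):
    """Return (output_port, mode) for one flit.

    busy:    set of output ports currently unavailable (hypothetical input)
    credits: free downstream buffer slots per port (hypothetical congestion metric)
    """
    ports = productive_ports(cur, dst)
    if not ports:
        return "LOCAL", "eject"          # flit has arrived
    dor = ports[0]
    if dor not in busy:
        return dor, "bypass"             # low load: dimension-ordered bypass path
    free = [p for p in ports if p not in busy]
    if free:
        # DoR port blocked: pick the least congested remaining minimal port.
        return max(free, key=lambda p: credits.get(p, 0)), "adaptive"
    return dor, "wait"                   # every minimal port is blocked
```

For example, `route((0, 0), (2, 1), {"E"}, {"N": 3})` falls back to the north port because the east port, which DoR would pick, is busy.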

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

University of Northampton's Research Explorer

UCL Discovery

NECTAR

Balanced truncation for time-delay systems via approximate gramians

Author: Chen Q
Wang Q
Wang X
Wong N
Zhang Z
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

In circuit simulation, when a large RLC network is connected with delay elements, such as transmission lines, the resulting system is a time-delay system (TDS). This paper presents a new model order reduction (MOR) scheme for TDSs with state time delays. This is the first time a TDS has been reduced using balanced truncation. The Lyapunov-type equations for TDSs are derived, and an analysis of their computational complexity is presented. To reduce the computational cost, we approximate the controllability and observability Gramians in the frequency domain. The reduced-order models (ROMs) are then obtained by balancing and truncating the approximate Gramians. Numerical examples are presented to verify the accuracy and efficiency of the proposed algorithm. © 2011 IEEE. The 16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011), Yokohama, Japan, 25-28 January 2011. In Proceedings of the 16th ASP-DAC, 2011, p. 55-60, paper 1C-
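One common way to write frequency-domain Gramians for a system with a single state delay is sketched below. The abstract does not give the paper's exact formulation, so the symbols $A_d$, $\tau$, the sample frequencies $\omega_k$, and the quadrature weights $w_k$ are assumptions made for illustration, and the integrals presume a stable system.

```latex
\begin{align}
  % Assumed single-delay state-space form
  \dot{x}(t) &= A\,x(t) + A_d\,x(t-\tau) + B\,u(t), \qquad y(t) = C\,x(t) \\
  % Characteristic matrix of the delay system
  K(s) &= sI - A - A_d\,e^{-s\tau} \\
  % Controllability and observability Gramians via Parseval's theorem
  P &= \frac{1}{2\pi}\int_{-\infty}^{\infty} K(j\omega)^{-1} B B^{T} K(j\omega)^{-H}\,d\omega \\
  Q &= \frac{1}{2\pi}\int_{-\infty}^{\infty} K(j\omega)^{-H} C^{T} C\,K(j\omega)^{-1}\,d\omega \\
  % Quadrature approximation at N sample frequencies
  P &\approx \sum_{k=1}^{N} w_k\, z_k z_k^{H}, \qquad z_k = K(j\omega_k)^{-1} B \\
  Q &\approx \sum_{k=1}^{N} w_k\, v_k v_k^{H}, \qquad v_k = K(j\omega_k)^{-H} C^{T}
\end{align}
```

Each sample costs one linear solve with $K(j\omega_k)$, which is the sense in which a frequency-domain approximation can be cheaper than solving Lyapunov-type equations directly.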

HKU Scholars Hub

A moment-matching scheme for the passivity-preserving model order reduction of indefinite descriptor systems with possible polynomial parts

Author: Daniel L
Wang Q
Wong N
Zhang Z
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Passivity-preserving model order reduction (MOR) of descriptor systems (DSs) is highly desired in the simulation of VLSI interconnects and on-chip passives. One popular method is PRIMA, a Krylov-subspace projection approach which preserves the passivity of positive semidefinite (PSD) structured DSs. However, system passivity is not guaranteed by PRIMA when the system is indefinite. Furthermore, the possible polynomial parts of singular systems are normally not captured. For indefinite DSs, positive-real balanced truncation (PRBT) can generate passive reduced-order models (ROMs), but its main bottleneck lies in solving the expensive dual generalized algebraic Riccati equations (GAREs). This paper presents a novel moment-matching MOR for indefinite DSs, which preserves both the system passivity and, if present, the improper polynomial part. The method requires solving only one GARE, so it is cheaper than existing PRBT schemes. Moreover, the proposed algorithm preserves the passivity of indefinite DSs, which is not guaranteed by traditional moment-matching MORs. Examples are presented showing that our method is superior to PRIMA in terms of accuracy. © 2011 IEEE. The 16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011), Yokohama, Japan, 25-28 January 2011. In Proceedings of the 16th ASP-DAC, 2011, p. 49-54, paper 1C-

DSpace@MIT

Crossref

HKU Scholars Hub

Scheduling and Fluid Routing for Flow-Based Microfluidic Laboratories-on-a-Chip

Author: Brisk Philip
Madsen Jan
McDaniel Jeffrey
Minhass Wajid Hassan
Pop Paul
Raagaard Michael Lander
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Microfluidic laboratories-on-a-chip (LoCs) are replacing conventional biochemical analyzers and are able to integrate the necessary functions for biochemical analysis on-chip. There are several types of LoCs, each having its advantages and limitations. In this paper we are interested in flow-based LoCs, in which a continuous flow of liquid is manipulated using integrated microvalves. By combining several microvalves, more complex units, such as micropumps, switches, mixers, and multiplexers, can be built. We assume that the architecture of the LoC is given, and we are interested in synthesizing an implementation, consisting of the binding of operations in the application to the functional units of the architecture, the scheduling of operations, and the routing and scheduling of the fluid flows, such that the application completion time is minimized. To solve this problem, we propose a list scheduling-based application mapping (LSAM) framework and evaluate it using real-life as well as synthetic benchmarks. When biochemical applications contain fluids that may adsorb on the substrate on which they are transported, the solution is to use rinsing operations for contamination avoidance. Hence, we also propose a rinsing heuristic, which has been integrated into the LSAM framework.
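The binding and scheduling part of the problem in this abstract can be illustrated with a generic list-scheduling heuristic, sketched below. It is not the LSAM framework: fluid routing and rinsing are ignored, the priority rule (longest remaining path) is an assumption, and the operation names, unit types, and durations are hypothetical. The dependency graph is assumed to be acyclic.

```python
def list_schedule(ops, deps, units):
    """Greedy list scheduling with binding to functional units.

    ops:   {op: (unit_type, duration)}
    deps:  {op: [predecessor ops]}   (must form a DAG)
    units: {unit_type: number of identical units}
    Returns ({op: (unit_name, start, finish)}, completion_time).
    """
    succs = {o: [] for o in ops}
    for o, preds in deps.items():
        for p in preds:
            succs[p].append(o)

    # Priority = length of the longest path from the operation to a sink.
    prio = {}
    def bottom_level(o):
        if o not in prio:
            prio[o] = ops[o][1] + max((bottom_level(s) for s in succs[o]), default=0)
        return prio[o]
    for o in ops:
        bottom_level(o)

    free_at = {t: [0] * n for t, n in units.items()}  # next free time per unit
    finish, schedule = {}, {}
    pending = set(ops)
    while pending:
        ready = [o for o in pending if all(p in finish for p in deps.get(o, []))]
        op = max(ready, key=lambda o: (prio[o], o))   # most urgent ready operation
        utype, dur = ops[op]
        slots = free_at[utype]
        u = min(range(len(slots)), key=slots.__getitem__)  # earliest free unit
        start = max([slots[u]] + [finish[p] for p in deps.get(op, [])])
        finish[op] = start + dur
        slots[u] = finish[op]
        schedule[op] = (f"{utype}{u}", start, finish[op])
        pending.remove(op)
    return schedule, max(finish.values())

# Hypothetical assay: two mixes, a heating step, and a final mix.
ops = {"mix1": ("mixer", 4), "mix2": ("mixer", 3), "heat": ("heater", 2), "mix3": ("mixer", 2)}
deps = {"heat": ["mix1"], "mix3": ["heat", "mix2"]}
schedule, makespan = list_schedule(ops, deps, {"mixer": 1, "heater": 1})  # makespan 9
```

With a single mixer the two initial mixes are serialized, so the final mix cannot start before time 7; adding a second mixer to `units` shortens the schedule.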

Crossref

eScholarship - University of California

Online Research Database In Technology

A Method for Increasing the Stability of an Arbiter-Type Physically Unclonable Function

Author: A. A. Ivaniuk
S. S. Zalivaka
V. P. Klybik
Publication venue: UIIP NASB
Publication date: 28/03/2017

The paper presents a reliability enhancement method for an arbiter physically unclonable function (A-PUF). The proposed technique keeps challenge-response generation time reasonable and adds no hardware overhead. The time difference of a test pulse's delay at the arbiter inputs is used as the basis for an extended A-PUF parametric model. The proposed approach has been verified on a real programmable logic device.
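To make the role of the delay difference concrete, the sketch below uses the widely cited additive linear delay model of an arbiter PUF and discards responses whose delay difference is too small to be reliable. This is not the parametric model or the method of the paper: the weights, the sign convention of the response, and the `guard` threshold are hypothetical.

```python
def features(challenge):
    """Parity features of the additive delay model.

    phi_i = product over j >= i of (1 - 2*c_j), followed by a constant bias term.
    """
    phi = []
    acc = 1
    for c in reversed(challenge):
        acc *= 1 - 2 * c
        phi.append(acc)
    phi.reverse()
    return phi + [1]

def delay_difference(weights, challenge):
    """Modelled arrival-time difference of the two paths at the arbiter inputs."""
    return sum(w * p for w, p in zip(weights, features(challenge)))

def response(weights, challenge, guard=0.0):
    """Return 1 or 0, or None when |difference| <= guard (treated as unstable)."""
    d = delay_difference(weights, challenge)
    if abs(d) <= guard:
        return None  # too close to the arbiter's decision boundary
    return 1 if d > 0 else 0

# Hypothetical 3-stage instance (weights in arbitrary time units).
weights = [0.5, -0.25, 0.125, 0.0625]
d = delay_difference(weights, [0, 1, 0])  # -0.0625
```

With `guard=0.1` the challenge `[0, 1, 0]` is rejected as unstable, while with `guard=0.0` it yields response 0.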

Informatics (E-Journal) / Информатика

Review of Display Technologies Focusing on Power Consumption

Author: González Alonso Ignacio
Rodríguez Fernández María
Zalama Casanova Eduardo
Publication venue: 'MDPI AG'
Publication date: 01/01/2015

This paper provides an overview of the main manufacturing technologies of displays, focusing on those with low and ultra-low levels of power consumption, which make them suitable for current societal needs. Considering the typified value obtained from the manufacturer’s specifications, four technologies (Liquid Crystal Displays, electronic paper, Organic Light-Emitting Displays and Electroluminescent Displays) were selected in a first iteration. For each of them, several features, including size and brightness, were assessed in order to ascertain possible proportional relationships with the rate of consumption. To normalize the comparison between different display types, relative units such as the surface power density and the display frontal intensity efficiency were proposed. Organic Light-Emitting Displays had the best results in terms of power density for small display sizes. For larger sizes, they perform less satisfactorily than Liquid Crystal Displays in terms of energy efficiency. Funding: Junta de Castilla y León (Programa de apoyo a proyectos de investigación, Ref. VA036U14 and Ref. VA013A12-2); Ministerio de Economía, Industria y Competitividad (Grant DPI2014-56500-R).
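A normalization of this kind can be sketched as power divided by active display area. The abstract does not define surface power density exactly, so the formula, the units (mW/cm²), the function names, and the sample figures below are assumptions for illustration only.

```python
import math

def display_area_cm2(diagonal_in, aspect=(16, 9)):
    """Active area in cm^2 from the diagonal (inches) and the aspect ratio."""
    w, h = aspect
    diag_cm = diagonal_in * 2.54
    scale = diag_cm / math.hypot(w, h)  # cm per aspect-ratio unit
    return (w * scale) * (h * scale)

def surface_power_density(power_mw, diagonal_in, aspect=(16, 9)):
    """Assumed definition: electrical power per unit of display area (mW/cm^2)."""
    return power_mw / display_area_cm2(diagonal_in, aspect)

# Hypothetical panel: 5-inch diagonal, 3:4 aspect ratio, drawing about 774 mW.
density = surface_power_density(774.192, 5, aspect=(3, 4))  # about 10 mW/cm^2
```

Dividing by area lets panels of different sizes be compared on one scale, which is the purpose the abstract gives for introducing relative units.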

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Repositorio Documental de la Universidad de Valladolid

Repositorio Institucional de la Universidad de Oviedo

Directory of Open Access Journals

Elastic circuits

Author: Carmona Vargas Josep
Cortadella Jordi
Kishinevsky Michael
Taubin Alexander
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

Elasticity in circuits and systems provides tolerance to variations in computation and communication delays. This paper presents a comprehensive overview of elastic circuits for those designers who are mainly familiar with synchronous design. Elasticity can be implemented both synchronously and asynchronously, although it was traditionally more often associated with asynchronous circuits. This paper shows that synchronous and asynchronous elastic circuits can be designed, analyzed, and optimized using similar techniques. Thus, choices between synchronous and asynchronous implementations are localized and deferred until late in the design process. Peer reviewed. Postprint (published version).
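The tolerance to delay variation mentioned in this abstract can be illustrated with a cycle-level model of a single elastic buffer using a valid/stop style handshake. This is a simplified sketch, not a design from the paper: the function name `simulate`, the capacity of 2, and the assumption that a slot freed in a cycle can be refilled in the same cycle are illustrative choices.

```python
from collections import deque

def simulate(tokens, consumer_ready, capacity=2):
    """Push tokens through one elastic buffer, one loop iteration per clock cycle.

    tokens:         data items the producer wants to send, in order
    consumer_ready: per-cycle booleans; False means the consumer asserts stop
    Returns the items the consumer received, in arrival order.
    """
    buf = deque()
    src = deque(tokens)
    out = []
    for ready in consumer_ready:
        # Output side: transfer when the buffer holds valid data and no stop.
        if buf and ready:
            out.append(buf.popleft())
        # Input side: the producer is stalled only while the buffer is full.
        if src and len(buf) < capacity:
            buf.append(src.popleft())
    return out

# The consumer stalls for two cycles; data still arrives complete and in order.
received = simulate([1, 2, 3], [True, False, False, True, True, True])  # [1, 2, 3]
```

The stall changes when each token arrives but not which tokens arrive or their order, which is the property that makes a design elastic.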

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Design methodology and productivity improvement in high speed VLSI circuits

Author: Hossain KM Mozammel
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2017

2017 Spring. Includes bibliographical references. To view the abstract, please see the full text of the document.

Mountain Scholar (Digital Collections of Colorado and Wyoming)