Using MCD-DVS for dynamic thermal management performance improvement

Author: Chaparro Pedro
González Colás Antonio María
González González José
Magklis Grigorios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

With chip temperature being a major hurdle in microprocessor design, techniques to recover the performance lost to thermal emergency mechanisms are crucial to sustaining performance growth. Many past techniques for power reduction, and more recent ones for thermal management, have helped alleviate this problem. Probably the most important thermal control technique is dynamic voltage and frequency scaling (DVS), which allows an almost cubic reduction in power with a worst-case performance penalty that is only linear. So far, DVS techniques for temperature control have been studied at the chip level. Finer-grain DVS is feasible if a globally-asynchronous locally-synchronous (GALS) design style is employed. GALS, also known as multiple-clock domain (MCD), allows independent voltage and frequency control for each of the clock domains that make up the chip. Several studies on DVS for GALS aim to improve energy and power efficiency, but not temperature. This paper proposes and analyses the use of DVS at the domain level to control temperature in a clustered MCD microarchitecture, with the goal of improving the performance of applications that do not meet the thermal constraints imposed by the designers. Peer Reviewed. Postprint (published version)
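The "almost cubic" power reduction mentioned in the abstract can be illustrated with the textbook CMOS dynamic power model, P = C * V^2 * f. The sketch below is not taken from the paper: it assumes an idealized model in which supply voltage scales proportionally with frequency, and the function names are hypothetical.

```python
def relative_dynamic_power(freq_scale: float) -> float:
    """Dynamic power relative to nominal when frequency is scaled by freq_scale.

    Assumption (idealized, not from the paper): P = C * V^2 * f, and the
    supply voltage can be lowered in proportion to the frequency.
    """
    voltage_scale = freq_scale          # assumed V proportional to f
    return voltage_scale ** 2 * freq_scale


def worst_case_relative_performance(freq_scale: float) -> float:
    """Worst case: a fully compute-bound program slows down linearly with f."""
    return freq_scale


if __name__ == "__main__":
    for s in (1.0, 0.9, 0.8, 0.5):
        print(f"f x{s:.1f}: power x{relative_dynamic_power(s):.3f}, "
              f"performance x{worst_case_relative_performance(s):.1f}")
```

Under these assumptions, running at 80% of the nominal frequency cuts dynamic power to roughly 51% of nominal while the worst-case performance drops only to 80%. Real designs fall short of the ideal cubic relation because voltage cannot be scaled arbitrarily and leakage power is not captured here.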

UPCommons. Portal del coneixement obert de la UPC

Architectural support for task dependence management with flexible software scheduling

Author: Beivide Palacio Ramon
Bosque Jose L.
Casas Marc
Castillo Emilio
Moreto Planas Miquel
Valero Cortés Mateo
Vallejo Enrique
Álvarez Martí Lluc
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

The growing complexity of multi-core architectures has motivated a wide range of software mechanisms to improve the orchestration of parallel executions. Task parallelism has become a very attractive approach thanks to its programmability, portability and potential for optimizations. However, with the expected increase in core counts, finer-grained tasking will be required to exploit the available parallelism, which will increase the overheads introduced by the runtime system. This work presents Task Dependence Manager (TDM), a hardware/software co-designed mechanism to mitigate runtime system overheads. TDM introduces a hardware unit, denoted Dependence Management Unit (DMU), and minimal ISA extensions that allow the runtime system to offload costly dependence tracking operations to the DMU while still performing task scheduling in software. With lower hardware cost, TDM outperforms hardware-based solutions and enhances the flexibility, adaptability and composability of the system. Results show that TDM improves performance by 12.3% and reduces EDP by 20.4% on average with respect to a software runtime system. Compared to a runtime system fully implemented in hardware, TDM achieves an average speedup of 4.2% with 7.3x lower area requirements and significant EDP reductions. In addition, five different software schedulers are evaluated with TDM, illustrating its flexibility and performance gains.

This work has been supported by the RoMoL ERC Advanced Grant (GA 321253), by the European HiPEAC Network of Excellence, by the Spanish Ministry of Science and Innovation (contracts TIN2015-65316-P, TIN2016-76635-C2-2-R and TIN2016-81840-REDT), by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), and by the European Union’s Horizon 2020 research and innovation programme under grant agreements No. 671697 and No. 671610. M. Moretó has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI-2012-15047. Peer Reviewed. Postprint (author's final draft)
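The split the abstract describes, with dependence tracking offloaded and scheduling kept in software, can be pictured with a small software model. This is only a sketch under assumptions: `DependenceTracker` is a hypothetical stand-in for the role of the DMU, the read/write address sets and the two scheduling policies are illustrative, and none of it reflects the actual ISA extensions or schedulers evaluated in the paper.

```python
from collections import defaultdict, deque


class DependenceTracker:
    """Toy model of dependence tracking (the job the abstract assigns to the DMU)."""

    def __init__(self):
        self.last_writer = {}                # address -> last task that wrote it
        self.readers = defaultdict(list)     # address -> readers since last write
        self.pending = {}                    # task -> number of unfinished predecessors
        self.successors = defaultdict(set)   # task -> tasks waiting on it
        self.finished = set()

    def add_task(self, task, reads=(), writes=()):
        """Register a task; return True if it is immediately ready."""
        preds = set()
        for addr in reads:                   # read-after-write
            if addr in self.last_writer:
                preds.add(self.last_writer[addr])
        for addr in writes:                  # write-after-write and write-after-read
            if addr in self.last_writer:
                preds.add(self.last_writer[addr])
            preds.update(self.readers[addr])
        preds -= self.finished               # finished tasks no longer block anyone
        for p in preds:
            self.successors[p].add(task)
        self.pending[task] = len(preds)
        for addr in reads:
            self.readers[addr].append(task)
        for addr in writes:
            self.last_writer[addr] = task
            self.readers[addr] = []
        return self.pending[task] == 0

    def finish_task(self, task):
        """Mark a task done; return the tasks that became ready."""
        self.finished.add(task)
        ready = []
        for succ in sorted(self.successors.pop(task, ())):
            self.pending[succ] -= 1
            if self.pending[succ] == 0:
                ready.append(succ)
        return ready


def run(tasks, lifo=False):
    """Software scheduler: the policy (FIFO or LIFO) is independent of the tracker."""
    tracker = DependenceTracker()
    ready = deque(name for name, r, w in tasks if tracker.add_task(name, r, w))
    order = []
    while ready:
        task = ready.pop() if lifo else ready.popleft()
        order.append(task)
        ready.extend(tracker.finish_task(task))
    return order


if __name__ == "__main__":
    tasks = [("A", [], ["x"]), ("B", ["x"], ["y"]),
             ("C", ["x"], ["z"]), ("D", ["y", "z"], ["x"])]
    print(run(tasks))             # FIFO policy
    print(run(tasks, lifo=True))  # LIFO policy, same dependence tracking
```

Swapping the policy inside `run` changes the execution order without touching the tracker, which is the kind of flexibility the abstract attributes to keeping scheduling in software.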

Crossref

UPCommons. Portal del coneixement obert de la UPC

Pixel Detectors

Author: Wermes Norbert
Publication date: 01/01/2005

Pixel detectors for precise particle tracking in high energy physics have reached a level of maturity during the past decade. Three of the LHC detectors will use vertex detectors close to the interaction point based on the hybrid pixel technology, which can be considered the state of the art in this field of instrumentation. A development period of almost 10 years has resulted in pixel detector modules which can withstand the extreme rate and timing requirements as well as the very harsh radiation environment at the LHC without severe compromises in performance. From these developments a number of different applications have spun off, most notably for biomedical imaging. Beyond hybrid pixels, a number of monolithic or semi-monolithic developments, which do not require complicated hybridization but come as single sensor/IC entities, have appeared and are currently being developed to greater maturity. Most advanced in terms of maturity are so-called CMOS active pixels and DEPFET pixels. The present state in the construction of the hybrid pixel detectors for the LHC experiments, together with some hybrid pixel detector spin-offs, is reviewed. In addition, new developments in monolithic or semi-monolithic pixel devices are summarized. Comment: 14 pages, 38 drawings/photographs in 21 figures

arXiv.org e-Print Archive

CERN Document Server

A RISC-V Vector Processor With Simultaneous-Switching Switched-Capacitor DC-DC Converters in 28 nm FDSOI

Author: Alon E
Asanović K
Avizienis R
Bailey S
Blagojević M
Chen PH
Chiu PF
Flatresse P
Jevtić R
Keller B
Kwak J
Le HP
Lee Y
Nikolić B
Puggelli A
Richards B
Sutardja N
Waterman A
Zimmer B
Publication venue: eScholarship, University of California
Publication date: 01/04/2016

This work demonstrates a RISC-V vector microprocessor implemented in 28 nm FDSOI with fully integrated simultaneous-switching switched-capacitor DC-DC (SC DC-DC) converters and adaptive clocking that generates four on-chip voltages between 0.45 and 1 V using only 1.0 V core and 1.8 V IO voltage inputs. The converters achieve high efficiency at the system level by switching simultaneously to avoid charge-sharing losses and by using an adaptive clock to maximize performance for the resulting voltage ripple. Details about the implementation of the DC-DC switches, DC-DC controller, and adaptive clock are provided, and the sources of conversion loss are analyzed based on measured results. This system pushes the capabilities of dynamic voltage scaling by enabling fast transitions (20 ns), simple packaging (no off-chip passives), low area overhead (16%), high conversion efficiency (80%-86%), and high energy efficiency (26.2 DP GFLOPS/W) for mobile devices.

eScholarship - University of California

Trends in Pixel Detectors: Tracking and Imaging

Author: Wermes Norbert
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

For large-scale applications, hybrid pixel detectors, in which sensor and read-out IC are separate entities, constitute the state of the art in pixel detector technology to date. They have been developed and are starting to be used as tracking detectors and also as imaging devices in radiography, autoradiography, protein crystallography and X-ray astronomy. A number of trends and possibilities for future applications in these fields, with improved performance, less material, high read-out speed, large radiation tolerance, and potential off-the-shelf availability, have appeared and are currently maturing. Among them are monolithic or semi-monolithic approaches which do not require complicated hybridization but come as single sensor/IC entities. Most of these are presently still in the development phase, waiting to be used as detectors in experiments. The present state of pixel detector development, including hybrid and (semi-)monolithic pixel techniques and their suitability for particle detection and for imaging, is reviewed. Comment: 10 pages, 15 figures, Invited Review given at IEEE2003, Portland, Oct, 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

Cache Equalizer: A Cache Pressure Aware Block Placement Scheme for Large-Scale Chip Multiprocessors

Author: Cho Sangyeun
Hammoud Mohammad
Melhem Rami
Publication venue: Department of Computer Science, University of Pittsburgh
Publication date: 01/01/2009

This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large-scale chip multiprocessors (CMPs). Our work is motivated by the large asymmetry in cache set usage. CE decouples the physical locations of cache blocks from their addresses in order to reduce misses caused by destructive interference. Temporal pressure at the on-chip last-level cache is continuously collected at the granularity of a group (a collection of cache sets) and periodically recorded at the memory controller to guide the placement process. An incoming block is consequently placed in the cache group that exhibits the minimum pressure. CE provides Quality of Service (QoS) by robustly offering better performance than the baseline shared NUCA cache. Simulation results using a full-system simulator demonstrate that CE outperforms shared NUCA caches by an average of 15.5% and by as much as 28.5% for the benchmark programs we examined. Furthermore, evaluations show that CE outperforms related CMP cache designs.
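The placement rule in the abstract (collect pressure per group, record it periodically, place an incoming block in the least-pressured group) can be sketched as follows. The abstract does not define how pressure is measured, so this sketch assumes a per-group miss count as a proxy; the class and method names are hypothetical and ties are broken by lowest group index purely for determinism.

```python
class PressureTable:
    """Toy model of pressure-guided block placement across cache groups."""

    def __init__(self, num_groups: int):
        self.current = [0] * num_groups    # pressure collected during this epoch
        self.recorded = [0] * num_groups   # snapshot held at the memory controller

    def record_miss(self, group: int) -> None:
        # Assumption: one miss adds one unit of pressure to the group.
        self.current[group] += 1

    def end_epoch(self) -> None:
        # Periodic recording: publish this epoch's pressure, then start over.
        self.recorded = self.current
        self.current = [0] * len(self.recorded)

    def choose_group(self) -> int:
        # Place the incoming block in the group with minimum recorded pressure.
        return min(range(len(self.recorded)), key=self.recorded.__getitem__)


if __name__ == "__main__":
    table = PressureTable(num_groups=4)
    for group in (0, 0, 1, 3):
        table.record_miss(group)
    table.end_epoch()
    print(table.choose_group())
```

Placement decisions use only the recorded snapshot, so pressure observed in the current epoch does not influence placement until the next periodic update.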

D-Scholarship@Pitt

A sub-mW IoT-endnode for always-on visual monitoring and smart triggering

Author: Benini Luca
Farella Elisabetta
Rossi Davide
Rusci Manuele
Publication date: 01/01/2017

This work presents a fully-programmable Internet of Things (IoT) visual sensing node that targets sub-mW power consumption in always-on monitoring scenarios. The system features a spatial-contrast 128x64 binary pixel imager with focal-plane processing. The sensor, when working at its lowest power mode (10 µW at 10 fps), provides as output the number of changed pixels. Based on this information, a dedicated camera interface, implemented on a low-power FPGA, wakes up an ultra-low-power parallel processing unit to extract context-aware visual information. We evaluate the smart sensor on three always-on visual triggering application scenarios. Triggering accuracy comparable to RGB image sensors is achieved at nominal lighting conditions, while consuming an average power between 193 µW and 277 µW, depending on context activity. The digital sub-system is extremely flexible, thanks to a fully-programmable digital signal processing engine, but still achieves 19x lower power consumption compared to MCU-based cameras with significantly lower on-board computing capabilities. Comment: 11 pages, 9 figures, submitted to IEEE IoT Journal
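The wake-up path described in the abstract, where the sensor reports a changed-pixel count and the camera interface decides whether to wake the processor, can be sketched in a few lines. The abstract does not state the actual trigger rule, so the fixed threshold used here is an assumption, and both function names are hypothetical.

```python
def count_changed_pixels(previous, current):
    """Count pixels that differ between two binary frames (lists of rows)."""
    return sum(
        p != c
        for prev_row, curr_row in zip(previous, current)
        for p, c in zip(prev_row, curr_row)
    )


def should_wake(changed_pixels: int, threshold: int) -> bool:
    """Assumed trigger rule: wake the processor once enough pixels have changed."""
    return changed_pixels >= threshold


if __name__ == "__main__":
    previous = [[0, 0, 1], [1, 0, 0]]
    current = [[0, 1, 1], [0, 0, 0]]
    changed = count_changed_pixels(previous, current)
    print(changed, should_wake(changed, threshold=2))
```

In the system described, the count is produced by the sensor itself at its lowest power mode; `count_changed_pixels` only emulates that output so the threshold decision can be exercised in software.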

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna