16 research outputs found

System-level Prototyping with HyperTransport

Author: Flanagan Kelly
Watson Myles
Publication date: 01/01/2011

The complexity of computer systems continues to increase. Emulating proposed subsystems is one way to manage this growing complexity when evaluating the performance of proposed architectures. HyperTransport allows researchers to connect FPGAs directly to microprocessors. This enables the emulation of novel memory hierarchies, non-volatile memory designs, coprocessors, and other architectural changes, combined with an existing system.

Heidelberger Dokumentenserver

Performance impact of a slower main memory: a case study of STT-MRAM in HPC

Author: Asifuzzaman Kazi
Kwon Ohseong
Pavlovic Milan
Radojković Petar
Radulović Milan
Ryoo Kyung-Chang
Zaragoza David
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/10/2016

In high-performance computing (HPC), significant effort is invested in the research and development of novel memory technologies. One of them is Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM), a byte-addressable, high-endurance non-volatile memory with slightly higher access time than DRAM. In this study, we conduct a preliminary assessment of the performance impact of STT-MRAM main memory on HPC systems, based on recent industry estimations. Reliable timing parameters of STT-MRAM devices are unavailable, so we also perform a sensitivity analysis that correlates the overall system slowdown trend with average device latency. Our results demonstrate that the overall system performance of large HPC clusters is not particularly sensitive to main-memory latency. Therefore, STT-MRAM, as well as any other emerging non-volatile memories with comparable density and access time, can be a viable option for future HPC memory system design.

This work was supported by the Collaboration Agreement between Samsung Electronics Co., Ltd. and BSC, Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project and by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272). This work has also received funding from the European Union's Horizon 2020 research and innovation programme under ExaNoDe project (grant agreement No 671578).

Peer Reviewed. Postprint (author's final draft)

UPCommons. Portal del coneixement obert de la UPC

NVRAM as an enabler to new horizons in graph processing

Author: Brown Nicholas
Bull Jonathan Mark
Capelli Ludovic Anthony Richard
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/07/2022

Edinburgh Research Explorer

Exposing the Locality of Heterogeneous Memory Architectures to HPC Applications

Author: Goglin Brice
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/10/2016

High-performance computing requires a deep knowledge of the hardware platform to fully exploit its computing power. The performance of data transfer between cores and memory is becoming critical. Therefore, locality is a major area of optimization on the road to exascale. Indeed, tasks and data have to be carefully distributed across the computing and memory resources. We discuss the current way to expose processor and memory locality information in the Linux kernel and in user-space libraries such as the hwloc software project. The current de facto standard of modeling the platform structure as a tree is not perfect, but it offers a good compromise between precision and convenience for HPC runtimes. We present an in-depth study of the software view of the upcoming Intel Knights Landing processor. Its memory locality cannot be properly exposed to user-space applications without a significant rework of the current software stack. We propose an extension of the current hierarchical platform model in hwloc. It correctly exposes new heterogeneous architectures with high-bandwidth or non-volatile memories to applications, while still being convenient for affinity-aware HPC runtimes.

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

An Early Evaluation of Intel’s Optane DC Persistent Memory Module and its Impact on High-Performance Scientific Applications

Author: Bonanni Antonino
Brunst Holger
Herold Christian
Iffrig Olivier
Jackson Adrian
Johnson Nick
Parsons Mark
Quintino Tiago
Smart Simon
Weiland Michele
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/11/2019

Crossref

Edinburgh Research Explorer

Proceedings of the Second International Workshop on HyperTransport Research and Applications (WHTRA2011)

Publication date: 01/01/2011

Proceedings of the Second International Workshop on HyperTransport Research and Applications (WHTRA2011), held on Feb. 9th, 2011 in Mannheim, Germany. The Second International Workshop for Research on HyperTransport is an international, high-quality forum for scientists, researchers, and developers working in the area of HyperTransport. This includes not only developments and research in HyperTransport itself, but also work that is based on or enabled by HyperTransport. HyperTransport (HT) is an interconnection technology that is typically used as the system interconnect in modern computer systems, connecting the CPUs with each other and with the I/O bridges. Primarily designed as an interconnect between high-performance CPUs, it provides extremely low latency, high bandwidth, and excellent scalability. The definition of the HTX connector allows the use of HT even for add-in cards. In contrast to other peripheral interconnect technologies such as PCI-Express, no protocol conversion or intermediate bridging is necessary. HT is a direct connection between device and CPU with minimal latency. Another advantage is the possibility of cache-coherent devices. Because of these properties, HT is of high interest for high-performance I/O such as networking and storage, but also for co-processing and acceleration based on ASIC or FPGA technologies. Acceleration in particular is seeing a resurgence of interest today. One reason is the possibility of reducing power consumption through the use of accelerators. In the area of parallel computing, the low-latency communication allows for fine-grained communication schemes and is well suited to scalable systems. Summing up, HT technology offers key advantages and great performance to any research aspect related to or based on interconnects. For more information, please consult the workshop website (http://whtra.uni-hd.de).

Heidelberger Dokumentenserver