Search CORE

564,835 research outputs found

How Multithreading Addresses the Memory Wall

Author: Machanick Philip
Publication venue
Publication date: 01/12/2002
Field of study

The memory wall is the predicted situation where improvements to processor speed will be masked by the much slower improvement in dynamic random access (DRAM) memory speed. Since the prediction was made in 1995, considerable progress has been made in addressing the memory wall. There have been advances in DRAM organization, improved approaches to memory hierarchy have been proposed, integrating DRAM onto the processor chip has been investigated and alternative approaches to organizing the instruction stream have been researched. All of these approaches contribute to reducing the predicted memory wall effect; some can potentially be combined. This paper reviews several approaches with a view to assessing the most promising option. Given the growing CPU-DRAM speed gap, any strategy which finds alternative work while waiting for DRAM is likely to be a win

University of Queensland eSpace

Kilo-instruction processors: overcoming the memory wall

Author: Cazorla Francisco
Cristal Kestelman Adrián
Galluzzi Marco
Pericas Miquel
Ramirez Garcia Tanausú
Santana Jaria Oliverio J.
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005
Field of study

Historically, advances in integrated circuit technology have driven improvements in processor microarchitecture and led to todays microprocessors with sophisticated pipelines operating at very high clock frequencies. However, performance improvements achievable by high-frequency microprocessors have become seriously limited by main-memory access latencies because main-memory speeds have improved at a much slower pace than microprocessor speeds. Its crucial to deal with this performance disparity, commonly known as the memory wall, to enable future high-frequency microprocessors to achieve their performance potential. To overcome the memory wall, we propose kilo-instruction processors-superscalar processors that can maintain a thousand or more simultaneous in-flight instructions. Doing so means designing key hardware structures so that the processor can satisfy the high resource requirements without significantly decreasing processor efficiency or increasing energy consumption.Peer ReviewedPostprint (published version

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

When parallel speedups hit the memory wall

Author: Eder Kerstin
Furtunato Alex F. A.
Georgiou Kyriakos
Xavier-de-Souza Samuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/05/2019
Field of study

After Amdahl's trailblazing work, many other authors proposed analytical speedup models but none have considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size, communication overhead, and synchronization overhead, but data-access delays are assumed to be constant. Nevertheless, such delays can vary, for example, according to the number of cores used and the ratio between processor and memory frequencies. Given the large number of possible configurations of operating frequency and number of cores that current architectures can offer, suitable speedup models to describe such variations among these configurations are quite desirable for off-line or on-line scheduling decisions. This work proposes new parallel speedup models that account for variations of the average data-access delay to describe the limiting effect of the memory wall on parallel speedups. Analytical results indicate that the proposed modeling can capture the desired behavior while experimental hardware results validate the former. Additionally, we show that when accounting for parameters that reflect the intrinsic characteristics of the applications, such as degree of parallelism and susceptibility to the memory wall, our proposal has significant advantages over machine-learning-based modeling. Moreover, besides being black-box modeling, our experiments show that conventional machine-learning modeling needs about one order of magnitude more measurements to reach the same level of accuracy achieved in our modeling.Comment: 24 page

arXiv.org e-Print Archive

Explore Bristol Research

Magnetoelectric domain wall dynamics and its implications for magnetoelectric memory

Author: Belashchenko K. D.
Kovalev Alexey A.
Tchernyshyov O.
Tretiakov O. A.
Publication venue: 'AIP Publishing'
Publication date: 01/01/2016
Field of study

Domain wall dynamics in a magnetoelectric antiferromagnet is analyzed, and its implications for magnetoelectric memory applications are discussed. Cr

_2

_3

is used in the estimates of the materials parameters. It is found that the domain wall mobility has a maximum as a function of the electric field due to the gyrotropic coupling induced by it. In Cr

_2

_3

the maximal mobility of 0.1 m/(s

\times

Oe) is reached at

E\approx0.06

V/nm. Fields of this order may be too weak to overcome the intrinsic depinning field, which is estimated for B-doped Cr

_2

_3

. These major drawbacks for device implementation can be overcome by applying a small in-plane shear strain, which blocks the domain wall precession. Domain wall mobility of about 0.7 m/(s

\times

Oe) can then be achieved at

E=0.2

V/nm. A split-gate scheme is proposed for the domain-wall controlled bit element; its extension to multiple-gate linear arrays can offer advantages in memory density, programmability, and logic functionality.Comment: 5 pages, 2 figures, revised and corrected version, accepted in Applied Physics Letter

arXiv.org e-Print Archive

Crossref

DigitalCommons@University of Nebraska