Active memory controller

Author: A Ailamaki
A Gottlieb
A Saulsbury
Ali Ibrahim
C Batten
C Cascaval
D Kim
D Patterson
DH Albonesi
DJ Sorin
DS Nikolopoulos
F Petrini
G Blelloch
G Marin
I Zotov
J Kuskin
J Laudon
J Torrellas
JB Brockman
JH Ahn
JM Mellor-Crummey
John B. Carter
K Keeton
KM Chandy
L Zhang
L Zhao
LA Barroso
Lixin Zhang
M Garzaran
M Hall
M Hao
M Oskin
Michael A. Parker
P Kogge
PA Boncz
R Kalla
RE Kessler
S Chatterjee
S Kumar
S Scott
Sally A. McKee
T Anderson
T von Eicken
V Tipparaju
Xiaowei Jiang
Y Solihin
Z Fang
Zhen Fang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

The inability to hide main memory latency increasingly limits the performance of modern processors. The problem is worse in large-scale shared memory systems, where remote memory latencies are hundreds, and soon thousands, of processor cycles. To mitigate this problem, we propose an intelligent memory and cache coherence controller (AMC) that can execute Active Memory Operations (AMOs). AMOs are select operations sent to and executed on the home memory controller of the data. AMOs can eliminate a significant number of coherence messages, minimize intranode and internode memory traffic, and create opportunities for parallelism. Our implementation of AMOs is cache-coherent and requires no changes to the processor core or DRAM chips. In this paper, we present the microarchitecture design of the AMC and the programming model of AMOs. We compare AMOs' performance to that of several other memory architectures on a variety of scientific and commercial benchmarks. Through simulation, we show that AMOs offer dramatic performance improvements for an important set of data-intensive operations, e.g., up to 50x faster barriers, 12x faster spinlocks, 8.5x-15x faster stream/array operations, and 3x faster database queries. We also present an analytical model that can predict the performance benefits of using AMOs with decent accuracy. The silicon cost required to support AMOs is less than 1% of the die area of a typical high performance processor, based on a standard cell implementation.
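To make the coherence-message argument in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's simulator, protocol, or analytical model. The class `ToyHome`, the function `count_messages`, and the per-step message costs (2 messages for a request and reply, 2 more to invalidate a previous owner) are all assumptions chosen for illustration. It counts the network messages needed for several processors to increment one shared counter, as a barrier would, first by migrating the cache line to each processor and then by sending the increment to the home controller.

```python
class ToyHome:
    """Hypothetical home node for one shared word; counts network messages."""

    def __init__(self):
        self.value = 0
        self.owner = None      # processor currently holding the line exclusively
        self.messages = 0      # total network messages observed

    def conventional_increment(self, proc):
        """Processor pulls the line into its cache, then updates it locally."""
        if self.owner != proc:
            self.messages += 2          # assumed: request + data reply
            if self.owner is not None:
                self.messages += 2      # assumed: invalidate + acknowledgement
            self.owner = proc
        self.value += 1

    def amo_increment(self, proc):
        """The update runs at the home controller; the line never migrates."""
        self.messages += 2              # assumed: AMO request + reply
        old = self.value
        self.value += 1
        return old


def count_messages(num_procs, use_amo):
    """Each processor increments the shared counter once; return (value, messages)."""
    home = ToyHome()
    for proc in range(num_procs):
        if use_amo:
            home.amo_increment(proc)
        else:
            home.conventional_increment(proc)
    return home.value, home.messages


if __name__ == "__main__":
    print("conventional:", count_messages(32, use_amo=False))
    print("AMO:         ", count_messages(32, use_amo=True))
```

Under these assumed costs the memory-side version needs roughly half the messages. The speedups quoted in the abstract come from the authors' simulations and cannot be derived from this toy count.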

Crossref

Chalmers Research

Active Memory Processor for Network-on-Chip Based Architecture

Author: Junhee Yoo
Kiyoung Choi
Yoo S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Memory-intensive operations and their memory access latency are often the performance bottleneck in parallel applications. In this paper, we investigate the concept of the active memory operation, an active data processing operation performed on the memory side. Using active memory operations, we can replace multiple memory-access transactions over the on-chip network, and the related computations on the processor side, with a smaller number of high-level transactions and computations on the memory side. To realize the concept, we have designed a special-purpose processor, called the active memory processor, which is tightly coupled with the memory and executes the active memory operations. In our case studies, we have applied the concept to five real-world applications (parallelized JPEG, FFT, text indexing for data mining, histogram, and eikonal equation solver) running on a 36-tile architecture with 64 cores and four memory tiles, and found that the proposed approach can improve performance by 20.5 to 259.3 percent.
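The transaction-replacement idea can be sketched with the histogram case study named in the abstract. The Python below is a hedged illustration only. The function names, the modulo binning rule, and the transaction costs (one network transaction per uncached load or store, two for a memory-side command and its completion) are assumptions, not the paper's design.

```python
def processor_side_histogram(data, num_bins):
    """Core-side loop: every load and store crosses the on-chip network."""
    bins = [0] * num_bins
    transactions = 0
    for x in data:
        transactions += 1        # assumed: load the input element
        b = x % num_bins         # hypothetical binning rule
        transactions += 2        # assumed: load the bin, then store it back
        bins[b] += 1
    return bins, transactions


def memory_side_histogram(data, num_bins):
    """One high-level command; the loop runs next to the memory."""
    transactions = 2             # assumed: command + completion notice
    bins = [0] * num_bins
    for x in data:
        bins[x % num_bins] += 1  # local to the memory tile, no network traffic
    return bins, transactions


if __name__ == "__main__":
    data = [0, 1, 2, 3, 4, 5, 6, 7]
    print(processor_side_histogram(data, 4))
    print(memory_side_histogram(data, 4))
```

Both functions produce the same bins. Only the count of network transactions differs, which is the effect the abstract attributes to moving work to the memory side.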

Pohang University of Science and Technology (POSTECH)