6 research outputs found

Wafer-Scale Fast Fourier Transforms

Author: Chetlur Sharan
Jacquelin Mathias
Orenes-Vera Marcelo
Schreiber Robert
Sharapov Ilya
Vandermersch Philippe
Publication date: 29/09/2022

We have implemented fast Fourier transforms for one-, two-, and three-dimensional arrays on the Cerebras CS-2, a system whose memory and processing elements reside on a single silicon wafer. The wafer-scale engine (WSE) encompasses a two-dimensional mesh of roughly 850,000 processing elements (PEs) with fast local memory and equally fast nearest-neighbor interconnections. Our wafer-scale FFT (wsFFT) parallelizes an n^3 problem with up to n^2 PEs. At that limit, a PE processes only a single vector of the 3D domain (known as a pencil) per superstep, where each of the three supersteps performs FFT along one of the three axes of the input array. Between supersteps, wsFFT redistributes (transposes) the data to bring all elements of each one-dimensional pencil being transformed into the memory of a single PE. Each redistribution causes an all-to-all communication along one of the mesh dimensions. Given the level of parallelism, the size of the messages transmitted between pairs of PEs can be as small as a single word. In theory, a mesh is not ideal for all-to-all communication due to its limited bisection bandwidth. However, the mesh interconnecting PEs on the WSE lies entirely on-wafer and achieves nearly peak bandwidth even with tiny messages. This high efficiency on fine-grain communication allows wsFFT to achieve unprecedented levels of parallelism and performance. We analyze in detail computation and communication time, as well as the weak and strong scaling, using both FP16 and FP32 precision. With 32-bit arithmetic on the CS-2, we achieve 959 microseconds for 3D FFT of a 512^3 complex input array using a 512x512 subgrid of the on-wafer PEs. This is the largest-ever parallelization for this problem size and the first implementation that breaks the millisecond barrier.
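The pencil scheme in this abstract (three supersteps separated by two redistributions) can be illustrated with a small serial simulation. The sketch below is not the wsFFT code and makes no claim about how the CS-2 implementation is written. It assumes a cubic n x n x n input with n a power of two, models each PE as one entry of an n x n nested list, and uses hypothetical helper names (fft1d, superstep, exchange_along_rows, exchange_along_cols, fft3d_pencils, dft3d_naive) chosen only for this example.

```python
import cmath


def fft1d(v):
    """Recursive radix-2 Cooley-Tukey FFT; len(v) must be a power of two."""
    n = len(v)
    if n == 1:
        return [complex(v[0])]
    even = fft1d(v[0::2])
    odd = fft1d(v[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def superstep(grid):
    """Every simulated PE (i, j) transforms the one pencil it holds."""
    return [[fft1d(pencil) for pencil in row] for row in grid]


def exchange_along_rows(grid):
    """All-to-all inside each mesh row i: PE (i, j) sends its word k to PE (i, k)."""
    n = len(grid)
    return [[[grid[i][k][j] for k in range(n)] for j in range(n)] for i in range(n)]


def exchange_along_cols(grid):
    """All-to-all inside each mesh column j: PE (i, j) sends its word k to PE (k, j)."""
    n = len(grid)
    return [[[grid[k][j][i] for k in range(n)] for j in range(n)] for i in range(n)]


def fft3d_pencils(a):
    """3D FFT of a[x][y][z] (an n*n*n nested list) on a simulated n x n mesh."""
    n = len(a)
    g = superstep(a)              # PE (x, y) holds a z-pencil: transform along z
    g = exchange_along_rows(g)    # now PE (x, kz) holds a y-pencil
    g = superstep(g)              # transform along y
    g = exchange_along_cols(g)    # now PE (ky, kz) holds an x-pencil
    g = superstep(g)              # transform along x
    # g is indexed [ky][kz][kx]; return the result in natural [kx][ky][kz] order.
    return [[[g[ky][kz][kx] for kz in range(n)] for ky in range(n)] for kx in range(n)]


def dft3d_naive(a):
    """Direct O(n^6) 3D DFT, used only to check the pencil version."""
    n = len(a)
    w = [cmath.exp(-2j * cmath.pi * m / n) for m in range(n)]
    r = range(n)
    return [[[sum(a[x][y][z] * w[(kx * x + ky * y + kz * z) % n]
                  for x in r for y in r for z in r)
              for kz in r] for ky in r] for kx in r]
```

In this model every word a PE sends during an exchange goes to a different PE in the same mesh row or column, which is the single-word message case the abstract mentions. Comparing fft3d_pencils with dft3d_naive on a small input (for example n = 4) is a quick way to confirm that the two exchanges route each pencil to the right place.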

arXiv.org e-Print Archive

Flynn’s Reconciliation

Author: Anzt Hartwig
Applegate David
Beckingsale David A.
Chetlur Sharan
Joyner David
Karrenberg Ralf
Kessenich John
Lattner Chris
Leißa Roland
Molina Alejandro
Pharr Matt
Shin Jaewook
Thuerck Daniel
Wang Endong
Publication venue: 'Association for Computing Machinery (ACM)'

ApproxHPVM: a portable compiler IR for accuracy-aware optimizations

Author: Chen Tianqi
Chetlur Sharan
Gulli Antonio
Howard Andrew G.
Krizhevsky Alex
LeCun Yann
Lin Darryl D.
Micikevicius Paulius
Misailovic Sasa
NVIDIA.
Rotem Nadav
Sakr Charbel
Sampson Adrian
Simonyan Karen
Stanley-Marbell Phillip
Team The XLA
Xu Ran
Publication venue: 'Association for Computing Machinery (ACM)'

Deep Learning for Computer Architects

Author: Abadi Martín
Bahdanau Dzmitry
Bellemare M. G.
Bergstra James
Brandon Reagen
Chen Wenlin
Chetlur Sharan
Chollet François
Ciresan Dan C.
Collobert Ronan
Courbariaux Matthieu
Craven Mark
David Brooks
Dean Jeffrey
Denil Misha
Diederik
Garofolo John S.
Glorot Xavier
Grosse R.
Gu-Yeon Wei
Gupta Suyog
Han Song
Hannun Awni
Hernández-Lobato José Miguel
Hinton Geoffrey E.
Hochreiter Sepp
Iandola Forrest N.
Ioffe Sergey
Joachims Thorsten
Karpathy Andrej
Lin Darryl Dexu
Lin Min
Lin Zhouhan
Minsky Marvin
Mnih Volodymyr
Ott Joachim
Paul Whatmough
Poultney Christopher
Reagen Brandon
Robert Adolf
Rosenblatt Frank
Simonyan Karen
Sindhwani Vikas
Snoek Jasper
Srivastava Nitish
Strobelt Hendrik
Sukhbaatar Sainbayar
Sung Wonyong
Sutskever Ilya
Wan Li
Werbos P. J.
Weston Jason
Publication venue: 'Morgan & Claypool Publishers LLC'

Analyzing Analytics

Author: (e StatSoft Inc.
Agrawal Rakesh
Algorithms
Analytics
Anderson David R.
Andrew
Andrew Ng.
Apt K. R.
Arnold Oliver
Barabasi Albert-Laszlo
Basic
Baskerville Kim
Berry Algorithms
Bhasker Bharat
Bhattacharya SAP
Blelloch Guy E.
Bob Blainey
Bordawekar Rajesh
Box George
Boyd Stephen
Breese J.S.
Breiman L.
Canny John
Chetlur Sharan
Chilimbi Trishul
Christian de Schryver Henning Marxen
Christos
Ciaccia P.
Collobert Ronan
Computation Centre Cork Constraint
Crosbie Peter
Davenport T.
Dechter Rina
Delivirias Christos
Dempster A. P.
Eckerson Wayne W.
Economist The
Falk Michael
Feldman Daniel J.
Ferrucci David
Fitch Blake
Fletcher Tristan
Forsyth Peter
Foundation COIN-OR
Fricker Regis
Fu
Glaser William T.
Good
Good N.
Goode Erica
Hall Weka
Han Growth
Han Jiawei
Hannun Awni Y.
Harary Frank
Hebb Donald O.
Henschel Robert
Hetherington Rick
Holder A.
Hsu Chih-Wei
IBM
IBM Corp.
IBM Institute for Business Value.
Idea Basic
Inc.
Intel Corp.
Issue Science Special
Joyce John
Kanjani [2007] Lee
Kanjani Khushboo
Kobayashi Masayoshi
Kohonen Teuvo
Kopser
Korn Ralf
Kotsiantis S. B.
Kriesel David
Krizhevsky Alex
Lahiri Tirthankar
Lahlou Hicham
Lakkaraju Himabindu
Landauer Thomas K.
LeCun Y.
Lee Daniel
Legler Thomas
Leskovec Jure
Lim QUEST
Liu Spill Trees
Luk Wayne
MacQueen J.
Madsen
Madsen Mark
Maltby Jim
Manning Christopher D.
Manyika James
McCallum Andrew
Melville P.
Melville Prem
Mikolov Tomas
Miner
Miner (Oracle Corp. [b] and SAP Legler et al.
Mishra Asit K.
Nanavati Amit A.
Neville Padraic G.
Newman M. E. J.
Ng
NIPS
Nisbet Robert
Niu Feng
Omohundro S. M.
Thompson Henry
Ouyang Jian
Ovtcharov Kalin
Packages
Papamanousakis Grigorios
Porter Mason A.
Quinlan J. Ross
Rajesh Bordawekar
Rexer Karl
Richter Yossi
Riedel Erik
Roesch M.
Ruchir Puri
Sahami M.
SAP Inc.
Sarle Warren S.
Savasere Ashok
Sawilowsky Shlomo S.
Schlegel
Schlegel Benjamin
Schrijver Alexander
Shih S.
Shmueli Galit
SQL Server (Microsoft Corp.
STATISTICA (e StatSoft Inc.
Strehl Alexander
Strohm Peter
Studnitzer Ari
Suman Ambika
Szegedy Christian
Szegedy Google
TICA (e StatSoft Inc.
Toint
Tsang Ivor W.
van den Berg Ewout
Vapnik Vladimir
Wang Jingdong
Week Business
World Wide Web Consortium
Wu Ren
Yi Youngmin
Zhang Yongpeng
Publication venue: 'Morgan & Claypool Publishers LLC'
