1,563 research outputs found
TDO-CIM: Transparent Detection and Offloading for Computation In-memory
Computation in-memory is a promising non-von Neumann approach aiming at
completely diminishing the data transfer to and from the memory subsystem.
Although a lot of architectures have been proposed, compiler support for such
architectures is still lagging behind. In this paper, we close this gap by
proposing an end-to-end compilation flow for in-memory computing based on the
LLVM compiler infrastructure. Starting from sequential code, our approach
automatically detects, optimizes, and offloads kernels suitable for in-memory
acceleration. We demonstrate our compiler tool-flow on the PolyBench/C
benchmark suite and evaluate the benefits of our proposed in-memory
architecture simulated in Gem5 by comparing it with a state-of-the-art von
Neumann architecture.Comment: Full version of DATE2020 publicatio
SPARTA: High-Level Synthesis of Parallel Multi-Threaded Accelerators
This paper presents a methodology for the Synthesis of PARallel multi-Threaded Accelerators (SPARTA) from OpenMP annotated C/C++ specifications. SPARTA extends an open-source HLS tool, enabling the generation of accelerators that provide latency tolerance for irregular memory accesses through multithreading, support fine-grained memory-level parallelism through a hot-potato deflection-based network-on-chip (NoC), support synchronization constructs, and can instantiate memory-side caches. Our approach is based on a custom runtime OpenMP library, providing flexibility and extensibility. Experimental results show high scalability when synthesizing irregular graph kernels. The accelerators generated with our approach are, on average, 2.29x faster than state-of-the-art HLS methodologies
- …