3 research outputs found

    A Synthesis Methodology for Application-Specific Logic-in-Memory Designs

    For deeply scaled digital integrated systems, the power required to transport data between memory and logic can exceed the power needed for computation, which limits the efficacy of synthesizing logic and compiling memory independently. Logic-in-Memory (LiM) architectures address this challenge by embedding logic within the memory block to perform basic operations on data locally for specific functions. While custom smart memories have been successfully constructed for various applications, a fully automated LiM synthesis flow enables architectural exploration that has not previously been possible. In this paper we present a tool and design methodology for LiM physical synthesis that performs co-design of algorithms and architectures to explore system-level trade-offs. The resulting layouts and timing models can be incorporated within any physical synthesis tool. Silicon results shown in this paper demonstrate a 250x performance improvement and 310x energy savings for a data-intensive application example.

    Accelerating Sparse Matrix-Matrix Multiplication with 3D-Stacked Logic-in-Memory Hardware

    This paper introduces a 3D-stacked logic-in-memory (LiM) system to accelerate the processing of sparse matrix data held in a 3D DRAM system. We build a customized content-addressable memory (CAM) hardware structure to exploit the inherent sparse data patterns, and we model the LiM-based hardware accelerator layers that are stacked between DRAM dies for efficient sparse matrix operations. Through-silicon vias (TSVs) provide the required high inter-layer bandwidth. Furthermore, we adapt the algorithm and data structure to fully leverage the underlying hardware capabilities, and we develop the design framework needed to facilitate design space evaluation and LiM hardware synthesis. Our simulation demonstrates more than two orders of magnitude of improvement in performance and energy efficiency compared with a traditional multithreaded software implementation on modern processors.