3 research outputs found

    Smart Memory Synthesis for Energy-Efficient Computed Tomography Reconstruction

    No full text
    <p>As nanoscale lithography challenges mandate greater pattern regularity and commonality for logic and memory circuits, new opportunities are created to affordably synthesize more powerful smart memory blocks for specific applications. Leveraging the ability to embed logic inside the memory block boundary, this paper demonstrates the synthesis of smart memory architectures that exploits the inherent memory address patterns of the back projection algorithm to enable efficient image reconstruction at minimum hardware overhead. An end-to-end design framework in sub-20nm CMOS technologies was constructed for the physical synthesis of smart memories and exploration of the huge design space. The experimental results show that customizing memory for the computerized tomography parallel back projection can achieve more than 30% area and power savings with marginal sacrifice of image accuracy.</p

    Accelerating Sparse Matrix-Matrix Multiplication with 3D-Stacked Logic-in-Memory Hardware

    No full text
    <p>This paper introduces a 3D-stacked logic-in-memory (LiM) system to accelerate the processing of sparse matrix data that is held in a 3D DRAM system. We build a customized content addressable memory (CAM) hardware structure to exploit the inherent sparse data patterns and model the LiM based hardware accelerator layers that are stacked in between DRAM dies for the efficient sparse matrix operations. Through silicon vias (TSVs) are used to provide the required high inter-layer bandwidth. Furthermore, we adapt the algorithm and data structure to fully leverage the underlying hardware capabilities, and develop the necessary design framework to facilitate the design space evaluation and LiM hardware synthesis. Our simulation demonstrates more than two orders of magnitude of performance and energy efficiency improvements compared with the traditional multithreaded software implementation on modern processors.</p

    Algorithm/Hardware Co-optimized SAR Image Reconstruction with 3D-stacked Logic in Memory

    No full text
    <p>Real-time system level implementations of complex Synthetic Aperture Radar (SAR) image reconstruction algorithms have always been challenging due to their data intensive characteristics. In this paper, we propose a basis vector transform based novel algorithm to alleviate the data intensity and a 3D-stacked logic in memory based hardware accelerator as the implementation platform. Experimental results indicate that this proposed algorithm/hardware co-optimized system can achieve an accuracy of 91 dB PSNR compared to a reference algorithm implemented in Matlab and energy efficiency of 72 GFLOPS/W for a 8k×8k SAR image reconstruction.</p
    corecore