
A Comparative Test of Melilotoides ruthenica Experimental Varieties

Author: Du Jiancai
Hu Huifang
Mao Xiaotao
Shi Wanguang
Wang Zhaolan
Zhang Yanyan
Zhao Lili
Zhu Feixue
Publication venue: UKnowledge
Publication date: 17/04/2021

University of Kentucky

Extension of NVNA Baseband Measurement for PA Characterization Under Complex Modulation

Author: Guo Xiaotao
He Zhao
Humphreys David
Wang Lifeng
Zhang Yichi
Zhang Zilong
Zhao Wei
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/02/2018

Crossref

Explore Bristol Research

DDC-PIM: Efficient Algorithm/Architecture Co-design for Doubling Data Capacity of SRAM-based Processing-In-Memory

Author: Duan Cenlin
He Xiaolin
He Ziyan
Jia Xiaotao
Pan Weitao
Qi Yingjie
Wang Xueyan
Wang Yikun
Wang Yiou
Yan Bonan
Yang Jianlei
Zhao Weisheng
Publication date: 31/10/2023

Processing-in-memory (PIM), as a novel computing paradigm, provides significant performance benefits from the aspect of effective data movement reduction. SRAM-based PIM has been demonstrated as one of the most promising candidates due to its endurance and compatibility. However, the integration density of SRAM-based PIM is much lower than other non-volatile memory-based ones, due to its inherent 6T structure for storing a single bit. Within comparable area constraints, SRAM-based PIM exhibits notably lower capacity. Thus, aiming to unleash its capacity potential, we propose DDC-PIM, an efficient algorithm/architecture co-design methodology that effectively doubles the equivalent data capacity. At the algorithmic level, we propose a filter-wise complementary correlation (FCC) algorithm to obtain a bitwise complementary pair. At the architecture level, we exploit the intrinsic cross-coupled structure of 6T SRAM to store the bitwise complementary pair in their complementary states ($Q/\overline{Q}$), thereby maximizing the data capacity of each SRAM cell. The dual-broadcast input structure and reconfigurable unit support both depthwise and pointwise convolution, adhering to the requirements of various neural networks. Evaluation results show that DDC-PIM yields about $2.84\times$ speedup on MobileNetV2 and $2.69\times$ on EfficientNet-B0 with negligible accuracy loss compared with PIM baseline implementation. Compared with state-of-the-art SRAM-based PIM macros, DDC-PIM achieves up to $8.41\times$ and $2.75\times$ improvement in weight density and area efficiency, respectively.

Comment: 14 pages, to be published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD
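The storage idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: the 8-bit width and the names `complement` and `ComplementaryRow` are assumptions made for illustration. It relies only on the fact that a 6T cell holds a bit at Q and its inverse at Q-bar, so a word written once can also be read back as its bitwise complement.

```python
BITS = 8  # assumed word width, chosen only for this illustration


def complement(word: int, bits: int = BITS) -> int:
    """Return the bitwise complement of an unsigned word of the given width."""
    return ~word & ((1 << bits) - 1)


class ComplementaryRow:
    """Hypothetical model of a row of 6T cells storing one word.

    Each cell holds a bit at Q and, by construction, its inverse at Q-bar,
    so a single stored word exposes two readable values.
    """

    def __init__(self, word: int, bits: int = BITS) -> None:
        if not 0 <= word < (1 << bits):
            raise ValueError("word does not fit in the given width")
        self.bits = bits
        self.q = word

    def read_q(self) -> int:
        """Read the word as stored on the Q side."""
        return self.q

    def read_q_bar(self) -> int:
        """Read the complementary word from the Q-bar side."""
        return complement(self.q, self.bits)


row = ComplementaryRow(0b10110010)
first, second = row.read_q(), row.read_q_bar()
```

In this model the pair `(first, second)` occupies the storage of a single word, which is the sense in which a bitwise complementary pair doubles the equivalent capacity. How such pairs are obtained from real filters is the role of the FCC algorithm, which this sketch does not attempt to reproduce.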

arXiv.org e-Print Archive