RED: A ReRAM-based Deconvolution Accelerator
Deconvolution is widely used in neural networks. For example, it is
essential for performing unsupervised learning in generative adversarial
networks and for constructing fully convolutional networks for semantic
segmentation. Resistive RAM (ReRAM)-based processing-in-memory architectures have
been widely explored for accelerating convolutional computation and demonstrate
good performance. Performing deconvolution on existing ReRAM-based accelerator
designs, however, suffers from long latency and high energy consumption, because
deconvolutional computation involves not only convolution but also extra add-on
operations.
To enable more efficient execution of deconvolution, we
analyze its computation requirements and propose a ReRAM-based accelerator
design, namely RED. More specifically, RED integrates two orthogonal methods: a
pixel-wise mapping scheme that reduces the redundancy caused by zero-insertion
operations, and a zero-skipping data flow that increases computation
parallelism and therefore improves performance. Experimental evaluations show
that, compared to state-of-the-art ReRAM-based accelerators, RED achieves a
1.15x~3.69x speedup and reduces energy consumption by 8%~88.36%.
Comment: 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)
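The redundancy the abstract attributes to zero-insertion can be seen in the naive formulation of deconvolution (transposed convolution): upsample the input by inserting zeros, then run an ordinary convolution, so most multiply-accumulates involve a zero operand. The sketch below illustrates this in 1-D with NumPy; it is an assumption-laden illustration of the general technique, not RED's pixel-wise mapping or zero-skipping data flow.

```python
import numpy as np

def deconv_via_zero_insertion(x, w, stride=2):
    """Transposed convolution expressed as zero-insertion + plain convolution.

    Illustrative only: this naive form performs many multiplications by the
    inserted zeros, which is exactly the redundancy that schemes like RED's
    (per the abstract) are designed to avoid.
    """
    n, k = x.shape[0], w.shape[0]
    # 1. Insert (stride - 1) zeros between neighbouring input pixels.
    up = np.zeros((n - 1) * stride + 1)
    up[::stride] = x
    # 2. Full convolution: pad by k-1 and slide the flipped kernel.
    up = np.pad(up, k - 1)
    return np.array([up[i:i + k] @ w[::-1] for i in range(len(up) - k + 1)])

# Example: input [1, 2], kernel [1, 1, 1], stride 2.
out = deconv_via_zero_insertion(np.array([1.0, 2.0]),
                                np.array([1.0, 1.0, 1.0]))
# → [1. 1. 3. 2. 2.]
```

Only 2 of the 5 upsampled entries are nonzero here, so roughly 60% of the kernel-window products are multiplications by zero; zero-skipping data flows exploit exactly this structure.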
Thermal Heating in ReRAM Crossbar Arrays: Challenges and Solutions
The increasing popularity of deep-learning-powered applications raises the issue
of the vulnerability of neural networks to adversarial attacks. In other words,
hardly perceptible changes in the input data lead to output errors in the neural
network, hindering its use in applications that involve security-critical
decisions. A number of previous works have already thoroughly evaluated
the most commonly used configuration, Convolutional Neural Networks (CNNs),
against different types of adversarial attacks. Moreover, recent works have
demonstrated the transferability of some adversarial examples across different
neural network models. This paper studies the robustness of newly emerging models
such as SpinalNet-based neural networks and Compact Convolutional Transformers
(CCT) on the CIFAR-10 image classification problem. Each architecture
was tested against four white-box attacks and three black-box attacks. Unlike the
VGG and SpinalNet models, the attention-based CCT configuration demonstrated a large
span between strong robustness and vulnerability to adversarial examples.
Finally, the transferability of attacks between VGG, VGG-inspired SpinalNet,
and pretrained CCT 7/3x1 models was studied. It was shown that even a highly
effective attack on a particular model does not guarantee transferability to
other models.
Comment: 18 pages
CPSAA: Accelerating Sparse Attention using Crossbar-based Processing-In-Memory Architecture
The attention mechanism requires huge computational effort, much of it spent on
unnecessary calculations, significantly limiting system performance.
Researchers have proposed sparse attention to convert some dense-dense matrix
multiplication (DDMM) operations into sampled dense-dense (SDDMM) and
sparse-dense (SpMM) operations. However, current sparse attention solutions
introduce massive off-chip random memory accesses. We propose CPSAA, a novel
crossbar-based processing-in-memory (PIM)-featured sparse attention accelerator.
First, we present a novel attention calculation mode. Second, we design a novel
PIM-based sparsity pruning architecture. Finally, we present novel
crossbar-based methods. Experimental results show that CPSAA achieves average
performance improvements of 89.6X, 32.2X, 17.8X, 3.39X, and 3.84X and energy
savings of 755.6X, 55.3X, 21.3X, 5.7X, and 4.9X compared with GPU, FPGA,
SANGER, ReBERT, and ReTransformer, respectively.
Comment: 14 pages, 19 figures
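The DDMM-to-SDDMM/SpMM conversion the abstract mentions can be sketched in NumPy: with a given sparsity mask, the score matrix Q·K^T is computed only at unmasked positions (SDDMM), and the resulting sparse probability matrix is multiplied with dense V (SpMM). This is a generic illustration of sparse attention, assuming the mask comes from some prior pruning step; it does not reproduce CPSAA's pruning architecture or dataflow.

```python
import numpy as np

def sparse_attention(Q, K, V, mask):
    """Sparse attention as SDDMM + softmax + SpMM (illustrative sketch).

    mask: boolean (seq, seq) sparsity pattern, assumed produced by a
    separate pruning step (not modeled here).
    """
    # SDDMM: evaluate Q @ K.T only at positions where mask is True.
    scores = np.full(mask.shape, -np.inf)
    rows, cols = np.nonzero(mask)
    scores[rows, cols] = np.einsum('ij,ij->i', Q[rows], K[cols])
    # Row-wise softmax over the surviving positions (exp(-inf) = 0).
    probs = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    # SpMM: sparse probability matrix times dense V.
    return probs @ V
```

With an all-True mask this reduces to ordinary dense attention, which makes the sketch easy to sanity-check; the savings come from skipping the masked-out dot products and multiplications, which on real hardware requires a sparse storage format rather than the dense arrays used here for clarity.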
Simulation and implementation of novel deep learning hardware architectures for resource constrained devices
Corey Lammie designed mixed-signal memristive-complementary metal–oxide–semiconductor (CMOS) and field-programmable gate array (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems during both inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.