Spartus: A 9.4 TOp/s FPGA-based LSTM Accelerator Exploiting Spatio-Temporal Sparsity
Long Short-Term Memory (LSTM) recurrent networks are frequently used for tasks involving time-sequential data such as speech recognition. Unlike previous LSTM accelerators that exploit either spatial weight sparsity or temporal activation sparsity, this paper proposes a new accelerator called "Spartus" that exploits spatio-temporal sparsity to achieve ultra-low-latency inference. Spatial sparsity is induced using a new Column-Balanced Targeted Dropout (CBTD) structured pruning method, which produces structured sparse weight matrices for balanced workloads. The pruned networks running on Spartus hardware achieve weight sparsity of up to 96% and 94%, with negligible accuracy loss, on the TIMIT and Librispeech datasets, respectively.
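The abstract gives no pseudocode for CBTD, but the balancing idea can be sketched. Below is a minimal NumPy illustration, assuming a one-shot magnitude prune at per-column granularity so every column (and hence every processing element reading it) keeps the same number of nonzeros; the actual CBTD method instead drops the low-magnitude weights stochastically during training, and the function name is hypothetical.

```python
import numpy as np

def column_balanced_prune(W: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the smallest-magnitude entries of each column of W so that every
    column keeps the same number of nonzeros (a hypothetical one-shot variant
    of Column-Balanced Targeted Dropout, which drops these weights
    stochastically during training instead)."""
    rows, _cols = W.shape
    keep = max(1, int(round(rows * (1.0 - sparsity))))  # nonzeros per column
    pruned = np.zeros_like(W)
    for c in range(W.shape[1]):
        # Indices of the `keep` largest-magnitude weights in column c.
        top = np.argsort(np.abs(W[:, c]))[-keep:]
        pruned[top, c] = W[top, c]
    return pruned

# Example: a 1024x1024 gate matrix pruned to 96% sparsity gives every
# column exactly the same nonzero count, i.e. a balanced workload.
W = np.random.randn(1024, 1024)
W_sparse = column_balanced_prune(W, sparsity=0.96)
assert np.count_nonzero(W_sparse) == 1024 * 41  # 41 nonzeros per column
```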
To induce temporal sparsity in the LSTM, we extend the previous DeltaGRU method to a DeltaLSTM method. Combining the spatial sparsity from CBTD with the temporal sparsity from DeltaLSTM saves weight memory accesses and the associated arithmetic operations. The Spartus architecture is scalable and supports real-time online speech recognition when implemented on small and large FPGAs. The Spartus per-sample latency for a single DeltaLSTM layer of 1024 neurons averages 1 µs. Exploiting spatio-temporal sparsity gives Spartus a 46X speedup over its theoretical hardware performance, achieving an effective batch-1 throughput of 9.4 TOp/s and a power efficiency of 1.1 TOp/s/W.
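For readers unfamiliar with delta networks, the sketch below illustrates the principle DeltaLSTM inherits from DeltaGRU: an input component is propagated only when it has changed by more than a threshold since its last propagated value, so the matching weight column, and its memory fetch, can be skipped. The function name, threshold value, and accumulator formulation are illustrative, not the paper's implementation.

```python
import numpy as np

THETA = 0.1  # delta threshold (assumed value)

def delta_matvec(W, x_t, x_prev, M_prev):
    """One delta-network step: only input components whose change since their
    last propagated value exceeds THETA contribute, so the corresponding
    columns of W need not be fetched or multiplied.

    W       -- weight matrix (out x in)
    x_t     -- current input vector
    x_prev  -- last *propagated* value of each input component
    M_prev  -- running pre-activation accumulator, M_t = M_{t-1} + W @ dx
    """
    dx = x_t - x_prev
    active = np.abs(dx) >= THETA             # components worth updating
    x_prop = np.where(active, x_t, x_prev)   # refresh propagated state
    # Sparse column gather: touch only the columns with an active delta.
    M_t = M_prev + W[:, active] @ dx[active]
    return M_t, x_prop

# Tiny usage example: a slowly varying input leaves most columns skipped.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
x_prev = np.zeros(16)
M = W @ x_prev                          # initial accumulator for x = 0
x_t = 0.05 * rng.standard_normal(16)    # small change: mostly below THETA
M, x_prev = delta_matvec(W, x_t, x_prev, M)
```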
Intrinsic sparse LSTM using structured targeted dropout for efficient hardware inference
Recurrent Neural Networks (RNNs) are useful for speech recognition, but their fully-connected structure leads to a large memory footprint, making them difficult to deploy on resource-constrained embedded systems. Previous structured RNN pruning methods can effectively reduce RNN size; however, they either struggle to balance high sparsity against high task accuracy, or the pruned models yield only moderate speedups on custom hardware accelerators. This work proposes a novel structured pruning method called Structured Targeted Dropout (STD)-Intrinsic Sparse Structures (ISS), which stochastically drops grouped rows and columns of the weight matrices during training. The compressed network is equivalent to a smaller dense network, which can be processed efficiently by Graphics Processing Units (GPUs). STD-ISS is evaluated on the TIMIT phone recognition task using Long Short-Term Memory (LSTM) RNNs. It outperforms previous state-of-the-art hardware-friendly methods on both accuracy and compression ratio. STD-ISS achieves a size compression ratio of up to 50× with <1% accuracy loss, leading to a 19.1× speedup on the embedded Jetson Xavier NX GPU platform.
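The property STD-ISS exploits is that an LSTM hidden unit owns one row in each of the four gate blocks of the input and recurrent weight matrices plus one column of the recurrent matrix, so dropping whole units collapses the pruned layer into a smaller dense LSTM that a GPU runs at full efficiency. A hypothetical NumPy sketch of that collapse, assuming the common stacked-gate weight layout, follows.

```python
import numpy as np

def shrink_lstm(W_ih, W_hh, b, keep):
    """Collapse a structurally pruned LSTM layer into a smaller dense one.

    Weights use the stacked-gate layout (4H rows for the i, f, g, o gates),
    as in common deep-learning frameworks. `keep` lists the surviving hidden
    units; every dropped unit removes its row in all four gate blocks of
    W_ih, W_hh, and b, plus its column in W_hh -- so the result is simply a
    dense LSTM of hidden size len(keep).
    """
    H = W_hh.shape[1]
    keep = np.asarray(keep)
    rows = np.concatenate([g * H + keep for g in range(4)])  # same units, all gates
    return W_ih[rows], W_hh[np.ix_(rows, keep)], b[rows]

# Example: drop half the units of a 512-unit layer -> dense 256-unit layer.
H, I = 512, 256
W_ih = np.random.randn(4 * H, I)
W_hh = np.random.randn(4 * H, H)
b = np.random.randn(4 * H)
keep = np.arange(0, H, 2)  # suppose STD kept the even-indexed units
W_ih2, W_hh2, b2 = shrink_lstm(W_ih, W_hh, b, keep)
assert W_hh2.shape == (4 * 256, 256)
```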
Directed diffraction without negative refraction
Using the FDTD method, we investigate electromagnetic propagation in two-dimensional photonic crystals formed by parallel air cylinders in a dielectric medium. The corresponding frequency band structure is computed using the standard plane-wave expansion method. It is shown that within partial bandgaps, waves tend to bend away from the forbidden directions. This phenomenon perhaps need not be explained in terms of negative refraction or 'superlensing' behavior, in contrast to what has been conjectured.
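FDTD itself is standard; as a rough sketch of the kind of simulation described (not the authors' code), the loop below advances the TM-polarized 2D Maxwell equations on a Yee grid with a permittivity map encoding a square lattice of air cylinders in a dielectric background. The grid resolution, lattice geometry, source, and (omitted) absorbing boundaries are all assumptions.

```python
import numpy as np

# Grid and material: air cylinders (eps0) in a dielectric background.
nx = ny = 200
dx = dy = 1e-8                        # 10 nm cells (assumed)
c0, eps0, mu0 = 299792458.0, 8.854e-12, 4e-7 * np.pi
dt = 0.5 * dx / (c0 * np.sqrt(2))     # Courant-stable time step in 2D

eps = np.full((nx, ny), 9.0 * eps0)   # dielectric background (assumed eps_r = 9)
X, Y = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
for cx in range(20, nx, 40):          # square lattice of air holes
    for cy in range(20, ny, 40):
        eps[(X - cx) ** 2 + (Y - cy) ** 2 < 8 ** 2] = eps0

Ez = np.zeros((nx, ny))
Hx = np.zeros((nx, ny - 1))
Hy = np.zeros((nx - 1, ny))

for n in range(1000):
    # Yee updates for the TM polarization (Ez, Hx, Hy).
    Hx -= dt / (mu0 * dy) * (Ez[:, 1:] - Ez[:, :-1])
    Hy += dt / (mu0 * dx) * (Ez[1:, :] - Ez[:-1, :])
    Ez[1:-1, 1:-1] += dt / eps[1:-1, 1:-1] * (
        (Hy[1:, 1:-1] - Hy[:-1, 1:-1]) / dx
        - (Hx[1:-1, 1:] - Hx[1:-1, :-1]) / dy
    )
    Ez[nx // 2, ny // 2] += np.sin(2 * np.pi * n / 60.0)  # soft point source
```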