Focused quantization for sparse CNNs
Deep convolutional neural networks (CNNs) are powerful tools for a wide range
of vision tasks, but the enormous amount of memory and compute resources
required by CNNs poses a challenge in deploying them on constrained devices.
Existing compression techniques, while excelling at reducing model sizes,
struggle to be computationally friendly. In this paper, we attend to the
statistical properties of sparse CNNs and present focused quantization, a novel
quantization strategy based on power-of-two values, which exploits the weight
distributions after fine-grained pruning. The proposed method dynamically
discovers the most effective numerical representation for weights in layers
with varying sparsities, significantly reducing model sizes. Multiplications in
quantized CNNs are replaced with much cheaper bit-shift operations for
efficient inference. Coupled with lossless encoding, we built a compression
pipeline that provides CNNs with high compression ratios (CR), low computation
cost and minimal loss in accuracy. In ResNet-50, we achieved an 18.08x CR with
only 0.24% loss in top-5 accuracy, outperforming existing compression methods.
We fully compressed a ResNet-18 and found that it is not only higher in CR and
top-5 accuracy, but also more hardware efficient as it requires fewer logic
gates to implement when compared to other state-of-the-art quantization methods
assuming the same throughput. This work is supported in part by the National Key R&D Program of China (No. 2018YFB1004804) and the National Natural Science Foundation of China (No. 61806192). We thank EPSRC for providing Yiren Zhao his doctoral scholarship.
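The replacement of multiplications by bit-shifts mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' focused quantization, which adapts the representation to each layer's weight distribution. It only shows plain nearest power-of-two rounding, assuming integer activations. The function names `quantize_pow2` and `shift_multiply` and the exponent range are hypothetical choices made for illustration.

```python
import math

def quantize_pow2(w, min_exp=-8, max_exp=0):
    """Round a weight to the nearest signed power of two (in the log domain).

    Returns (sign, exp) so that the quantized weight is sign * 2**exp.
    A pruned weight (exactly 0.0) is returned as (0, 0).
    The exponent range is an illustrative assumption, not taken from the paper.
    """
    if w == 0.0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))  # clamp to the representable range
    return sign, exp

def shift_multiply(x, sign, exp):
    """Multiply an integer activation x by sign * 2**exp using shifts only."""
    if sign == 0:
        return 0  # pruned weight contributes nothing
    shifted = x << exp if exp >= 0 else x >> -exp
    return sign * shifted

# Example: the weight 0.3 rounds to 2**-2 = 0.25, so 64 * 0.25 becomes 64 >> 2.
sign, exp = quantize_pow2(0.3)
result = shift_multiply(64, sign, exp)
```

Storing only a sign and a small exponent per nonzero weight is what makes such a representation compact, and the shift is cheaper in hardware than a full multiplier.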
Thermal Effects and Small Signal Modulation of 1.3-μm InAs/GaAs Self-Assembled Quantum-Dot Lasers
We investigate the influence of thermal effects on the high-speed performance of 1.3-μm InAs/GaAs quantum-dot lasers in a wide temperature range (5–50°C). Ridge waveguide devices with 1.1 mm cavity length exhibit small signal modulation bandwidths of 7.51 GHz at 5°C and 3.98 GHz at 50°C. Temperature-dependent K-factor, differential gain, and gain compression factor are studied. While the intrinsic damping-limited modulation bandwidth is as high as 23 GHz, the actual modulation bandwidth is limited by carrier thermalization under continuous wave operation. Saturation of the resonance frequency was found to be the result of thermal reduction in the differential gain, which may originate from carrier thermalization
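As a rough illustration of how a K-factor relates to a damping-limited bandwidth, the sketch below uses the standard textbook relation f_max = 2*sqrt(2)*pi / K. That the authors used exactly this relation is an assumption, and the helper name `k_factor_from_max_bandwidth` is hypothetical. Only the 23 GHz figure comes from the abstract.

```python
import math

def k_factor_from_max_bandwidth(f_max_hz):
    """Return the K-factor (in seconds) implied by a damping-limited
    3 dB bandwidth, using f_max = 2*sqrt(2)*pi / K (textbook relation)."""
    return 2.0 * math.sqrt(2.0) * math.pi / f_max_hz

# 23 GHz is the intrinsic damping-limited bandwidth quoted in the abstract.
k_seconds = k_factor_from_max_bandwidth(23e9)
k_ns = k_seconds * 1e9  # convert to nanoseconds, the usual unit for K
```

Under this assumption the quoted bandwidth corresponds to a K-factor of roughly 0.39 ns. The measured bandwidths of 7.51 GHz and 3.98 GHz lie well below the damping limit, consistent with the abstract's statement that thermal effects, not damping, set the practical limit.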
An environmentally friendly solution-processed ZrLaO gate dielectric for large-area applications in the harsh radiation environment
In this work, an eco-friendly aqueous solution-processed ZrLaO dielectric is demonstrated for large-area applications in harsh radiation environments. Appropriate La doping (10% La) into ZrOx could suppress the formation of Vo and improve the InOx/ZrLaO interface. The Zr0.9La0.1Oy thin films remained stable under 144 krad (SiO2) gamma-ray irradiation; no distinct composition variation or property degradation was observed. The resistor-loaded inverter based on an InOx/Zr0.9La0.1Oy TFT demonstrated full-swing characteristics with a gain of 13.3 at 4 V and retained 91% of its gain after 103 krad (SiO2) irradiation.
Automatic generation of multi-precision multi-arithmetic CNN accelerators for FPGAs
Modern deep Convolutional Neural Networks (CNNs) are computationally
demanding, yet real applications often require high throughput and low latency.
To help tackle these problems, we propose Tomato, a framework designed to
automate the process of generating efficient CNN accelerators. The generated
design is pipelined and each convolution layer uses different arithmetics at
various precisions. Using Tomato, we showcase state-of-the-art multi-precision
multi-arithmetic networks, including MobileNet-V1, running on FPGAs. To our
knowledge, this is the first multi-precision multi-arithmetic auto-generation
framework for CNNs. In software, Tomato fine-tunes pretrained networks to use a
mixture of short powers-of-2 and fixed-point weights with a minimal loss in
classification accuracy. The fine-tuned parameters are combined with the
templated hardware designs to automatically produce efficient inference
circuits in FPGAs. We demonstrate how our approach significantly reduces model
sizes and computation complexities, and permits us to pack a complete ImageNet
network onto a single FPGA without accessing off-chip memories for the first
time. Furthermore, we show how Tomato produces implementations of networks with
various sizes running on single or multiple FPGAs. To the best of our
knowledge, our automatically generated accelerators outperform the closest
FPGA-based competitors by at least 2-4x in latency and throughput; the
generated accelerator runs ImageNet classification at a rate of more than 3000
frames per second.
Funding: EPSRC Doctoral Scholarship; Peterhouse Graduate Studentship
Synthesis of chiral zinc porphyrin and its thermodynamic study of coordination reactions with substituted imidazoles
Improved ground-state modulation characteristics in 1.3 μm InAs/GaAs quantum dot lasers by rapid thermal annealing
We investigated the ground-state (GS) modulation characteristics of 1.3 μm InAs/GaAs quantum dot (QD) lasers that consist of either as-grown or annealed QDs. The choice of annealing conditions was determined from our recently reported results. With reference to the as-grown QD lasers, one obtains approximately 18% improvement in the modulation bandwidth from the annealed QD lasers. In addition, the modulation efficiency of the annealed QD lasers improves by approximately 45% as compared to the as-grown ones. The observed improvements are due to (1) the removal of defects which act as nonradiative recombination centers in the QD structure and (2) the reduction in the Auger-related recombination processes upon annealing