1,837 research outputs found

    HAQ: Hardware-Aware Automated Quantization with Mixed Precision

    Full text link
    Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emerging DNN hardware accelerators have begun to support mixed precision (1-8 bits) to further improve computational efficiency, which raises a great challenge: finding the optimal bitwidth for each layer requires domain experts to explore a vast design space, trading off accuracy, latency, energy, and model size, which is both time-consuming and sub-optimal. Conventional quantization algorithms ignore differences between hardware architectures and quantize all layers in a uniform way. In this paper, we introduce the Hardware-Aware Automated Quantization (HAQ) framework, which leverages reinforcement learning to automatically determine the quantization policy, taking the hardware accelerator's feedback into the design loop. Rather than relying on proxy signals such as FLOPs and model size, we employ a hardware simulator to generate direct feedback signals (latency and energy) for the RL agent. Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network and hardware architectures. It effectively reduces latency by 1.4-1.95x and energy consumption by 1.9x with negligible loss of accuracy compared with fixed-bitwidth (8-bit) quantization. Our framework reveals that the optimal policies on different hardware architectures (i.e., edge and cloud architectures) under different resource constraints (i.e., latency, energy, and model size) are drastically different. We interpret the implications of the different quantization policies, which offer insights for both neural network architecture design and hardware architecture design.
    Comment: CVPR 2019. The first three authors contributed equally to this work. Project page: https://hanlab.mit.edu/projects/haq
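
    As a minimal sketch of the idea, the loop below searches per-layer bitwidths against a toy latency model. It only stands in for HAQ's actual components: the paper uses a DDPG reinforcement-learning agent and a cycle-accurate hardware simulator, whereas here the random-search "agent", the layer MAC counts, the latency budget, and the accuracy model are all made-up assumptions for illustration.

        import random

        # Assumed per-layer MAC counts (millions) and latency budget -- illustrative only.
        LAYERS = [1.0, 4.0, 4.0, 2.0, 0.5]
        LATENCY_BUDGET = 20.0

        def simulated_latency(bitwidths):
            # Toy hardware model: cost grows with MACs and chosen precision.
            return sum(macs * b / 8.0 for macs, b in zip(LAYERS, bitwidths))

        def proxy_accuracy(bitwidths):
            # Toy accuracy model: lower precision hurts, earlier layers more sensitive.
            penalty = sum((8 - b) * 0.4 / (i + 1) for i, b in enumerate(bitwidths))
            return 76.0 - penalty  # assume 76 % top-1 at uniform 8 bits

        best_policy, best_acc = None, -1.0
        for episode in range(2000):
            policy = [random.randint(1, 8) for _ in LAYERS]  # action: bitwidth per layer
            if simulated_latency(policy) > LATENCY_BUDGET:
                continue  # hardware constraint violated, no reward
            acc = proxy_accuracy(policy)  # direct feedback instead of a FLOPs proxy
            if acc > best_acc:
                best_policy, best_acc = policy, acc

        print(f"best policy {best_policy}: acc ~{best_acc:.2f} %, "
              f"latency ~{simulated_latency(best_policy):.1f}")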

    Hardware-Centric AutoML for Mixed-Precision Quantization

    Full text link
    Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emerging DNN hardware accelerators have begun to support mixed precision (1-8 bits) to further improve computational efficiency, which raises a great challenge: finding the optimal bitwidth for each layer requires domain experts to explore a vast design space, trading off accuracy, latency, energy, and model size, which is both time-consuming and sub-optimal. Conventional quantization algorithms ignore differences between hardware architectures and quantize all layers in a uniform way. In this paper, we introduce the Hardware-Aware Automated Quantization (HAQ) framework, which leverages reinforcement learning to automatically determine the quantization policy, taking the hardware accelerator's feedback into the design loop. Rather than relying on proxy signals such as FLOPs and model size, we employ a hardware simulator to generate direct feedback signals (latency and energy) for the RL agent. Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network and hardware architectures. It effectively reduces latency by 1.4-1.95x and energy consumption by 1.9x with negligible loss of accuracy compared with fixed-bitwidth (8-bit) quantization. Our framework reveals that the optimal policies on different hardware architectures (i.e., edge and cloud architectures) under different resource constraints (i.e., latency, energy, and model size) are drastically different. We interpret the implications of the different quantization policies, which offer insights for both neural network architecture design and hardware architecture design.
    Comment: Journal preprint of arXiv:1811.08886 (IJCV, 2020). The first three authors contributed equally to this work. Project page: https://hanlab.mit.edu/projects/haq
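
    The 1-8 bit per-layer quantization that both HAQ abstracts refer to can be illustrated with a common symmetric linear ("fake") quantizer. This is a generic sketch, not necessarily the exact quantization function HAQ applies.

        import numpy as np

        def linear_quantize(w, bits):
            # Symmetric linear quantization of a weight array to a given bitwidth.
            qmax = 2 ** (bits - 1) - 1               # e.g. 127 for 8 bits
            scale = np.abs(w).max() / qmax or 1.0    # map the largest weight to qmax
            q = np.clip(np.round(w / scale), -qmax, qmax)
            return q * scale                         # dequantized weights

        w = np.random.default_rng(0).normal(size=1000).astype(np.float32)
        for bits in (8, 4, 2):
            err = np.abs(w - linear_quantize(w, bits)).mean()
            print(f"{bits}-bit: mean abs error {err:.4f}")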

    Antioxidant and Cytotoxic Activity of Extracts of Milk Produced from Fermented Black Soybean

    Get PDF
    In this study, ethanol extracts of black soybean milk fermented for 2 days (FBE) by immobilized Rhizopus oligosporus NTU5 were evaluated for both antioxidant and cytotoxic activities. The results reveal that the 2-day FBE had a strong 2,2-diphenyl-1-picrylhydrazyl (DPPH) scavenging effect (76 %). The extracts were further fractionated by silica gel column chromatography, yielding an unknown compound, FBE5-A, which exhibited strong antioxidant activity. The IC50 of the DPPH scavenging effect of FBE5-A was 7.5 μg/mL, stronger than that of a commonly used antioxidant, vitamin E (α-tocopherol; 17.4 μg/mL), and similar to that of vitamin C (ascorbic acid; 7.6 μg/mL). The cytotoxicity test demonstrated that extracts of the 2-day fermented broth exhibited selective cytotoxic activity towards human carcinoma cells, Hep 3B (IC50=150.2 μg/mL), and did not affect normal human lung fibroblasts, MRC-5 (p<0.05). The results indicate the potential of fermented black soybean milk for applications as a functional food, pharmaceutical, or cancer therapy formulation.
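
    The comparisons above hinge on IC50 values, the concentration giving 50 % DPPH scavenging. As a rough sketch of how such a value is typically read off a dose-response curve, the snippet below interpolates on log-concentration; the data points are invented for illustration and are not from the paper.

        import numpy as np

        # Hypothetical dose-response data: concentration (ug/mL) vs. % DPPH scavenging.
        conc = np.array([1.0, 2.5, 5.0, 10.0, 20.0])
        scav = np.array([12.0, 28.0, 41.0, 58.0, 74.0])

        # Scavenging is usually computed as (A_control - A_sample) / A_control * 100
        # from absorbance readings; here we start from the percentages directly.

        log_ic50 = np.interp(50.0, scav, np.log10(conc))  # linear in log-dose
        print(f"estimated IC50 ~ {10 ** log_ic50:.1f} ug/mL")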

    Poly[(μ6-benzene-1,3,5-tricarboxylato-κ6O1:O1′:O3:O3′:O5:O5′)tris(N,N-dimethylformamide-κO)tris(μ3-formato-κ2O:O′)trimagnesium(II)]

    Get PDF
    The title complex, [Mg3(CHO2)3(C9H3O6)(C3H7NO)3]n, exhibits a two-dimensional structure parallel to (001), which is built up from the MgII atoms and bridging carboxylate ligands (3 symmetry). The MgII atom is six-coordinated by one O atom from a dimethylformamide molecule, two O atoms from two μ6-benzene-1,3,5-tricarboxylate ligands and three O atoms from three μ3-formate ligands in a distorted octahedral geometry.

    Poly[diaqua-μ4-biphenyl-4,4′-dicarboxylato-magnesium(II)]

    Get PDF
    The solvothermal reaction of magnesium nitrate with biphenyl-4,4′-dicarboxylic acid in N,N-dimethylformamide and water leads to the formation of crystals of the title complex, [Mg(C14H8O4)(H2O)2]n. In the crystal structure, the Mg cations are coordinated by six O atoms from two water molecules and four symmetry-related biphenyl-4,4′-dicarboxylate anions within slightly distorted octahedra. The Mg cations are located on a center of inversion, the biphenyl-4,4′-dicarboxylate anions around a twofold rotation axis, and the water molecule in a general position. The Mg cations are linked by the anions into a three-dimensional framework.

    APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

    Full text link
    We present APQ for efficient deep learning inference on resource-constrained hardware. Unlike previous methods that search the neural architecture, pruning policy, and quantization policy separately, we optimize them jointly. To deal with the larger design space this brings, a promising approach is to train a quantization-aware accuracy predictor that quickly estimates the accuracy of a quantized model and feeds it to the search engine to select the best fit. However, training this quantization-aware accuracy predictor requires collecting a large number of quantized model-accuracy pairs, which involves quantization-aware finetuning and is therefore highly time-consuming. To tackle this challenge, we propose to transfer the knowledge from a full-precision (i.e., fp32) accuracy predictor to the quantization-aware (i.e., int8) accuracy predictor, which greatly improves sample efficiency. Moreover, collecting the dataset for the fp32 accuracy predictor only requires evaluating neural networks, without any training cost, by sampling from a pretrained once-for-all network, which is highly efficient. Extensive experiments on ImageNet demonstrate the benefits of our joint optimization approach. At the same accuracy, APQ reduces latency/energy by 2x/1.3x over MobileNetV2+HAQ. Compared to the separate optimization approach (ProxylessNAS+AMC+HAQ), APQ achieves 2.3% higher ImageNet accuracy while reducing GPU hours and CO2 emissions by orders of magnitude, pushing the frontier of green, environmentally friendly AI. The code and video are publicly available.
    Comment: Accepted by CVPR 2020
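
    The sample-efficiency argument (pretrain a cheap full-precision accuracy predictor, then transfer it to the quantization-aware setting) can be sketched with a toy regression. APQ itself transfers the weights of a neural-network predictor trained on architectures sampled from a once-for-all network; the ridge-regression stand-in and the synthetic data below are assumptions made purely to illustrate the transfer idea.

        import numpy as np

        rng = np.random.default_rng(1)

        def ridge_fit(X, y, w0=None, lam=1e-2):
            # Closed-form ridge regression; if w0 is given, regularize toward it
            # (a crude stand-in for initializing the int8 predictor from fp32).
            d = X.shape[1]
            w0 = np.zeros(d) if w0 is None else w0
            return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * w0)

        # Cheap fp32 pairs: many (architecture features -> accuracy) samples.
        X_fp, w_true = rng.normal(size=(500, 8)), rng.normal(size=8)
        y_fp = X_fp @ w_true + rng.normal(scale=0.1, size=500)

        # Expensive quantized pairs: only a few samples, slightly shifted relationship.
        X_q = rng.normal(size=(20, 8))
        y_q = X_q @ (0.9 * w_true) + rng.normal(scale=0.1, size=20)

        w_fp32 = ridge_fit(X_fp, y_fp)                    # pretrain on cheap data
        w_tr = ridge_fit(X_q, y_q, w0=w_fp32, lam=1.0)    # transfer, then finetune
        w_sc = ridge_fit(X_q, y_q)                        # baseline: from scratch

        X_te = rng.normal(size=(200, 8))
        y_te = X_te @ (0.9 * w_true)
        for name, w in (("transfer", w_tr), ("scratch", w_sc)):
            print(name, "test MSE:", round(float(np.mean((X_te @ w - y_te) ** 2)), 4))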