HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Model quantization is a widely used technique to compress and accelerate deep
neural network (DNN) inference. Emergent DNN hardware accelerators begin to
support mixed precision (1-8 bits) to further improve the computation
efficiency. This raises a great challenge: finding the optimal bitwidth for
each layer requires domain experts to explore a vast design space, trading
off accuracy, latency, energy, and model size, which is both
time-consuming and sub-optimal. Conventional quantization algorithms ignore the
different hardware architectures and quantize all the layers in a uniform way.
In this paper, we introduce the Hardware-Aware Automated Quantization (HAQ)
framework, which leverages reinforcement learning to automatically determine
the quantization policy while taking the hardware accelerator's feedback into the
design loop. Rather than relying on proxy signals such as FLOPs and model size,
we employ a hardware simulator to generate direct feedback signals (latency and
energy) to the RL agent. Compared with conventional methods, our framework is
fully automated and can specialize the quantization policy for different neural
network architectures and hardware architectures. Our framework effectively
reduces latency by 1.4-1.95x and energy consumption by 1.9x with
negligible loss of accuracy compared with fixed-bitwidth (8-bit)
quantization. Our framework reveals that the optimal policies on different
hardware architectures (i.e., edge and cloud architectures) under different
resource constraints (i.e., latency, energy, and model size) are drastically
different. We interpret the implications of the different quantization policies,
which offer insights for both neural network architecture design and hardware
architecture design.
Comment: CVPR 2019. The first three authors contributed equally to this work.
Project page: https://hanlab.mit.edu/projects/haq
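The loop described in the abstract — an agent proposes per-layer bitwidths, the hardware simulator returns a direct latency/energy signal, and policies that meet the constraint are ranked by accuracy — can be sketched minimally. HAQ itself uses a DDPG agent and a real hardware simulator; the sketch below substitutes random search, a toy linear latency model, and a toy accuracy proxy, all of which are illustrative assumptions rather than the paper's implementation:

```python
import random

LAYERS = 8          # toy network depth (assumption, not from the paper)
BITS = range(1, 9)  # mixed precision 1-8 bits, as in the abstract

def simulated_latency(policy):
    # Hypothetical stand-in for the hardware simulator: latency grows
    # linearly with bitwidth. A real simulator models the accelerator.
    return sum(b * 0.1 for b in policy)

def proxy_accuracy(policy):
    # Toy proxy: higher bitwidths give higher "accuracy", saturating at 6 bits.
    return sum(min(b, 6) for b in policy) / (6 * LAYERS)

def search(latency_budget, trials=2000, seed=0):
    # Random search standing in for the RL agent: sample per-layer
    # bitwidth policies, reject those over the latency budget, and
    # keep the best-scoring feasible policy.
    rng = random.Random(seed)
    best, best_acc = None, -1.0
    for _ in range(trials):
        policy = [rng.choice(list(BITS)) for _ in range(LAYERS)]
        if simulated_latency(policy) > latency_budget:
            continue  # direct feedback signal rejects over-budget policies
        acc = proxy_accuracy(policy)
        if acc > best_acc:
            best, best_acc = policy, acc
    return best, best_acc

policy, acc = search(latency_budget=4.0)
print(policy, round(acc, 3))
```

Because the feedback comes from the (simulated) hardware rather than proxies like FLOPs, the same search specializes automatically when the latency model changes — which is the point the abstract makes about edge versus cloud policies differing.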
Hardware-Centric AutoML for Mixed-Precision Quantization
Model quantization is a widely used technique to compress and accelerate deep
neural network (DNN) inference. Emergent DNN hardware accelerators begin to
support mixed precision (1-8 bits) to further improve the computation
efficiency. This raises a great challenge: finding the optimal bitwidth for
each layer requires domain experts to explore a vast design space, trading
off accuracy, latency, energy, and model size, which is both
time-consuming and sub-optimal. Conventional quantization algorithms ignore the
different hardware architectures and quantize all the layers in a uniform way.
In this paper, we introduce the Hardware-Aware Automated Quantization (HAQ)
framework, which leverages reinforcement learning to automatically determine
the quantization policy while taking the hardware accelerator's feedback into the
design loop. Rather than relying on proxy signals such as FLOPs and model size,
we employ a hardware simulator to generate direct feedback signals (latency and
energy) to the RL agent. Compared with conventional methods, our framework is
fully automated and can specialize the quantization policy for different neural
network architectures and hardware architectures. Our framework effectively
reduces latency by 1.4-1.95x and energy consumption by 1.9x with
negligible loss of accuracy compared with fixed-bitwidth (8-bit)
quantization. Our framework reveals that the optimal policies on different
hardware architectures (i.e., edge and cloud architectures) under different
resource constraints (i.e., latency, energy, and model size) are drastically
different. We interpret the implications of the different quantization policies,
which offer insights for both neural network architecture design and hardware
architecture design.
Comment: Journal preprint of arXiv:1811.08886 (IJCV, 2020). The first three
authors contributed equally to this work. Project page:
https://hanlab.mit.edu/projects/haq
Antioxidant and Cytotoxic Activity of Extracts of Milk Produced from Fermented Black Soybean
In this study, ethanol extracts from 2-day fermented black soybean milk (FBE), produced with immobilized Rhizopus oligosporus NTU5, were evaluated for both antioxidant and cytotoxic activities. The results reveal that the 2-day FBE had a strong 2,2-diphenyl-1-picrylhydrazyl (DPPH) scavenging effect (76 %). The extracts were further fractionated by silica gel column chromatography, and an unknown compound, FBE5-A, was obtained that exhibited strong antioxidant activity. The IC50 of the DPPH scavenging effect of FBE5-A was 7.5 μg/mL, which is stronger than that of a commonly used antioxidant, vitamin E (α-tocopherol; 17.4 μg/mL), and similar to that of vitamin C (ascorbic acid; 7.6 μg/mL). The cytotoxicity test demonstrated that extracts of the 2-day fermented broth exhibited selective cytotoxic activity towards human carcinoma cells, Hep 3B (IC50=150.2 μg/mL), and did not affect normal human lung fibroblasts, MRC-5 (p<0.05). The results indicate potential applications of fermented black soybean milk as a functional food, pharmaceutical, or cancer therapy formulation.
Poly[(μ6-benzene-1,3,5-tricarboxylato-κ6 O 1:O 1′:O 3:O 3′:O 5:O 5′)tris(N,N-dimethylformamide-κO)tris(μ3-formato-κ2 O:O′)trimagnesium(II)]
The title complex, [Mg3(CHO2)3(C9H3O6)(C3H7NO)3]n, exhibits a two-dimensional structure parallel to (001), which is built up from the MgII atoms and bridging carboxylate ligands (3̄ symmetry). The MgII atom is six-coordinated by one O atom from a dimethylformamide molecule, two O atoms from two μ6-benzene-1,3,5-tricarboxylate ligands and three O atoms from three μ3-formate ligands in a distorted octahedral geometry.
Poly[diaqua-μ4-biphenyl-4,4′-dicarboxylato-magnesium(II)]
The solvothermal reaction of magnesium nitrate with biphenyl-4,4′-dicarboxylic acid in N,N-dimethylformamide and water leads to the formation of crystals of the title complex, [Mg(C14H8O4)(H2O)2]n. In the crystal structure, the Mg cations are coordinated by six O atoms from two water molecules and four symmetry-related biphenyl-4,4′-dicarboxylate anions within slightly distorted octahedra. The Mg cations are located on a center of inversion, the biphenyl-4,4′-dicarboxylate anions around a twofold rotation axis, and the water molecule in a general position. The Mg cations are linked by the anions into a three-dimensional framework.
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
We present APQ for efficient deep learning inference on resource-constrained
hardware. Unlike previous methods that separately search the neural
architecture, pruning policy, and quantization policy, we optimize them in a
joint manner. To deal with the larger design space it brings, a promising
approach is to train a quantization-aware accuracy predictor to quickly get the
accuracy of the quantized model and feed it to the search engine to select the
best fit. However, training this quantization-aware accuracy predictor requires
collecting a large number of quantized pairs, which involves
quantization-aware finetuning and thus is highly time-consuming. To tackle this
challenge, we propose to transfer the knowledge from a full-precision (i.e.,
fp32) accuracy predictor to the quantization-aware (i.e., int8) accuracy
predictor, which greatly improves the sample efficiency. Besides, collecting
the dataset for the fp32 accuracy predictor only requires evaluating neural
networks without any training cost, by sampling from a pretrained once-for-all
network, which is highly efficient. Extensive experiments on ImageNet
demonstrate the benefits of our joint optimization approach. With the same
accuracy, APQ reduces the latency/energy by 2x/1.3x over MobileNetV2+HAQ.
Compared to the separate optimization approach (ProxylessNAS+AMC+HAQ), APQ
achieves 2.3% higher ImageNet accuracy while reducing GPU hours and CO2
emission by orders of magnitude, pushing the frontier of green AI that is
environmentally friendly. The code and video are publicly available.
Comment: Accepted by CVPR 202
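The transfer idea in the abstract — pretrain an accuracy predictor on cheap fp32 (architecture, accuracy) pairs, then finetune it on a few expensive quantized pairs — can be sketched with a toy one-dimensional linear predictor. Everything below (the synthetic data model, the fixed 0.05 quantization accuracy drop, the plain gradient-descent fitter) is an illustrative assumption, not the paper's predictor architecture:

```python
import random

def gen_data(n, quant_drop, rng):
    # Synthetic (architecture feature, accuracy) pairs. The int8 data is
    # the fp32 relation shifted down by a quantization drop (assumption).
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [0.5 + 0.4 * x - quant_drop for x in xs]
    return xs, ys

def fit(xs, ys, w=0.0, b=0.0, lr=0.1, steps=500):
    # Plain gradient descent on a 1-D linear predictor: acc ≈ w*x + b.
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = (w * x + b) - y
            gw += 2 * err * x / len(xs)
            gb += 2 * err / len(xs)
        w -= lr * gw
        b -= lr * gb
    return w, b

def mse(w, b, xs, ys):
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
fp32_x, fp32_y = gen_data(200, quant_drop=0.0, rng=rng)  # cheap: no finetuning
int8_x, int8_y = gen_data(5, quant_drop=0.05, rng=rng)   # expensive: few pairs

w0, b0 = fit(fp32_x, fp32_y)                         # pretrain on fp32 pairs
wt, bt = fit(int8_x, int8_y, w=w0, b=b0, steps=100)  # transfer, brief finetune
ws, bs = fit(int8_x, int8_y, steps=100)              # scratch on the same few pairs

test_x, test_y = gen_data(50, quant_drop=0.05, rng=rng)
print(round(mse(wt, bt, test_x, test_y), 6), round(mse(ws, bs, test_x, test_y), 6))
```

The transferred predictor starts close to the quantized relation (it only needs to learn the small accuracy drop), so with the same handful of quantized samples it ends up far more accurate than a predictor trained from scratch — the sample-efficiency gain the abstract describes.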