I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Vision Transformers (ViTs) have achieved state-of-the-art performance on
various computer vision applications. These models, however, have considerable
storage and computational overheads, making their deployment and efficient
inference on edge devices challenging. Quantization is a promising approach to
reducing model complexity; unfortunately, existing efforts to quantize ViTs adopt
simulated quantization (aka fake quantization), which retains floating-point
arithmetic during inference and thus contributes little to model acceleration.
In this paper, we propose I-ViT, an integer-only quantization scheme for ViTs,
to enable ViTs to perform the entire computational graph of inference with
integer operations and bit-shifting and no floating-point operations. In I-ViT,
linear operations (e.g., MatMul and Dense) follow the integer-only pipeline
with dyadic arithmetic, and non-linear operations (e.g., Softmax, GELU, and
LayerNorm) are approximated by the proposed light-weight integer-only
arithmetic methods. In particular, I-ViT applies the proposed Shiftmax and
ShiftGELU, which are designed to use integer bit-shifting to approximate the
corresponding floating-point operations. We evaluate I-ViT on various benchmark
models and the results show that integer-only INT8 quantization achieves
comparable (or even higher) accuracy to the full-precision (FP) baseline.
Furthermore, we utilize TVM for practical hardware deployment on the GPU's
integer arithmetic units, achieving a 3.72–4.11× inference speedup
over the FP model.
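The integer-only exponentiation idea behind Shiftmax can be illustrated with a small sketch. This is not I-ViT's exact algorithm (the paper's bit widths and scaling are not reproduced here); it only shows how exp(x) = 2^(x·log2 e) can be computed with integer adds, multiplies, and bit shifts, under illustrative fixed-point parameters:

```python
import math

def shiftmax(q, F=4, OUT=8):
    """Integer-only softmax sketch in the spirit of I-ViT's Shiftmax
    (not the paper's exact algorithm; F and OUT are illustrative).
    q: logits in fixed point with F fractional bits (x = q / 2**F).
    exp(x) = 2**(x*log2(e)) is realized as a bit shift."""
    LOG2E = round(math.log2(math.e) * 2 ** F)  # log2(e) in fixed point (23 for F=4)
    m = max(q)
    num = []
    for qi in q:
        z = ((qi - m) * LOG2E) >> F       # z ~= x*log2(e) in fixed point, z <= 0
        k = z >> F                        # integer part (floor), k <= 0
        f = z - (k << F)                  # fractional part, 0 <= f < 2**F
        num.append(((1 << F) + f) >> -k)  # 2**frac ~= 1 + frac, then shift by k
    s = sum(num)
    return [n * (1 << OUT) // s for n in num]  # probabilities scaled to 2**OUT
```

Subtracting the maximum first keeps all exponents non-positive, so the power of two is always a right shift. For instance, shiftmax([0, -16, -32]) (logits 0, −1, −2 at F=4) yields [170, 64, 21] out of 256, close to the true softmax values 0.665, 0.245, 0.090.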
ML2SC: Deploying Machine Learning Models as Smart Contracts on the Blockchain
With the growing concern of AI safety, there is a need to trust the
computations done by machine learning (ML) models. Blockchain technology, known
for recording data and running computations transparently and in a tamper-proof
manner, can offer this trust. One significant challenge in deploying ML
classifiers on-chain is that while ML models are typically written in Python
using an ML library such as PyTorch, smart contracts deployed on EVM-compatible
blockchains are written in Solidity. We introduce Machine Learning to Smart
Contract (ML2SC), a PyTorch-to-Solidity translator that can automatically
translate multi-layer perceptron (MLP) models written in PyTorch into Solidity
smart contract versions. ML2SC uses a fixed-point math library to approximate
floating-point computation. After deploying the generated smart contract, we
can train our models off-chain using PyTorch and then transfer the
acquired weights and biases to the smart contract using a function call.
Finally, the model inference can also be done with a function call providing
the input. We mathematically model the gas costs associated with deploying,
updating model parameters, and running inference on these models on-chain,
showing that the gas costs increase linearly in various parameters associated
with an MLP. We present empirical results matching our modeling. We also
evaluate the classification accuracy, showing that the outputs of our
transparent on-chain implementation are identical to those of the original
off-chain PyTorch implementation.
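The fixed-point approximation of floating-point arithmetic can be sketched in a few lines of Python. The Q32.32 scale and the helper names below are illustrative choices, not ML2SC's actual library or parameters; Solidity fixed-point libraries pick their own formats:

```python
SCALE = 1 << 32  # illustrative Q32.32 binary fixed point (an assumption;
                 # ML2SC's actual Solidity math library is not shown here)

def to_fixed(x: float) -> int:
    """Encode a real number as a scaled integer."""
    return round(x * SCALE)

def fmul(a: int, b: int) -> int:
    """Fixed-point multiply: the raw product carries SCALE**2, so rescale once."""
    return (a * b) // SCALE

def relu(a: int) -> int:
    return a if a > 0 else 0

def mlp_layer(x, W, b):
    """One dense layer y = relu(W @ x + b) in pure integer arithmetic,
    mirroring what a translated contract would compute on-chain."""
    return [relu(sum(fmul(wij, xj) for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]
```

Addition needs no rescaling, only multiplication does, and each weight contributes one multiply-add, which is consistent with the abstract's observation that costs grow linearly in the parameters of the MLP.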
Turbulence-induced oscillation on particle detachment from a wall
Particle resuspension is a ubiquitous phenomenon with pivotal relevance in
numerous natural and industrial contexts. In this study, we present findings on
the resuspension of individual micro-sized particles, captured through
high-speed camera experiments. Our observations reveal a universal behavior
whereby a particle undergoes oscillatory motion due to turbulent excitation
prior to its detachment from the surface. This motion is characterized by the
dimensionless numbers Ad and S. Analysis of the particle oscillation frequency
shows that it increases with decreasing particle size. We establish a new model
in which the particle is a linear oscillator driven by stochastic torque from
turbulence. It is shown that this stochastic oscillation is the key mechanism
for particle detachment from a wall within a certain range of friction
velocities.
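The picture of a linear oscillator driven by stochastic turbulent torque can be sketched numerically. All parameter values below are illustrative, not fitted to the experiments:

```python
import math
import random

def simulate_rocking(k=1.0, c=0.1, sigma=0.5, dt=1e-3, steps=20000, seed=0):
    """Euler-Maruyama sketch of the model: a damped linear oscillator
    theta'' = -k*theta - c*theta' + sigma*xi(t), with white noise xi
    standing in for the turbulent torque. Parameters are illustrative."""
    rng = random.Random(seed)
    theta = omega = 0.0
    peak = 0.0
    for _ in range(steps):
        kick = sigma * rng.gauss(0.0, 1.0) * math.sqrt(dt)  # stochastic torque
        omega += (-k * theta - c * omega) * dt + kick
        theta += omega * dt
        peak = max(peak, abs(theta))  # largest rocking amplitude so far
    return theta, peak
```

Detachment would then correspond to the rocking amplitude first exceeding a critical angle, and the oscillator's natural frequency grows with the stiffness-to-inertia ratio, one way to read the observed increase of oscillation frequency with decreasing particle size.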
Extract Executable Action Sequences from Natural Language Instructions Based on DQN for Medical Service Robots
The emergence and popularization of medical robots bring great convenience to doctors in treating patients. The core of medical robotics is the interaction and cooperation between doctors and robots, so it is crucial to design a simple and stable human-robot interaction system for medical robots. Language is the most convenient way for people to communicate with each other, so in this paper a DQN agent based on long short-term memory (LSTM) and an attention mechanism is proposed to enable robots to extract executable action sequences from doctors' natural language instructions. To do this, our agent must complete two related tasks: 1) extracting action names from instructions, and 2) extracting action arguments according to the extracted action names. We evaluate our agent on three datasets composed of texts with average lengths of 49.95, 209.34, and 417.17 words, respectively. The results show that our agent performs better than similar agents and handles long texts better than previous works.
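As a toy illustration of the reinforcement-learning framing (not the paper's LSTM-plus-attention architecture), a tabular Q-learning stand-in can show how per-word rewards drive an extraction policy. The instruction, tag set, and reward scheme below are invented for illustration:

```python
ACTIONS = ("ACT", "ARG", "SKIP")  # hypothetical tags: action name, argument, irrelevant

def dqn_update(q, s, a, r, s2, alpha=0.1, gamma=0.9):
    """Tabular Q-learning update; the paper's agent replaces this
    lookup table with an LSTM + attention network."""
    q[s][a] += alpha * (r + gamma * max(q[s2].values()) - q[s][a])

# Toy instruction "fetch the syringe": tagging "fetch" as the action name
# and "syringe" as its argument earns reward (an invented reward scheme).
states = ["fetch", "the", "syringe", "<end>"]
q = {s: {a: 0.0 for a in ACTIONS} for s in states}
episode = [("fetch", "ACT", 1.0, "the"),
           ("the", "SKIP", 0.0, "syringe"),
           ("syringe", "ARG", 1.0, "<end>")]
for _ in range(200):
    for step in episode:
        dqn_update(q, *step)
```

After training, the greedy policy tags each word as intended, e.g. max(q["fetch"], key=q["fetch"].get) is "ACT".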
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Data-free quantization can potentially address data privacy and security
concerns in model compression, and thus has been widely investigated. Recently,
PSAQ-ViT designs a relative value metric, patch similarity, to generate data
from pre-trained vision transformers (ViTs), achieving the first attempt at
data-free quantization for ViTs. In this paper, we propose PSAQ-ViT V2, a more
accurate and general data-free quantization framework for ViTs, built on top of
PSAQ-ViT. More specifically, following the patch similarity metric in PSAQ-ViT,
we introduce an adaptive teacher-student strategy, which facilitates the
constant cyclic evolution of the generated samples and the quantized model
(student) in a competitive and interactive fashion under the supervision of the
full-precision model (teacher), thus significantly improving the accuracy of
the quantized model. Moreover, without the auxiliary category guidance, we
employ the task- and model-independent prior information, making the
general-purpose scheme compatible with a broad range of vision tasks and
models. Extensive experiments are conducted on various models on image
classification, object detection, and semantic segmentation tasks, and PSAQ-ViT
V2, with the naive quantization strategy and without access to real-world data,
consistently achieves competitive results, showing potential as a powerful
baseline on data-free quantization for ViTs. For instance, with Swin-S as the
(backbone) model, 8-bit quantization reaches 82.13% top-1 accuracy on ImageNet,
50.9 box AP and 44.1 mask AP on COCO, and 47.2 mIoU on ADE20K. We hope that the
accurate and general PSAQ-ViT V2 can serve as a practical solution
in real-world applications involving sensitive data. Code is released and
merged at: https://github.com/zkkli/PSAQ-ViT.
Comment: Accepted by TNNLS 202
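The patch-similarity signal can be sketched as follows. This is an assumed simplification: pairwise cosine similarities between patch embeddings are summarized by a histogram entropy, standing in for the paper's actual diversity measure:

```python
import numpy as np

def patch_similarity_entropy(feats, bins=16):
    """Sketch of the patch-similarity idea: cosine similarity between
    every pair of patch embeddings, whose diversity (histogram entropy
    here, an assumed stand-in for the paper's measure) serves as the
    data-free optimization signal.
    feats: (num_patches, dim) features of one image."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = np.clip(f @ f.T, -1.0, 1.0)            # pairwise cosine similarities
    pairs = sim[np.triu_indices_from(sim, k=1)]  # each unordered pair once
    hist, _ = np.histogram(pairs, bins=bins, range=(-1.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())
```

Identical patches give zero entropy, while diverse, real-image-like patches give high entropy, the kind of property the generated samples are pushed toward.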
Analysis of bronchovascular patterns in the left superior division segment to explore the relationship between the descending bronchus and the artery crossing intersegmental planes
Background: A comprehensive understanding of the anatomical variations in the pulmonary bronchi and arteries is particularly essential to the implementation of safe and precise left superior division segment (LSDS) segmentectomy. However, no report has examined the relationship between the descending bronchus and the artery crossing intersegmental planes. Thus, the purpose of the present study was to analyze the branching pattern of the pulmonary artery and bronchus in the LSDS using three-dimensional computed tomography bronchography and angiography (3D-CTBA) and to explore the associated pulmonary anatomical features of the artery crossing intersegmental planes.
Materials and methods: The 3D-CTBA images of 540 cases were retrospectively analyzed. We reviewed the anatomical variations of the LSDS bronchus and artery and sorted them into different classifications.
Results: Among all 540 cases of 3D-CTBA, the descending B3a or B3 type comprised 16 cases (44.4%) with a lateral subsegmental artery crossing intersegmental planes (AX3a) and 20 cases (55.6%) without AX3a, whereas the type without a descending B3a or B3 comprised 53 cases (10.5%) with AX3a and 451 cases (89.5%) without AX3a. This shows that AX3a was more common in the descending B3a or B3 type (p < 0.005). Similarly, the descending B1+2c type comprised 69 cases (36.1%) with a horizontal subsegmental artery crossing intersegmental planes (AX1+2c) and 122 cases (63.9%) without AX1+2c, whereas the type without a descending B1+2c comprised 33 cases (9.5%) with AX1+2c and 316 cases (90.5%) without AX1+2c. The branching patterns of AX1+2c and the descending B1+2c type were significantly dependent (p < 0.005), and their combination was frequently observed.
Conclusions: This is the first report to explore the relationship between the descending bronchus and the artery crossing intersegmental planes. In patients with the descending B3a or B3 type, the incidence of AX3a was increased; similarly, the incidence of AX1+2c was increased in patients with the descending B1+2c type. These features should be carefully identified when performing an accurate LSDS segmentectomy.
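The counts reported in the abstract allow the dependence test to be illustrated. A plain Pearson chi-square on the two 2×2 tables (the paper's exact statistical procedure is not stated, so this is a sketch without continuity correction) is consistent with significance at p < 0.005, whose critical value is 7.879 at one degree of freedom:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 contingency table
    [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# AX3a vs descending B3a/B3 type, counts from the abstract:
#                  with AX3a   without AX3a
# descending           16           20
# non-descending       53          451
chi2_ax3a = chi_square_2x2(16, 20, 53, 451)

# AX1+2c vs descending B1+2c type, counts from the abstract:
chi2_ax12c = chi_square_2x2(69, 122, 33, 316)
```

Both statistics land far above 7.879, matching the reported p < 0.005 dependence.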
