
    I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference

    Vision Transformers (ViTs) have achieved state-of-the-art performance on various computer vision applications. These models, however, have considerable storage and computational overheads, making their deployment and efficient inference on edge devices challenging. Quantization is a promising approach to reducing model complexity; unfortunately, existing efforts to quantize ViTs rely on simulated quantization (aka fake quantization), which retains floating-point arithmetic during inference and thus contributes little to model acceleration. In this paper, we propose I-ViT, an integer-only quantization scheme for ViTs, enabling them to perform the entire computational graph of inference with integer operations and bit-shifting and no floating-point operations. In I-ViT, linear operations (e.g., MatMul and Dense) follow the integer-only pipeline with dyadic arithmetic, and non-linear operations (e.g., Softmax, GELU, and LayerNorm) are approximated by the proposed lightweight integer-only arithmetic methods. In particular, I-ViT applies the proposed Shiftmax and ShiftGELU, which use integer bit-shifting to approximate the corresponding floating-point operations. We evaluate I-ViT on various benchmark models, and the results show that integer-only INT8 quantization achieves accuracy comparable to (or even higher than) the full-precision (FP) baseline. Furthermore, we use TVM for practical hardware deployment on the GPU's integer arithmetic units, achieving a 3.72-4.11× inference speedup over the FP model.
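    The bit-shifting idea behind Shiftmax can be illustrated with a small sketch. This is an assumed simplification, not the paper's exact algorithm: logits are taken to be Q8 fixed-point integers, exp(x) is rewritten as 2^(x log2 e), the multiplication by log2 e (about 1.4427) is approximated as 1 + 1/2 - 1/16 = 1.4375 using shifts, and the fractional power of two is linearly interpolated.

```python
# Minimal sketch in the spirit of Shiftmax (not the paper's exact algorithm):
# integer-only softmax. Assumes int32 logits in Q8 fixed point, i.e., the
# real value is x_int / 256. Only adds, subtracts, shifts, and one division.
import numpy as np

FRAC = 8                # assumed number of fixed-point fractional bits
ONE = 1 << FRAC         # 1.0 in this Q-format

def shift_softmax(x_int):
    """Integer-only softmax over a 1-D int32 array of Q8 logits."""
    x = x_int - x_int.max()                 # make all values <= 0 (no overflow)
    # x * log2(e): log2(e) ~= 1.4375 = 1 + 1/2 - 1/16, so shifts replace the mul
    y = x + (x >> 1) - (x >> 4)
    k = y >> FRAC                           # floored integer part (<= 0)
    r = y - (k << FRAC)                     # fractional remainder in [0, ONE)
    # 2^y = 2^k * 2^(r/ONE), with 2^f ~= 1 + f linearly interpolated on [0, 1)
    exp_val = (ONE + r) >> np.minimum(-k, 31)   # clamp shift width for safety
    total = exp_val.sum()
    return (exp_val << FRAC) // total       # Q8 probabilities (sum ~= ONE)
```

The final division would itself be replaced by dyadic arithmetic in a fully integer-only pipeline; it is kept as plain integer division here for brevity.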

    ML2SC: Deploying Machine Learning Models as Smart Contracts on the Blockchain

    With the growing concern over AI safety, there is a need to trust the computations done by machine learning (ML) models. Blockchain technology, known for recording data and running computations transparently and in a tamper-proof manner, can offer this trust. One significant challenge in deploying ML classifiers on-chain is that while ML models are typically written in Python using an ML library such as PyTorch, smart contracts deployed on EVM-compatible blockchains are written in Solidity. We introduce Machine Learning to Smart Contract (ML2SC), a PyTorch-to-Solidity translator that can automatically translate multi-layer perceptron (MLP) models written in PyTorch to Solidity smart contract versions. ML2SC uses a fixed-point math library to approximate floating-point computation. After deploying the generated smart contract, we can train our models off-chain using PyTorch and then transfer the acquired weights and biases to the smart contract via a function call. Finally, model inference can also be done with a function call providing the input. We mathematically model the gas costs associated with deploying, updating model parameters, and running inference on these models on-chain, showing that the gas costs increase linearly in the various parameters associated with an MLP. We present empirical results matching our model. We also evaluate classification accuracy, showing that the outputs obtained by our transparent on-chain implementation are identical to those of the original off-chain implementation in PyTorch.
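    The fixed-point idea at the core of the translation can be sketched in a few lines of Python. This is an illustrative simulation, not ML2SC's actual library or its generated Solidity; the 18-decimal scale mirrors a common EVM token convention and is an assumption here.

```python
# Illustrative fixed-point arithmetic of the kind an EVM contract needs,
# since Solidity has no native floating point. SCALE is an assumed choice.
SCALE = 10**18                      # 18 "decimals", a common EVM convention

def to_fixed(x: float) -> int:
    """Encode a float (e.g., a trained PyTorch weight) as a scaled integer."""
    return round(x * SCALE)

def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point numbers, rescaling to keep the format."""
    return a * b // SCALE

def fixed_relu(a: int) -> int:
    return a if a > 0 else 0

def fixed_dense(x, w, b):
    """One MLP layer on fixed-point values: y_i = relu(sum_j w_ij*x_j + b_i)."""
    return [fixed_relu(sum(fixed_mul(wij, xj) for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(w, b)]
```

Converting trained PyTorch weights with a helper like to_fixed and passing them to a contract setter would mirror the off-chain-train, on-chain-infer workflow the abstract describes.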

    Turbulence-induced oscillation on particle detachment from a wall

    Particle resuspension is a ubiquitous phenomenon with pivotal relevance in numerous natural and industrial contexts. In this study, we present findings on the resuspension of individual micro-sized particles, captured through high-speed camera experiments. Our observations reveal a universal behavior whereby a particle undergoes oscillatory motion due to turbulent excitation prior to its detachment from the surface. This motion is characterized by the dimensionless numbers Ad and S. Analysis of the particle oscillation frequency shows that it increases with decreasing particle size. We establish a new model in which the particle is a linear oscillator driven by stochastic torque from turbulence. It is shown that this stochastic oscillation is the key mechanism for particle detachment from a wall within a certain range of friction velocities.
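    The stochastic-oscillator picture admits a compact mathematical sketch. The equation below is an assumed minimal form, not the paper's exact formulation: the adhering particle is treated as a damped linear oscillator in its tilt angle about the contact point, forced by a fluctuating turbulent torque.

```latex
% Assumed minimal model: damped linear oscillator driven by stochastic torque
I\,\ddot{\theta}(t) + c\,\dot{\theta}(t) + k\,\theta(t) = T(t),
\qquad
\omega_0 = \sqrt{k/I}
```

    Here I is the particle's moment of inertia about the contact point, c a damping coefficient, k the effective adhesive stiffness, and T(t) the random torque exerted by near-wall turbulence; detachment occurs when the stochastically driven amplitude of the tilt angle first exceeds a critical value. Since I decreases rapidly with particle size, the natural frequency grows as particles shrink, consistent with the reported increase of oscillation frequency with decreasing particle size.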

    Extract Executable Action Sequences from Natural Language Instructions Based on DQN for Medical Service Robots

    The emergence and popularization of medical robots bring great convenience to doctors in treating patients. The core of medical robotics is the interaction and cooperation between doctors and robots, so it is crucial to design a simple and stable human-robot interaction system for medical robots. Language is the most convenient way for people to communicate with each other, so in this paper a DQN agent based on long short-term memory (LSTM) and an attention mechanism is proposed to enable robots to extract executable action sequences from doctors' natural language instructions. To this end, our agent should be able to complete two related tasks: 1) extracting action names from instructions; 2) extracting action arguments according to the extracted action names. We evaluate our agent on three datasets composed of texts with average lengths of 49.95, 209.34, and 417.17 words, respectively. The results show that our agent performs better than similar agents, and it has a better ability to handle long texts than previous works.
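    The general shape of such a network can be sketched in PyTorch. All layer sizes and the token-level Q-value head below are illustrative assumptions; the paper's exact architecture (attention form, action-argument head, training procedure) may differ.

```python
# Hedged sketch: a Q-network of the rough shape the abstract describes
# (LSTM encoder + attention over word states -> Q-values per token).
# Vocabulary size, widths, and the two-way tag output are assumptions.
import torch
import torch.nn as nn

class LSTMAttentionQNet(nn.Module):
    def __init__(self, vocab_size=1000, emb=64, hidden=128, n_tags=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)      # scalar score per token
        self.q = nn.Linear(4 * hidden, n_tags)    # Q-values: select token or not

    def forward(self, tokens):
        h, _ = self.lstm(self.emb(tokens))            # (B, T, 2H) word states
        a = torch.softmax(self.attn(h), dim=1)        # attention weights over T
        ctx = (a * h).sum(dim=1, keepdim=True)        # (B, 1, 2H) context
        ctx = ctx.expand(-1, h.size(1), -1)           # broadcast to each token
        return self.q(torch.cat([h, ctx], dim=-1))    # (B, T, n_tags) Q-values
```

Under a DQN training loop, each token's Q-values would score the action of selecting that token as (part of) an action name, with argument extraction conditioned on the selected names.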

    PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers

    Data-free quantization can potentially address data privacy and security concerns in model compression and has thus been widely investigated. Recently, PSAQ-ViT designed a relative value metric, patch similarity, to generate data from pre-trained vision transformers (ViTs), achieving the first attempt at data-free quantization for ViTs. In this paper, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT. More specifically, following the patch similarity metric in PSAQ-ViT, we introduce an adaptive teacher-student strategy, which facilitates the constant cyclic evolution of the generated samples and the quantized model (student) in a competitive and interactive fashion under the supervision of the full-precision model (teacher), thus significantly improving the accuracy of the quantized model. Moreover, without auxiliary category guidance, we employ task- and model-independent prior information, making the general-purpose scheme compatible with a broad range of vision tasks and models. Extensive experiments are conducted on various models for image classification, object detection, and semantic segmentation tasks, and PSAQ-ViT V2, with a naive quantization strategy and without access to real-world data, consistently achieves competitive results, showing potential as a powerful baseline for data-free quantization of ViTs. For instance, with Swin-S as the (backbone) model, 8-bit quantization reaches 82.13% top-1 accuracy on ImageNet, 50.9 box AP and 44.1 mask AP on COCO, and 47.2 mIoU on ADE20K. We hope that the accurate and general PSAQ-ViT V2 can serve as a potential and practical solution in real-world applications involving sensitive data. Code is released at: https://github.com/zkkli/PSAQ-ViT. Comment: Accepted by TNNLS 202
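    The patch-similarity idea can be sketched as follows. This is a hedged illustration, not the paper's exact metric: it scores how diverse the pairwise cosine similarities between patch features are; the histogram-entropy formulation and the bin count are assumptions introduced here.

```python
# Hedged sketch in the spirit of a patch-similarity objective: real images
# induce a spread of pairwise patch-feature similarities, so a generator can
# be driven to maximize the diversity (here, entropy) of that distribution.
import numpy as np

def patch_similarity_entropy(patches, bins=10):
    """patches: (N, D) array of patch feature vectors from one image."""
    norm = patches / np.linalg.norm(patches, axis=1, keepdims=True)
    sim = np.clip(norm @ norm.T, -1.0, 1.0)   # (N, N) cosine similarities
    iu = np.triu_indices_from(sim, k=1)       # unique off-diagonal pairs
    hist, _ = np.histogram(sim[iu], bins=bins, range=(-1.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())      # entropy of similarity histogram
```

An all-identical set of patches scores zero, while varied patches score higher, which is the qualitative property a data-free sample generator would exploit.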

    Analysis of bronchovascular patterns in the left superior division segment to explore the relationship between the descending bronchus and the artery crossing intersegmental planes

    Background: A comprehensive understanding of the anatomical variations in the pulmonary bronchi and arteries is essential to the implementation of safe and precise left superior division segment (LSDS) segmentectomy. However, no report has examined the relationship between the descending bronchus and the artery crossing intersegmental planes. Thus, the purpose of the present study was to analyze the branching patterns of the pulmonary artery and bronchus in the LSDS using three-dimensional computed tomography bronchography and angiography (3D-CTBA) and to explore the associated pulmonary anatomical features of the artery crossing intersegmental planes.
    Materials and methods: The 3D-CTBA images of 540 cases were retrospectively analyzed. We reviewed the anatomical variations of the LSDS bronchus and artery and sorted them into different classifications.
    Results: Among all 540 cases of 3D-CTBA, in the descending B3a or B3 type there were 16 cases (44.4%) with a lateral subsegmental artery crossing intersegmental planes (AX3a) and 20 cases (55.6%) without AX3a, whereas in cases without the descending B3a or B3 type there were 53 cases (10.5%) with AX3a and 451 cases (89.5%) without AX3a. This illustrates that AX3a was more common in the descending B3a or B3 type (p < 0.005). Similarly, in the descending B1+2c type there were 69 cases (36.1%) with a horizontal subsegmental artery crossing intersegmental planes (AX1+2c) and 122 cases (63.9%) without AX1+2c, whereas in cases without the descending B1+2c type there were 33 cases (9.5%) with AX1+2c and 316 cases (90.5%) without AX1+2c. The branching patterns of AX1+2c and the descending B1+2c type were significantly dependent (p < 0.005), and their combination was frequently observed.
    Conclusions: This is the first report to explore the relationship between the descending bronchus and the artery crossing intersegmental planes. In patients with the descending B3a or B3 type, the incidence of AX3a was increased. Similarly, the incidence of AX1+2c was increased in patients with the descending B1+2c type. These findings should be carefully identified when performing an accurate LSDS segmentectomy.