210 research outputs found

    Towards Effective Low-bitwidth Convolutional Neural Networks

    This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations. Optimizing a low-precision network is very challenging since the training process can easily get trapped in a poor local minimum, which results in substantial accuracy loss. To mitigate this problem, we propose three simple yet effective approaches to improve network training. First, we propose a two-stage optimization strategy to progressively find good local minima. Specifically, we propose to first optimize a network with quantized weights and only then quantize the activations, in contrast to traditional methods which optimize both simultaneously. Second, in a similar spirit to the first method, we propose a second progressive optimization approach which gradually decreases the bit-width from high precision to low precision during training. Third, we adopt a novel learning scheme to jointly train a full-precision model alongside the low-precision one; the full-precision model provides hints that guide the low-precision model's training. Extensive experiments on various datasets (i.e., CIFAR-100 and ImageNet) show the effectiveness of the proposed methods. Notably, using our methods to train a 4-bit precision network leads to no performance decrease compared with its full-precision counterpart on standard network architectures (i.e., AlexNet and ResNet-50). Comment: 11 pages
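    The two-stage idea is straightforward to prototype. The sketch below is a hypothetical, DoReFa-style illustration rather than the paper's released code: the quantizer, the QuantConv2d class, and its flags are our own names. Stage one trains with quant_acts=False; stage two switches activation quantization on and continues training.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def uniform_quantize(x, bits):
    """Uniformly quantize a tensor in [0, 1] to 2**bits - 1 levels,
    using a straight-through estimator so gradients pass through."""
    levels = 2 ** bits - 1
    xq = torch.round(x * levels) / levels
    return x + (xq - x).detach()  # forward: xq, backward: identity

class QuantConv2d(nn.Conv2d):
    """Conv layer whose weight/activation quantization can be toggled,
    enabling the two-stage schedule (weights first, activations later)."""
    def __init__(self, *args, w_bits=4, a_bits=4,
                 quant_weights=True, quant_acts=False, **kwargs):
        super().__init__(*args, **kwargs)
        self.w_bits, self.a_bits = w_bits, a_bits
        self.quant_weights, self.quant_acts = quant_weights, quant_acts

    def forward(self, x):
        w = self.weight
        if self.quant_weights:
            # map weights to [0, 1], quantize, then rescale to [-1, 1]
            w01 = torch.tanh(w) / (2 * torch.tanh(w).abs().max()) + 0.5
            w = 2 * uniform_quantize(w01, self.w_bits) - 1
        if self.quant_acts:
            # assumes activations have been clipped to [0, 1] beforehand
            x = uniform_quantize(x.clamp(0, 1), self.a_bits)
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

    The second scheme (progressively lowering the bit-width) would reuse the same layer, simply reducing w_bits and a_bits according to a schedule during training.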

    A novel and simple method for construction of recombinant adenoviruses

    Recombinant adenoviruses have been widely used for various applications, including protein expression and gene therapy. We herein report a new and simple cloning approach for the efficient and robust construction of recombinant adenoviral genomes, based on the mating-assisted genetically integrated cloning (MAGIC) strategy. The production of recombinant adenovirus serotype 5-based vectors is greatly facilitated by the use of the MAGIC procedure and the development of the AdEasy™ adenoviral vector system. The recombinant adenoviral plasmid can be generated by a direct and seamless substitution, which replaces the stuffer fragment in a full-length adenoviral genome with the gene of interest carried on a small plasmid in Escherichia coli. Recombinant adenoviral plasmids can thus be rapidly constructed in vivo using the new method, without manipulation of the large adenoviral genome. In contrast to traditional systems, it reduces the need for multiple in vitro manipulations, such as endonuclease cleavage, ligation and transformation, thus achieving higher efficiency with negligible background. The strategy has also proven suitable for constructing an adenoviral cDNA expression library. In summary, the new method is highly efficient, technically less demanding and less labor-intensive for constructing recombinant adenoviruses, which will benefit functional genomic and proteomic research in mammalian cells.

    AQD: Towards Accurate Fully-Quantized Object Detection

    Network quantization allows inference to be conducted using low-precision arithmetic, improving the inference efficiency of deep neural networks on edge devices. However, designing aggressively low-bit (e.g., 2-bit) quantization schemes for complex tasks such as object detection remains challenging, owing to severe performance degradation and efficiency gains that are hard to verify on common hardware. In this paper, we propose an Accurate Quantized object Detection solution, termed AQD, that eliminates floating-point computation entirely. To this end, we use fixed-point operations in all layer types, including the convolutional layers, normalization layers, and skip connections, allowing inference to be executed using integer-only arithmetic. To demonstrate the improved latency-vs-accuracy trade-off, we apply the proposed methods to RetinaNet and FCOS. In particular, experimental results on the MS-COCO dataset show that AQD achieves comparable or even better performance than its full-precision counterpart under extremely low-bit schemes, which is of great practical value. Source code and models are available at https://github.com/ziplab/QTool. Comment: CVPR 2021 Oral
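    A key ingredient of integer-only inference in general (not necessarily AQD's exact implementation) is replacing every floating-point rescale with a fixed-point multiply followed by a bit shift. The NumPy sketch below illustrates that standard trick; the function names and the example scale are our own placeholders.

```python
import numpy as np

def quantize_multiplier(scale, bits=31):
    """Approximate a real-valued rescale factor by m / 2**shift with integer m."""
    shift = 0
    while scale < 0.5:
        scale *= 2.0
        shift += 1
    m = int(round(scale * (1 << bits)))
    return m, bits + shift

def requantize(acc_int32, scale, zero_point=0):
    """Map int32 accumulators to int8 outputs using only integer operations."""
    m, shift = quantize_multiplier(scale)
    out = (acc_int32.astype(np.int64) * m) >> shift  # fixed-point multiply
    out = out + zero_point
    return np.clip(out, -128, 127).astype(np.int8)

# Example: an int8 convolution accumulates into int32; `scale` folds the input,
# weight, and output quantization steps (and normalization statistics) into one
# factor, so no float ever appears at inference time.
acc = np.array([12345, -6789, 250000], dtype=np.int32)
print(requantize(acc, scale=0.0007))
```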

    Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations

    This paper tackles the problem of training a deep convolutional neural network with both low-bitwidth weights and activations. Optimizing a low-precision network is very challenging due to the non-differentiability of the quantizer, which may result in substantial accuracy loss. To address this, we propose three practical approaches to improve network training: (i) progressive quantization; (ii) stochastic precision; and (iii) joint knowledge distillation. First, for progressive quantization, we propose two schemes to progressively find good local minima. Specifically, we propose to first optimize a network with quantized weights and subsequently quantize the activations, in contrast to traditional methods which optimize both simultaneously. Furthermore, we propose a second progressive quantization scheme which gradually decreases the bit-width from high precision to low precision during training. Second, to alleviate the excessive training burden of these multi-round training stages, we further propose a one-stage stochastic precision strategy that randomly samples and quantizes sub-networks while keeping the other parts in full precision. Finally, we adopt a novel learning scheme to jointly train a full-precision model alongside the low-precision one; the full-precision model provides hints that guide the low-precision model's training and significantly improve the performance of the low-precision network. Extensive experiments on various datasets (e.g., CIFAR-100, ImageNet) show the effectiveness of the proposed methods. Comment: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Extended version of arXiv:1711.00205 (CVPR 2018)
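    The joint training component follows the familiar hint/distillation pattern: the full-precision network's soft predictions serve as an extra target for the quantized one. A minimal PyTorch-style sketch (the loss weighting, temperature, and function name below are illustrative assumptions, not the paper's exact formulation):

```python
import torch.nn.functional as F

def hint_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Cross-entropy on the labels plus a soft KL term that pulls the
    low-precision (student) model toward the full-precision (teacher) model."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1 - alpha) * kd

# In joint training, the full-precision model is optimized with its own
# classification loss in the same step, so its hints track what the
# low-precision model can actually follow.
```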

    Structured Binary Neural Networks for Image Recognition

    We propose methods to train convolutional neural networks (CNNs) with both binarized weights and activations, leading to quantized models that are particularly well suited to mobile devices with limited power and computational resources. Previous works on quantizing CNNs often seek to approximate the floating-point information using a set of discrete values, which we call value approximation, typically assuming the same architecture as the full-precision network. Here we take a novel "structure approximation" view of quantization -- architectures designed specifically for low-bit networks are likely to achieve better performance. In particular, we propose a "network decomposition" strategy, termed Group-Net, in which we divide the network into groups; each full-precision group can then be effectively reconstructed by aggregating a set of homogeneous binary branches. In addition, we learn effective connections among groups to improve the representation capability. Moreover, the proposed Group-Net shows strong generalization to other tasks. For instance, we extend Group-Net to accurate semantic segmentation by embedding rich context into the binary structure, and, for the first time, we apply binary neural networks to object detection. Experiments on classification, semantic segmentation, and object detection tasks demonstrate the superior performance of the proposed methods over various quantized networks in the literature. Our methods outperform the previous best binary neural networks in terms of accuracy and computational efficiency. Comment: 15 pages. Extended version of the conference version arXiv:1811.1041
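    The structure-approximation idea can be sketched as a block whose output is a weighted sum of several binary branches. The PyTorch-style code below is a rough illustration under our own naming (sign binarization with a clipped straight-through estimator, learnable per-branch scales); it is not the released Group-Net architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()  # clipped straight-through

class BinaryBranch(nn.Module):
    """One homogeneous branch: binary activations convolved with binary weights."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        xb = BinarizeSTE.apply(x)
        wb = BinarizeSTE.apply(self.conv.weight)
        return self.bn(F.conv2d(xb, wb, padding=1))

class GroupBlock(nn.Module):
    """One group: K binary branches aggregated with learnable scales plus a skip."""
    def __init__(self, channels, branches=4):
        super().__init__()
        self.branches = nn.ModuleList(BinaryBranch(channels) for _ in range(branches))
        self.scales = nn.Parameter(torch.ones(branches) / branches)

    def forward(self, x):
        return sum(s * b(x) for s, b in zip(self.scales, self.branches)) + x
```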

    LBS: Loss-aware Bit Sharing for Automatic Model Compression

    Low-bitwidth model compression is an effective way to reduce model size and computational overhead. Existing compression methods rely on compression configurations (such as pruning rates and/or bitwidths) that are often determined manually and are rarely optimal. Some attempts have been made to search for these configurations automatically, but the optimization process is often very expensive. To alleviate this, we devise a simple yet effective method named Loss-aware Bit Sharing (LBS) to automatically search for optimal model compression configurations. To this end, we propose a novel single-path model that encodes all candidate compression configurations, in which a high-bitwidth quantized value is decomposed into the sum of the lowest-bitwidth quantized value and a series of re-assignment offsets. We then introduce learnable binary gates to encode the choice of bitwidth, including a filter-wise 0-bit option for filter pruning. By training the binary gates jointly with the network parameters, the compression configuration of each layer can be determined automatically. Extensive experiments on both CIFAR-100 and ImageNet show that LBS significantly reduces computational cost while preserving promising performance. Comment: 22 pages
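    The single-path decomposition can be illustrated with a toy example: quantizing at b bits equals the lowest-bitwidth quantization plus one residual offset per extra bit, and a binary gate decides whether each offset is kept. The sketch below uses our own function names and omits how the gates are learned (e.g., with a straight-through estimator).

```python
import torch

def uniform_q(x, bits):
    """Uniform quantization of values in [0, 1] to 2**bits - 1 levels."""
    levels = 2 ** bits - 1
    return torch.round(x.clamp(0, 1) * levels) / levels

def bit_shared_quant(w, gates, base_bits=2, max_bits=8):
    """gates[i] in {0, 1} decides whether the re-assignment offset from
    (base_bits + i) to (base_bits + i + 1) bits is added to the shared base."""
    out = uniform_q(w, base_bits)
    prev = out
    for i, g in enumerate(gates):
        bits = base_bits + i + 1
        if bits > max_bits:
            break
        cur = uniform_q(w, bits)
        out = out + g * (cur - prev)  # offset that adds exactly one more bit
        prev = cur
    return out

w = torch.rand(5)
print(bit_shared_quant(w, gates=[1.0, 1.0, 0.0, 0.0]))  # behaves like 4-bit
```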

    Block-Sparse Coding-Based Machine Learning Approach for Dependable Device-Free Localization in IoT Environment

    Device-free localization (DFL) locates targets without requiring them to carry wireless devices or tags under Internet-of-Things (IoT) architectures. As an emerging technology, DFL has spawned extensive applications in IoT environments, such as intrusion detection, mobile robot localization, and location-based services. Current DFL-related machine learning (ML) algorithms still suffer from low localization accuracy and weak dependability/robustness because the group structure of the signals is not considered in their location estimation, which leads to an undependable process. To overcome these challenges, we propose a dependable block-sparse scheme that explicitly accounts for the group structure of the signals. An accurate and robust ML algorithm named block-sparse coding with the proximal operator (BSCPO) is proposed for DFL. In addition, severe Gaussian noise is added to the original sensing signals to preserve network-related privacy as well as to improve the dependability of the model. Real-world data-driven experimental results show that the proposed BSCPO achieves robust localization and signal-recovery performance even under severely noisy conditions, and outperforms state-of-the-art DFL methods. For single-target localization, BSCPO retains high accuracy when the signal-to-noise ratio exceeds -10 dB. BSCPO is also able to localize accurately in most multi-target test cases.
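    Block-sparse coding with a proximal operator generally amounts to proximal gradient descent in which the proximal step is block (group) soft-thresholding. The NumPy sketch below illustrates that generic recipe on a toy dictionary; it is not the authors' BSCPO implementation, and the dictionary, group layout, and regularization weight are made-up placeholders.

```python
import numpy as np

def block_soft_threshold(x, groups, thresh):
    """Proximal operator of the group (l2,1) norm: shrink each block's norm."""
    out = np.zeros_like(x)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > thresh:
            out[g] = (1 - thresh / norm) * x[g]
    return out

def block_sparse_coding(D, y, groups, lam=0.1, iters=200):
    """Solve min_x 0.5*||y - D x||^2 + lam * sum_g ||x_g||_2 by proximal gradient."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ x - y)
        x = block_soft_threshold(x - step * grad, groups, lam * step)
    return x

# Toy example: 3 blocks of 4 atoms; only one block should remain active.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 12))
x_true = np.zeros(12)
x_true[4:8] = rng.standard_normal(4)
y = D @ x_true + 0.01 * rng.standard_normal(20)
groups = [range(0, 4), range(4, 8), range(8, 12)]
print(np.round(block_sparse_coding(D, y, groups), 2))
```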