
    MINT: Multiplier-less Integer Quantization for Spiking Neural Networks

    We propose Multiplier-less INTeger (MINT) quantization, an efficient uniform quantization scheme for the weights and membrane potentials in spiking neural networks (SNNs). Unlike prior SNN quantization works, MINT quantizes the memory-hungry membrane potentials to extremely low precision (2-bit) to significantly reduce the total memory footprint. Additionally, MINT quantization shares the quantization scaling factor between the weights and membrane potentials, eliminating the need for the multipliers that vanilla uniform quantization requires. Experimental results demonstrate that our proposed method achieves accuracy that matches the full-precision models and other state-of-the-art SNN quantization works while outperforming them on total memory footprint and hardware cost at deployment. For instance, 2-bit MINT VGG-16 achieves 90.6% accuracy on CIFAR-10 with an approximately 93.8% reduction in total memory footprint relative to the full-precision model; meanwhile, it reduces computation energy by 90% compared to vanilla uniform quantization at deployment. Comment: 6 pages. Accepted to the 29th Asia and South Pacific Design Automation Conference (ASP-DAC 2024).
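    The shared-scale idea in the abstract above can be illustrated with a small sketch. It is not the paper's implementation: the function names, the hard reset, and the choice to fold the scale into the threshold offline are assumptions made here for illustration. If weights and membrane potential share one scale, and spikes are binary, a neuron update needs only integer additions and a comparison.

```python
def quantize(values, scale, bits):
    """Uniformly quantize floats to signed integers of the given bit width."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def integer_if_step(u_int, w_int, spikes, theta_int):
    """One integrate-and-fire step using only integer adds and a compare.

    Because weights and membrane potential are assumed to share one scale,
    the integer weights can be added to the integer potential directly.
    Re-clipping the potential to its own bit width is omitted in this sketch.
    """
    # Binary spikes select which integer weights are accumulated.
    u_int += sum(w for w, s in zip(w_int, spikes) if s)
    fired = u_int >= theta_int
    if fired:
        u_int = 0  # hard reset: an assumption of this sketch
    return u_int, fired


scale = 0.25                              # hypothetical shared scaling factor
w_int = quantize([0.5, -0.25, 0.75], scale, bits=4)
theta_int = round(1.0 / scale)            # threshold 1.0 rescaled once, offline
u_after, fired = integer_if_step(0, w_int, [1, 0, 1], theta_int)
u_quiet, fired_quiet = integer_if_step(0, w_int, [0, 1, 0], theta_int)
```

    The only division by the scale happens when the weights and threshold are prepared, so nothing in the per-timestep update multiplies.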

    MIMONet: Multi-Input Multi-Output On-Device Deep Learning

    Future intelligent robots are expected to process multiple inputs simultaneously (such as image and audio data) and generate multiple outputs accordingly (such as gender and emotion), similar to humans. Recent research has shown that multi-input single-output (MISO) deep neural networks (DNN) outperform traditional single-input single-output (SISO) models, representing a significant step towards this goal. In this paper, we propose MIMONet, a novel on-device multi-input multi-output (MIMO) DNN framework that achieves high accuracy and on-device efficiency in terms of critical performance metrics such as latency, energy, and memory usage. Leveraging existing SISO model compression techniques, MIMONet develops a new deep-compression method that is specifically tailored to MIMO models. This new method explores unique yet non-trivial properties of the MIMO model, resulting in boosted accuracy and on-device efficiency. Extensive experiments on three embedded platforms commonly used in robotic systems, as well as a case study using the TurtleBot3 robot, demonstrate that MIMONet achieves higher accuracy and superior on-device efficiency compared to state-of-the-art SISO and MISO models, as well as a baseline MIMO model we constructed. Our evaluation highlights the real-world applicability of MIMONet and its potential to significantly enhance the performance of intelligent robotic systems. Comment: Submitted to ICRA 202

    Graph Neural Network-Enhanced Expectation Propagation Algorithm for MIMO Turbo Receivers

    Deep neural networks (NNs) are considered a powerful tool for balancing the performance and complexity of multiple-input multiple-output (MIMO) receivers due to their accurate feature extraction, high parallelism, and excellent inference ability. Graph NNs (GNNs) have recently demonstrated outstanding capability in learning enhanced message passing rules and have shown success in overcoming the drawback of inaccurate Gaussian approximation of expectation propagation (EP)-based MIMO detectors. However, the application of the GNN-enhanced EP detector to MIMO turbo receivers is underexplored and non-trivial due to the requirement of extrinsic information for iterative processing. This paper proposes a GNN-enhanced EP algorithm for MIMO turbo receivers, which realizes the turbo principle of generating extrinsic information from the MIMO detector through a specially designed training procedure. Additionally, an edge pruning strategy is designed to eliminate redundant connections in the original fully connected model of the GNN utilizing the correlation information inherently from the EP algorithm. Edge pruning reduces the computational cost dramatically and enables the network to focus more attention on the weights that are vital for performance. Simulation results and complexity analysis indicate that the proposed MIMO turbo receiver outperforms the EP turbo approaches by over 1 dB at the bit error rate of $10^{-5}$, exhibits performance equivalent to state-of-the-art receivers with 2.5 times shorter running time, and adapts to various scenarios. Comment: 15 pages, 12 figures, 2 tables. This paper has been accepted for publication by the IEEE Transactions on Signal Processing. Copyright may be transferred without notice, after which this version may no longer be accessible.
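    As a rough illustration of correlation-based edge pruning, the sketch below keeps only the graph edges whose correlation magnitude reaches a cutoff. The abstract does not state the pruning rule, so the thresholding criterion, the function name `prune_edges`, and the example matrix are all assumptions.

```python
def prune_edges(correlation, threshold):
    """Return the edges (i, j), i < j, whose |correlation| meets the threshold.

    `correlation` is a symmetric matrix given as a list of lists; in a fully
    connected graph every pair (i, j) starts out as an edge.
    """
    n = len(correlation)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(correlation[i][j]) >= threshold
    ]


# Hypothetical 3-node correlation matrix, for illustration only.
corr = [
    [1.0, 0.8, 0.05],
    [0.8, 1.0, -0.4],
    [0.05, -0.4, 1.0],
]
kept = prune_edges(corr, threshold=0.3)
```

    Here the weakly correlated pair (0, 2) is dropped, so message passing would run over two edges instead of three.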

    Deep Learning Designs for Physical Layer Communications

    Wireless communication systems and their underlying technologies have undergone unprecedented advances over the last two decades to assuage the ever-increasing demands for various applications and emerging technologies. However, the traditional signal processing schemes and algorithms for wireless communications cannot handle the upsurging complexity associated with fifth-generation (5G) and beyond communication systems due to network expansion, new emerging technologies, high data rate, and the ever-increasing demands for low latency. This thesis extends the traditional downlink transmission schemes to deep learning-based precoding and detection techniques that are hardware-efficient and of lower complexity than the current state-of-the-art. The thesis focuses on: precoding/beamforming in massive multiple-input multiple-output (MIMO), signal detection and lightweight neural network (NN) architectures for precoder and decoder designs. We introduce a learning-based precoder design via constructive interference (CI) that performs the precoding on a symbol-by-symbol basis. Instead of conventionally training an NN without considering the specifics of the optimisation objective, we unfold a power minimisation symbol level precoding (SLP) formulation based on the interior-point-method (IPM) proximal ‘log’ barrier function. Furthermore, we propose a concept of NN compression, where the weights are quantised to lower numerical precision formats based on binary and ternary quantisations. We further introduce a stochastic quantisation technique, where parts of the NN weight matrix are quantised while the remainder is not. Finally, we propose a systematic complexity scaling of deep neural network (DNN) based MIMO detectors. The model uses a fraction of the DNN inputs by scaling their values through weights that follow monotonically non-increasing functions. Furthermore, we investigate performance-complexity tradeoffs via regularisation constraints on the layer weights such that, at inference, parts of network layers can be removed with minimal impact on the detection accuracy. Simulation results show that our proposed learning-based techniques offer better complexity-vs-BER (bit-error-rate) and complexity-vs-transmit power performances compared to the state-of-the-art MIMO detection and precoding techniques.
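    The ternary and partial quantisation mentioned in the abstract above can be sketched as follows. This is an illustration under assumptions, not the thesis's method: the fixed threshold, the uniform random choice of which entries to quantise, and the names `ternarize` and `partially_quantize` are all hypothetical.

```python
import random


def ternarize(w, threshold):
    """Map a weight to -1.0, 0.0, or +1.0 using a symmetric threshold."""
    if w > threshold:
        return 1.0
    if w < -threshold:
        return -1.0
    return 0.0


def partially_quantize(matrix, fraction, threshold, seed=0):
    """Ternarize a randomly chosen fraction of entries; leave the rest as-is.

    Selecting entries uniformly at random is an assumption of this sketch.
    """
    rng = random.Random(seed)
    positions = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    chosen = set(rng.sample(positions, int(fraction * len(positions))))
    return [
        [ternarize(w, threshold) if (i, j) in chosen else w
         for j, w in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


weights = [[0.9, -0.1], [-0.7, 0.2]]
fully = partially_quantize(weights, fraction=1.0, threshold=0.5)
untouched = partially_quantize(weights, fraction=0.0, threshold=0.5)
half = partially_quantize(weights, fraction=0.5, threshold=0.5)
```

    With `fraction=1.0` every weight becomes ternary, and with `fraction=0.0` the matrix is returned unchanged; intermediate values trade precision for storage.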

    The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification

    The use of modern Natural Language Processing (NLP) techniques has been shown to be beneficial for software engineering tasks, such as vulnerability detection and type inference. However, training deep NLP models requires significant computational resources. This paper explores techniques that aim at achieving the best usage of resources and available information in these models. We propose a generic approach, EarlyBIRD, to build composite representations of code from the early layers of a pre-trained transformer model. We empirically investigate the viability of this approach on the CodeBERT model by comparing the performance of 12 strategies for creating composite representations with the standard practice of only using the last encoder layer. Our evaluation on four datasets shows that several early layer combinations yield better performance on defect detection, and some combinations improve multi-class classification. More specifically, we obtain a +2 average improvement of detection accuracy on Devign with only 3 out of 12 layers of CodeBERT and a 3.3x speed-up of fine-tuning. These findings show that early layers can be used to obtain better results using the same resources, as well as to reduce resource usage during fine-tuning and inference. Comment: The content in this pre-print is the same as in the CRC accepted for publication in the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023).
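    One simple way to picture a composite representation built from early layers is to average the per-layer vectors of the first few encoder layers. The abstract does not list its 12 strategies, so the element-wise mean used here, the function name, and the toy vectors are assumptions for illustration only.

```python
def composite_from_early_layers(layer_vectors, num_layers):
    """Average the vectors produced by the first `num_layers` encoder layers.

    `layer_vectors[k]` is assumed to be one fixed-size vector per layer,
    for example the first-token embedding taken from layer k.
    """
    if not 1 <= num_layers <= len(layer_vectors):
        raise ValueError("num_layers must be between 1 and the number of layers")
    early = layer_vectors[:num_layers]
    dim = len(early[0])
    return [sum(vec[d] for vec in early) / len(early) for d in range(dim)]


# Toy 3-layer, 2-dimensional example (hypothetical values).
layers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
composite = composite_from_early_layers(layers, num_layers=2)
```

    Because only the first `num_layers` layers contribute, the later layers would not need to be evaluated at all, which is where the resource savings described in the abstract would come from.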