6,162 research outputs found

    Weight-dependent Gates for Differentiable Neural Network Pruning

    Full text link
    In this paper, we propose a simple and effective network pruning framework, which introduces novel weight-dependent gates to prune filter adaptively. We argue that the pruning decision should depend on the convolutional weights, in other words, it should be a learnable function of filter weights. We thus construct the weight-dependent gates (W-Gates) to learn the information from filter weights and obtain binary filter gates to prune or keep the filters automatically. To prune the network under hardware constraint, we train a Latency Predict Net (LPNet) to estimate the hardware latency of candidate pruned networks. Based on the proposed LPNet, we can optimize W-Gates and the pruning ratio of each layer under latency constraint. The whole framework is differentiable and can be optimized by gradient-based method to achieve a compact network with better trade-off between accuracy and efficiency. We have demonstrated the effectiveness of our method on Resnet34, Resnet50 and MobileNet V2, achieving up to 1.33/1.28/1.1 higher Top-1 accuracy with lower hardware latency on ImageNet. Compared with state-of-the-art pruning methods, our method achieves superior performance.Comment: ECCV worksho

    Restricted Recurrent Neural Networks

    Full text link
    Recurrent Neural Network (RNN) and its variations such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), have become standard building blocks for learning online data of sequential nature in many research areas, including natural language processing and speech data analysis. In this paper, we present a new methodology to significantly reduce the number of parameters in RNNs while maintaining performance that is comparable or even better than classical RNNs. The new proposal, referred to as Restricted Recurrent Neural Network (RRNN), restricts the weight matrices corresponding to the input data and hidden states at each time step to share a large proportion of parameters. The new architecture can be regarded as a compression of its classical counterpart, but it does not require pre-training or sophisticated parameter fine-tuning, both of which are major issues in most existing compression techniques. Experiments on natural language modeling show that compared with its classical counterpart, the restricted recurrent architecture generally produces comparable results at about 50\% compression rate. In particular, the Restricted LSTM can outperform classical RNN with even less number of parameters

    Approximate Approximation on a Quantum Annealer

    Full text link
    Many problems of industrial interest are NP-complete, and quickly exhaust resources of computational devices with increasing input sizes. Quantum annealers (QA) are physical devices that aim at this class of problems by exploiting quantum mechanical properties of nature. However, they compete with efficient heuristics and probabilistic or randomised algorithms on classical machines that allow for finding approximate solutions to large NP-complete problems. While first implementations of QA have become commercially available, their practical benefits are far from fully explored. To the best of our knowledge, approximation techniques have not yet received substantial attention. In this paper, we explore how problems' approximate versions of varying degree can be systematically constructed for quantum annealer programs, and how this influences result quality or the handling of larger problem instances on given set of qubits. We illustrate various approximation techniques on both, simulations and real QA hardware, on different seminal problems, and interpret the results to contribute towards a better understanding of the real-world power and limitations of current-state and future quantum computing.Comment: Proceedings of the 17th ACM International Conference on Computing Frontiers (CF 2020
    • …
    corecore