
    A Multi-Dimensional Pruning Framework Based on Double Evaluation Mechanism

    In this article, we propose EMNAPE, a multi-dimensional pruning framework that collaboratively prunes the three dimensions (depth, width, and resolution) of convolutional neural networks (CNNs) for better execution efficiency on embedded hardware. In EMNAPE, we first introduce a two-stage importance evaluation framework, which efficiently and comprehensively evaluates each pruning unit according to both its local importance within its own dimension and its global importance across dimensions. Building on this evaluation framework, we present a heuristic pruning algorithm that progressively prunes the three dimensions of CNNs toward an optimal trade-off between accuracy and efficiency. Experiments on multiple benchmarks validate the advantages of EMNAPE over existing state-of-the-art (SOTA) approaches.
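
    The abstract describes a two-stage scheme: score pruning units locally within each dimension, merge those scores into a global ranking across dimensions, then prune greedily toward a cost budget. The sketch below illustrates that control flow in plain Python. It is a minimal, hypothetical rendering for illustration: the function names, the sum-normalization used for local importance, and the greedy budgeted loop are assumptions, not the paper's actual criteria or algorithm.

# Illustrative sketch of two-stage (local + global) importance evaluation
# for multi-dimensional pruning. All names and scoring rules here are
# hypothetical; the paper's actual criteria and algorithm may differ.
import numpy as np

def local_importance(scores):
    """Stage 1: normalize raw scores *within* one dimension so that
    dimensions with different score scales remain comparable."""
    scores = np.asarray(scores, dtype=float)
    return scores / (scores.sum() + 1e-12)

def progressive_prune(units, budget):
    """Stage 2: merge per-dimension scores into one global ranking and
    greedily remove the least important units until the budget is met.

    units  : list of (dimension, unit_id, raw_score, cost) tuples
    budget : fraction of total cost to remove, e.g. 0.3
    """
    # Group raw scores by dimension and compute local importance.
    by_dim = {}
    for dim, uid, score, cost in units:
        by_dim.setdefault(dim, []).append((uid, score, cost))

    globally_ranked = []
    for dim, members in by_dim.items():
        local = local_importance([s for _, s, _ in members])
        for (uid, _, cost), imp in zip(members, local):
            # Global importance: here simply the normalized local score;
            # a real method would also weight dimensions against each other.
            globally_ranked.append((imp, dim, uid, cost))

    globally_ranked.sort()  # least important first
    total_cost = sum(c for *_, c in globally_ranked)
    removed, pruned = 0.0, []
    for imp, dim, uid, cost in globally_ranked:
        if removed / total_cost >= budget:
            break
        pruned.append((dim, uid))
        removed += cost
    return pruned

# Toy usage: prune roughly 30% of total cost across the three dimensions.
units = [("width", 0, 0.9, 10), ("width", 1, 0.1, 10),
         ("depth", 0, 0.7, 25), ("depth", 1, 0.3, 25),
         ("resolution", 0, 0.6, 15), ("resolution", 1, 0.4, 15)]
print(progressive_prune(units, budget=0.3))  # [('width', 1), ('depth', 1)]

    In this toy run, the two lowest-scoring units are removed first, and the loop stops once the removed cost reaches the 30% budget, mimicking the progressive accuracy/efficiency trade-off the abstract describes.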

    SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference

    Deep neural networks (DNNs) demonstrate outstanding performance across most computer vision tasks. Some critical applications, such as autonomous driving or medical imaging, also require investigation into their behavior and the reasons behind the decisions they make. In this vein, DNN attribution studies the relationship between a DNN's predictions and its inputs. Attribution methods have been adapted to highlight the most relevant weights or neurons in a DNN, making it possible to select more efficiently which weights or neurons can be pruned. However, a limitation of these approaches is that weights are typically compared only within each layer, while some layers may be more critical than others. In this work, we propose to investigate DNN layer importance, i.e., to estimate the sensitivity of the accuracy w.r.t. perturbations applied at the layer level. To do so, we introduce a novel dataset for evaluating our method as well as future work. We benchmark a number of criteria and draw conclusions on how to assess DNN layer importance and, consequently, how to allocate per-layer budgets for increased DNN efficiency (with applications to DNN pruning and quantization), as well as robustness to hardware failure (e.g., bit swaps).
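
    Layer-level sensitivity of this kind is commonly probed by perturbing one layer at a time and measuring the resulting accuracy drop. The PyTorch sketch below shows that generic procedure under stated assumptions: the evaluate helper, the Gaussian weight noise, and the noise_std parameter are illustrative choices, not the specific criteria benchmarked in the paper.

# Minimal sketch of layer-level sensitivity probing, assuming a PyTorch
# model and an `evaluate(model, loader) -> accuracy` helper you supply.
# Gaussian noise on weights is one of many possible perturbations; treat
# this as illustrative, not as the paper's method.
import torch

@torch.no_grad()
def layer_sensitivity(model, evaluate, loader, noise_std=0.05):
    """Perturb each weight tensor in turn and record the accuracy drop:
    larger drops indicate layers more critical to keep intact."""
    baseline = evaluate(model, loader)
    sensitivity = {}
    for name, param in model.named_parameters():
        if "weight" not in name:
            continue
        original = param.detach().clone()
        # Scale the noise to the parameter's own magnitude so layers with
        # different weight scales are perturbed comparably.
        param.add_(noise_std * original.std() * torch.randn_like(param))
        sensitivity[name] = baseline - evaluate(model, loader)
        param.copy_(original)  # restore the unperturbed weights
    return sensitivity

    Layers showing the largest drops would then receive the largest budgets, e.g. higher bit-widths under quantization or lower pruning ratios, while insensitive layers can be compressed more aggressively.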