    Differentiable Sparsification for Deep Neural Networks

    Deep neural networks have relieved human experts of much of the burden of feature engineering. However, comparable effort is now required to design effective architectures. In addition, as networks have grown very large, considerable resources are also invested in reducing their size. Sparsifying an over-complete model addresses both problems, as it removes redundant components and connections. In this study, we propose a fully differentiable sparsification method for deep neural networks that allows parameters to reach exactly zero during training via stochastic gradient descent. The proposed method can therefore learn the sparsified structure and the weights of a network in an end-to-end manner. It is directly applicable to various modern deep neural networks and requires only minimal modification to existing models. To the best of our knowledge, this is the first fully [sub-]differentiable sparsification method that zeroes out parameters. It provides a foundation for future structure-learning and model-compression methods.
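
    To make the idea concrete, below is a minimal PyTorch sketch of one generic way such a method can work: a sub-differentiable soft-threshold reparameterization in which weights whose magnitude falls below a learned threshold become exactly zero in the forward pass, while the threshold itself is trained by SGD. The parameterization, the per-output-unit threshold `s`, and the class name `SoftThresholdLinear` are illustrative assumptions, not necessarily the exact formulation used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdLinear(nn.Module):
    """Linear layer whose weights can become exactly zero during training.

    Illustrative sketch: each weight is reparameterized as
        w_eff = sign(w) * relu(|w| - softplus(s)),
    a sub-differentiable soft threshold. Weights smaller in magnitude
    than the learned threshold softplus(s) are exactly zero in the
    forward pass, yet everything remains trainable end-to-end by SGD.
    (A generic soft-thresholding construction, not necessarily the
    paper's exact parameterization.)
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # One learnable threshold per output unit (an assumption);
        # initialized negative so softplus(s) starts near zero.
        self.s = nn.Parameter(torch.full((out_features, 1), -3.0))

    def effective_weight(self) -> torch.Tensor:
        thresh = F.softplus(self.s)  # always >= 0
        return torch.sign(self.weight) * F.relu(self.weight.abs() - thresh)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.effective_weight(), self.bias)

# Usage: exact zeros appear in the forward pass while gradients still
# flow to the weights, the bias, and the threshold parameter s.
layer = SoftThresholdLinear(8, 4)
x = torch.randn(2, 8)
y = layer(x)
sparsity = (layer.effective_weight() == 0).float().mean()
print(f"fraction of exactly-zero weights: {sparsity:.2f}")
y.sum().backward()
```

    A sparsity-inducing regularizer (e.g. an L1 penalty on the effective weights) would typically be added to the training loss to push the thresholds up; this sketch omits it for brevity.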