39 research outputs found

    From Statistical Physics to Algorithms in Deep Neural Systems

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Can we avoid Double Descent in Deep Neural Networks?

    Full text link
    Finding the optimal size of deep learning models is very actual and of broad impact, especially in energy-saving schemes. Very recently, an unexpected phenomenon, the ``double descent'', has caught the attention of the deep learning community. As the model's size grows, the performance gets first worse, and then goes back to improving. It raises serious questions about the optimal model's size to maintain high generalization: the model needs to be sufficiently over-parametrized, but adding too many parameters wastes training resources. Is it possible to find, in an efficient way, the best trade-off? Our work shows that the double descent phenomenon is potentially avoidable with proper conditioning of the learning problem, but a final answer is yet to be found. We empirically observe that there is hope to dodge the double descent in complex scenarios with proper regularization, as a simple â„“2\ell_2 regularization is already positively contributing to such a perspective

    EnD: Entangling and Disentangling deep representations for bias correction

    Get PDF
    Artificial neural networks perform state-of-the-art in an ever-growing number of tasks, and nowadays they are used to solve an incredibly large variety of tasks. There are problems, like the presence of biases in the training data, which question the generalization capability of these models. In this work we propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases. In particular, we insert an "information bottleneck" at a certain point of the deep neural network, where we disentangle the information about the bias, still letting the useful information for the training task forward-propagating in the rest of the model. One big advantage of EnD is that we do not require additional training complexity (like decoders or extra layers in the model), since it is a regularizer directly applied on the trained model. Our experiments show that EnD effectively improves the generalization on unbiased test sets, and it can be effectively applied on real-case scenarios, like removing hidden biases in the COVID-19 detection from radiographic images

    On the Role of Structured Pruning for Neural Network Compression

    Get PDF
    International audienc

    LOss-Based SensiTivity rEgulaRization: towards deep sparse neural networks

    Full text link
    LOBSTER (LOss-Based SensiTivity rEgulaRization) is a method for training neural networks having a sparse topology. Let the sensitivity of a network parameter be the variation of the loss function with respect to the variation of the parameter. Parameters with low sensitivity, i.e. having little impact on the loss when perturbed, are shrunk and then pruned to sparsify the network. Our method allows to train a network from scratch, i.e. without preliminary learning or rewinding. Experiments on multiple architectures and datasets show competitive compression ratios with minimal computational overhead