22 research outputs found

    Towards Compute-Optimal Transfer Learning

    Full text link
    The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models, which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements of finetuning or using these models can hinder their widespread use. In this study, we address this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance, which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models increases their compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark, which offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can yield more than a 20% performance improvement in low-compute regimes.
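    The central operation here, zero-shot structured pruning of convolutional filters, can be sketched with standard PyTorch utilities. The sketch below ranks whole output filters by the L1 norm of the pretrained weights alone, with no data or gradients; the ResNet-18 backbone, the L1 criterion, and the 50% ratio are illustrative assumptions, not the paper's exact configuration.

    ```python
    import torch
    from torch.nn.utils import prune
    from torchvision import models

    # Pretrained backbone; stands in for whichever model is being transferred.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    PRUNE_RATIO = 0.5  # illustrative point on the compute/performance trade-off

    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            # Zero-shot: rank output filters (dim=0) by L1 norm (n=1) of the
            # pretrained weights; no forward passes are required.
            prune.ln_structured(module, name="weight", amount=PRUNE_RATIO, n=1, dim=0)
            prune.remove(module, "weight")  # bake the mask into the weight tensor
    ```

    Note that this only zeroes the pruned filters; realizing actual compute savings would additionally require removing them physically, e.g. by rebuilding each layer with fewer output channels.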

    Quantifying lottery tickets under label noise: accuracy, calibration, and complexity

    Full text link
    Pruning deep neural networks is a widely used strategy to alleviate the computational burden in machine learning. Overwhelming empirical evidence suggests that pruned models retain very high accuracy even with a tiny fraction of the parameters. However, relatively little work has gone into characterising the small pruned networks obtained, beyond a measure of their accuracy. In this paper, we use the sparse double descent approach to unambiguously identify and characterise pruned models associated with classification tasks. We observe empirically that, for a given task, iterative magnitude pruning (IMP) tends to converge to networks of comparable sizes even when starting from full networks whose sizes range over orders of magnitude. We analyse the best pruned models in a controlled experimental setup and show that their number of parameters reflects task difficulty and that they are much better than full networks at capturing the true conditional probability distribution of the labels. On real data, we similarly observe that pruned models are less prone to overconfident predictions. Our results suggest that pruned models obtained via IMP not only have advantageous computational properties but also provide a better representation of uncertainty in learning.
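    Iterative magnitude pruning admits a compact description: train, globally remove the smallest-magnitude surviving weights, rewind the rest to their initial values, and repeat. The PyTorch sketch below is a minimal version under assumed names (`train_fn`, a 20% per-round rate); for brevity it re-applies masks only between rounds, whereas a full implementation would also keep pruned weights frozen during training.

    ```python
    import copy
    import torch

    def iterative_magnitude_pruning(model, train_fn, rounds=10, rate=0.2):
        """Minimal IMP-with-rewinding sketch; `train_fn` trains the model to convergence."""
        init = copy.deepcopy(model.state_dict())           # weights to rewind to
        masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
                 if p.dim() > 1}                           # prune weight tensors only
        for _ in range(rounds):
            train_fn(model)
            # Pool surviving weights and find the magnitude below which the
            # smallest `rate` fraction falls (global, layer-agnostic pruning).
            surviving = torch.cat([p.detach()[masks[n].bool()].abs().flatten()
                                   for n, p in model.named_parameters() if n in masks])
            cutoff = torch.quantile(surviving, rate)
            with torch.no_grad():
                for n, p in model.named_parameters():
                    if n in masks:
                        masks[n] *= (p.abs() > cutoff).float()
                        p.copy_(init[n] * masks[n])        # rewind survivors, zero the rest
        return model, masks
    ```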

    Sparse Neural Network Training with In-Time Over-Parameterization

    Get PDF

    Model pruning enables efficient federated learning on edge devices

    Get PDF
    Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources than servers in a data center. To overcome this challenge, we propose PruneFL, a novel FL approach with adaptive and distributed parameter pruning, which adapts the model size during FL to reduce both communication and computation overhead and minimize the overall training time, while maintaining accuracy similar to that of the original model. PruneFL includes initial pruning at a selected client and further pruning as part of the FL process. The model size is adapted during this process by maximizing the approximate empirical risk reduction divided by the time of one FL round. Our experiments with various datasets on edge devices (e.g., Raspberry Pi) show that: 1) we significantly reduce the training time compared to conventional FL and various other pruning-based methods; and 2) the pruned model with automatically determined size converges to an accuracy very similar to that of the original model, and it is also a lottery ticket of the original model.
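    The adaptation step described here, maximizing the approximate empirical risk reduction divided by the time of one FL round, can be approached greedily: rank parameters by benefit per unit cost and grow the kept set while the overall ratio improves. The NumPy sketch below assumes squared-gradient importances and a linear per-parameter time model; `grad_sq`, `time_per_param`, and `base_time` are illustrative names, and this is a reconstruction of the idea, not the paper's exact procedure.

    ```python
    import numpy as np

    def adapt_model_size(grad_sq, time_per_param, base_time):
        """Pick the parameter set S maximizing sum(g_j^2 for j in S) / T(S),
        with the round time modeled as T(S) = base_time + sum of per-parameter costs."""
        order = np.argsort(-grad_sq / time_per_param)  # best benefit-per-cost first
        best_ratio, best_k = 0.0, 0
        gain, cost = 0.0, base_time
        for k, j in enumerate(order, start=1):
            gain += grad_sq[j]
            cost += time_per_param[j]
            if gain / cost > best_ratio:
                best_ratio, best_k = gain / cost, k
        keep = np.zeros(grad_sq.shape, dtype=bool)
        keep[order[:best_k]] = True
        return keep  # boolean mask over parameters to retain
    ```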

    A New Information Technology for Estimating the Benefit of Road Extraction from High-Resolution Satellite Images Based on PCNN and the C-V Model

    Get PDF
    Road extraction from high-resolution satellite images has been an important research topic for the analysis of urban areas. In this paper, road extraction based on PCNN and on the Chan-Vese active contour model are compared. Extracting roads from the original image is difficult and computationally expensive due to the presence of other road-like features with straight edges. The image is pre-processed using a median filter to reduce noise, then road extraction is performed using PCNN and the Chan-Vese active contour model, and nonlinear segments are removed using morphological operations. Finally, the accuracy of the extracted road images is evaluated based on quality measures.
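    The Chan-Vese branch of this pipeline maps directly onto scikit-image primitives: median filtering, Chan-Vese segmentation, then morphological clean-up. The sketch below follows that sequence; the PCNN branch is omitted for lack of a standard library implementation, and the file name and parameter values are illustrative assumptions.

    ```python
    import numpy as np
    from skimage import color, filters, io, morphology, segmentation

    image = color.rgb2gray(io.imread("satellite_tile.png"))  # illustrative input

    # 1. Pre-process with a median filter to suppress noise.
    denoised = filters.median(image, morphology.disk(3))

    # 2. Segment with the Chan-Vese active contour model.
    mask = segmentation.chan_vese(denoised, mu=0.25, max_num_iter=200)

    # 3. Morphological clean-up removes small nonlinear (non-road) segments.
    roads = morphology.opening(mask, morphology.disk(2))
    roads = morphology.remove_small_objects(roads, min_size=500)

    io.imsave("roads_mask.png", roads.astype(np.uint8) * 255)
    ```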