Towards Compute-Optimal Transfer Learning
The field of transfer learning is undergoing a significant shift with the
introduction of large pretrained models which have demonstrated strong
adaptability to a variety of downstream tasks. However, the high computational
and memory requirements to finetune or use these models can be a hindrance to
their widespread use. In this study, we present a solution to this issue by
proposing a simple yet effective way to trade computational efficiency for
asymptotic performance which we define as the performance a learning algorithm
achieves as compute tends to infinity. Specifically, we argue that zero-shot
structured pruning of pretrained models increases their compute efficiency
with minimal reduction in performance. We evaluate our method on the
Nevis'22 continual learning benchmark that offers a diverse set of transfer
scenarios. Our results show that pruning convolutional filters of pretrained
models can lead to more than 20% performance improvement in low computational
regimes.
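The abstract does not spell out the pruning criterion; a common choice for zero-shot structured pruning of convolutional layers is ranking filters by L1 norm and discarding the weakest, which requires no data or retraining. The sketch below illustrates that idea on a hypothetical weight tensor (the shapes and `keep_ratio` are illustrative, not from the paper):

```python
import numpy as np

def prune_filters_l1(weight, keep_ratio=0.5):
    """Zero-shot structured pruning sketch: rank convolutional filters by
    their L1 norm and keep only the strongest `keep_ratio` fraction.
    `weight` has shape (out_channels, in_channels, k, k); no data or
    retraining is used, hence "zero-shot"."""
    n_filters = weight.shape[0]
    n_keep = max(1, int(n_filters * keep_ratio))
    # L1 norm of each output filter over all of its weights
    scores = np.abs(weight).reshape(n_filters, -1).sum(axis=1)
    keep = np.sort(np.argsort(scores)[-n_keep:])  # indices of kept filters
    return weight[keep], keep

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))           # toy conv layer: 8 filters
pruned, kept = prune_filters_l1(w, keep_ratio=0.25)
print(pruned.shape)  # (2, 3, 3, 3)
```

Dropping whole filters (rather than individual weights) shrinks the layer's actual output width, which is what makes the compute savings realizable on standard hardware.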
Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
Pruning deep neural networks is a widely used strategy to alleviate the
computational burden in machine learning. Overwhelming empirical evidence
suggests that pruned models retain very high accuracy even with a tiny fraction
of parameters. However, relatively little work has gone into characterising the
small pruned networks obtained, beyond a measure of their accuracy. In this
paper, we use the sparse double descent approach to unambiguously identify and
characterise pruned models associated with classification tasks. We observe
empirically that, for a given task, iterative magnitude pruning (IMP) tends to
converge to networks of comparable sizes even when starting from full networks
with sizes ranging over orders of magnitude. We analyse the best pruned models
in a controlled experimental setup and show that their number of parameters
reflects task difficulty and that they are much better than full networks at
capturing the true conditional probability distribution of the labels. On real
data, we similarly observe that pruned models are less prone to overconfident
predictions. Our results suggest that pruned models obtained via IMP not only
have advantageous computational properties but also provide a better
representation of uncertainty in learning.
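The core loop of iterative magnitude pruning can be sketched as repeated global magnitude thresholding; the sketch below shows only the masking step, with toy arrays standing in for layer weights (in the full IMP procedure the network is retrained, or rewound to early weights, between rounds):

```python
import numpy as np

def imp_round(weights, masks, prune_frac=0.2):
    """One round of iterative magnitude pruning (IMP), sketched: among
    currently surviving weights, zero out the smallest-magnitude
    `prune_frac` fraction globally across all layers."""
    # Magnitudes of surviving weights, pooled over layers
    alive = np.concatenate([np.abs(w[m]) for w, m in zip(weights, masks)])
    threshold = np.quantile(alive, prune_frac)
    # Keep a weight only if it survived before AND exceeds the threshold
    return [m & (np.abs(w) > threshold) for w, m in zip(weights, masks)]

rng = np.random.default_rng(1)
weights = [rng.normal(size=(100,)), rng.normal(size=(50,))]
masks = [np.ones_like(w, dtype=bool) for w in weights]
for _ in range(5):          # 5 rounds at 20% each: ~0.8^5 ≈ 33% density
    masks = imp_round(weights, masks)
density = sum(int(m.sum()) for m in masks) / 150
print(round(density, 2))
```

Because the threshold is computed globally, layers with many small weights are pruned harder than layers whose weights are uniformly large, which is one reason IMP can converge to similar final sizes from very different starting widths.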
Model pruning enables efficient federated learning on edge devices
Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources compared to servers in a data center. To overcome this challenge, we propose PruneFL, a novel FL approach with adaptive and distributed parameter pruning, which adapts the model size during FL to reduce both communication and computation overhead and minimize the overall training time, while maintaining accuracy similar to that of the original model. PruneFL includes initial pruning at a selected client and further pruning as part of the FL process. The model size is adapted during this process, which includes maximizing the approximate empirical risk reduction divided by the time of one FL round. Our experiments with various datasets on edge devices (e.g., Raspberry Pi) show that: 1) we significantly reduce the training time compared to conventional FL and various other pruning-based methods, and 2) the pruned model with automatically determined size converges to an accuracy that is very similar to the original model, and it is also a lottery ticket of the original model.
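The size-adaptation objective described above, maximizing approximate empirical risk reduction divided by the time of one FL round, can be sketched as a simple argmax over candidate pruned sizes. The candidate numbers below are entirely hypothetical, chosen only to illustrate the trade-off:

```python
def choose_model_size(candidates):
    """Toy sketch of PruneFL's size-adaptation rule: among candidate
    pruned model sizes, pick the one maximizing
    (approximate empirical risk reduction) / (time of one FL round)."""
    return max(candidates, key=lambda c: c["risk_reduction"] / c["round_time"])

# Hypothetical candidates: parameter count, estimated risk reduction,
# and estimated round time in seconds (larger models learn more per
# round but each round costs more on a constrained edge device).
candidates = [
    {"params": 1_000_000, "risk_reduction": 0.10, "round_time": 8.0},
    {"params": 400_000,   "risk_reduction": 0.08, "round_time": 3.0},
    {"params": 100_000,   "risk_reduction": 0.05, "round_time": 1.5},
]
best = choose_model_size(candidates)
print(best["params"])  # 100000
```

The rule naturally favors smaller models on slow clients: even a modest per-round gain is worth more when rounds complete quickly.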
A new information technology for assessing the benefits of road extraction from high-resolution satellite imagery based on PCNN and the C-V model
Road extraction from high-resolution satellite images has been an important research topic for the analysis of urban areas. In this paper, road extraction based on PCNN and the Chan-Vese active contour model are compared. It is difficult and computationally expensive to extract roads from the original image due to the presence of other road-like features with straight edges. The image is pre-processed using a median filter to reduce noise. Road extraction is then performed using PCNN and the Chan-Vese active contour model. Nonlinear segments are removed using morphological operations. Finally, the accuracy of the road-extracted images is evaluated based on quality measures.
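The median-filter pre-processing step mentioned above is standard: each pixel is replaced by the median of its neighbourhood, which suppresses impulse ("salt-and-pepper") noise while preserving edges better than mean filtering. A minimal 3x3 sketch, with a toy image standing in for the satellite data (window size and padding mode are assumptions, not from the paper):

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter sketch: each pixel becomes the median of its
    3x3 neighbourhood, with edge pixels replicated at the border."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

# A single salt-noise pixel in a flat region is removed entirely,
# since it is outvoted by its 8 zero-valued neighbours.
img = np.zeros((5, 5))
img[2, 2] = 255.0
print(median_filter3(img).max())  # 0.0
```

In practice one would use a vectorized library routine for full-resolution imagery; the explicit loops here are only to make the neighbourhood logic visible.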