22 research outputs found

    Towards Compute-Optimal Transfer Learning

    Full text link
    The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models, which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements of finetuning or using these models can hinder their widespread use. In this study, we address this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance, which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models increases their compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark, which offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can yield more than a 20% performance improvement in low-compute regimes.
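    The central operation here, zero-shot structured pruning of convolutional filters, can be sketched with standard PyTorch utilities. The sketch below ranks whole output filters by the L1 norm of the pretrained weights alone, with no data or gradients; the ResNet-18 backbone, the L1 criterion, and the 50% ratio are illustrative assumptions, not the paper's exact configuration.

    ```python
    import torch
    from torch.nn.utils import prune
    from torchvision import models

    # Pretrained backbone; stands in for whichever model is being transferred.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    PRUNE_RATIO = 0.5  # illustrative point on the compute/performance trade-off

    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            # Zero-shot: rank output filters (dim=0) by L1 norm (n=1) of the
            # pretrained weights; no forward passes are required.
            prune.ln_structured(module, name="weight", amount=PRUNE_RATIO, n=1, dim=0)
            prune.remove(module, "weight")  # bake the mask into the weight tensor
    ```

    Note that this only zeroes the pruned filters; realizing actual compute savings would additionally require removing them physically, e.g. by rebuilding each layer with fewer output channels.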

    Quantifying lottery tickets under label noise: accuracy, calibration, and complexity

    Full text link
    Pruning deep neural networks is a widely used strategy to alleviate the computational burden in machine learning. Overwhelming empirical evidence suggests that pruned models retain very high accuracy even with a tiny fraction of the parameters. However, relatively little work has gone into characterising the small pruned networks obtained, beyond a measure of their accuracy. In this paper, we use the sparse double descent approach to unambiguously identify and characterise pruned models associated with classification tasks. We observe empirically that, for a given task, iterative magnitude pruning (IMP) tends to converge to networks of comparable sizes even when starting from full networks whose sizes range over orders of magnitude. We analyse the best pruned models in a controlled experimental setup and show that their number of parameters reflects task difficulty and that they are much better than full networks at capturing the true conditional probability distribution of the labels. On real data, we similarly observe that pruned models are less prone to overconfident predictions. Our results suggest that pruned models obtained via IMP not only have advantageous computational properties but also provide a better representation of uncertainty in learning.
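    Iterative magnitude pruning admits a compact description: train, globally remove the smallest-magnitude surviving weights, rewind the rest to their initial values, and repeat. The PyTorch sketch below is a minimal version under assumed names (`train_fn`, a 20% per-round rate); for brevity it re-applies masks only between rounds, whereas a full implementation would also keep pruned weights frozen during training.

    ```python
    import copy
    import torch

    def iterative_magnitude_pruning(model, train_fn, rounds=10, rate=0.2):
        """Minimal IMP-with-rewinding sketch; `train_fn` trains the model to convergence."""
        init = copy.deepcopy(model.state_dict())           # weights to rewind to
        masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
                 if p.dim() > 1}                           # prune weight tensors only
        for _ in range(rounds):
            train_fn(model)
            # Pool surviving weights and find the magnitude below which the
            # smallest `rate` fraction falls (global, layer-agnostic pruning).
            surviving = torch.cat([p.detach()[masks[n].bool()].abs().flatten()
                                   for n, p in model.named_parameters() if n in masks])
            cutoff = torch.quantile(surviving, rate)
            with torch.no_grad():
                for n, p in model.named_parameters():
                    if n in masks:
                        masks[n] *= (p.abs() > cutoff).float()
                        p.copy_(init[n] * masks[n])        # rewind survivors, zero the rest
        return model, masks
    ```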

    Sparse Neural Network Training with In-Time Over-Parameterization

    Get PDF

    Model pruning enables efficient federated learning on edge devices

    Get PDF
    Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources than servers in a data center. To overcome this challenge, we propose PruneFL, a novel FL approach with adaptive and distributed parameter pruning, which adapts the model size during FL to reduce both communication and computation overhead and minimize the overall training time, while maintaining accuracy similar to that of the original model. PruneFL includes initial pruning at a selected client and further pruning as part of the FL process. The model size is adapted during this process by maximizing the approximate empirical risk reduction divided by the time of one FL round. Our experiments with various datasets on edge devices (e.g., Raspberry Pi) show that: 1) we significantly reduce the training time compared to conventional FL and various other pruning-based methods; and 2) the pruned model with automatically determined size converges to an accuracy very similar to that of the original model, and it is also a lottery ticket of the original model.
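    The adaptation step described here, maximizing the approximate empirical risk reduction divided by the time of one FL round, can be approached greedily: rank parameters by benefit per unit cost and grow the kept set while the overall ratio improves. The NumPy sketch below assumes squared-gradient importances and a linear per-parameter time model; `grad_sq`, `time_per_param`, and `base_time` are illustrative names, and this is a reconstruction of the idea, not the paper's exact procedure.

    ```python
    import numpy as np

    def adapt_model_size(grad_sq, time_per_param, base_time):
        """Pick the parameter set S maximizing sum(g_j^2 for j in S) / T(S),
        with the round time modeled as T(S) = base_time + sum of per-parameter costs."""
        order = np.argsort(-grad_sq / time_per_param)  # best benefit-per-cost first
        best_ratio, best_k = 0.0, 0
        gain, cost = 0.0, base_time
        for k, j in enumerate(order, start=1):
            gain += grad_sq[j]
            cost += time_per_param[j]
            if gain / cost > best_ratio:
                best_ratio, best_k = gain / cost, k
        keep = np.zeros(grad_sq.shape, dtype=bool)
        keep[order[:best_k]] = True
        return keep  # boolean mask over parameters to retain
    ```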

    A New Information Technology for Estimating the Benefit of Road Extraction from High-Resolution Satellite Images Based on PCNN and the C-V Model

    Get PDF
    Road extraction from high-resolution satellite images has been an important research topic for the analysis of urban areas. In this paper, road extraction based on PCNN and on the Chan-Vese active contour model are compared. Extracting roads from the original image is difficult and computationally expensive due to the presence of other road-like features with straight edges. The image is pre-processed using a median filter to reduce noise, then road extraction is performed using PCNN and the Chan-Vese active contour model, and nonlinear segments are removed using morphological operations. Finally, the accuracy of the extracted road images is evaluated based on quality measures.
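    The Chan-Vese branch of this pipeline maps directly onto scikit-image primitives: median filtering, Chan-Vese segmentation, then morphological clean-up. The sketch below follows that sequence; the PCNN branch is omitted for lack of a standard library implementation, and the file name and parameter values are illustrative assumptions.

    ```python
    import numpy as np
    from skimage import color, filters, io, morphology, segmentation

    image = color.rgb2gray(io.imread("satellite_tile.png"))  # illustrative input

    # 1. Pre-process with a median filter to suppress noise.
    denoised = filters.median(image, morphology.disk(3))

    # 2. Segment with the Chan-Vese active contour model.
    mask = segmentation.chan_vese(denoised, mu=0.25, max_num_iter=200)

    # 3. Morphological clean-up removes small nonlinear (non-road) segments.
    roads = morphology.opening(mask, morphology.disk(2))
    roads = morphology.remove_small_objects(roads, min_size=500)

    io.imsave("roads_mask.png", roads.astype(np.uint8) * 255)
    ```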