COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of Convolutional Neural Networks
Pruning refers to the elimination of unimportant weights from neural networks.
The sub-networks found within an overparameterized model after pruning are
often called lottery tickets. This research aims to generate winning lottery
tickets, from a set of lottery tickets, that can achieve accuracy similar to
that of the original unpruned network. We introduce a novel winning ticket, the
Cyclic Overlapping Lottery Ticket (COLT), obtained by splitting the data and
cyclically retraining the pruned network from scratch. We apply a cyclic
pruning algorithm that keeps only the weights shared by the different pruned
models trained on different data segments. Our results demonstrate that COLT
can match the accuracy of the unpruned model while maintaining high sparsity.
We show that the accuracy of a COLT is on par with, and at times better than,
the winning tickets of the Lottery Ticket Hypothesis (LTH). Moreover, COLTs can
be generated in fewer iterations than tickets produced by the popular Iterative
Magnitude Pruning (IMP) method. In addition, we observe that COLTs generated on
large datasets can be transferred to smaller ones without compromising
performance, demonstrating their generalization capability. We conduct all our
experiments on the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets and report
performance superior to state-of-the-art methods.
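The core cyclic-overlap step can be illustrated with a short sketch. The
PyTorch fragment below is a minimal, hypothetical rendering of the idea
described above (train a copy of the network on each data split, prune each by
weight magnitude, and keep only the weights that survive in every split); the
function names, the magnitude criterion, and the `train_fn` callback are
assumptions for illustration, not the authors' exact implementation.

```python
# Minimal sketch of the cyclic overlapping-mask idea (assumed details).
import copy
import torch.nn as nn

def magnitude_mask(model: nn.Module, sparsity: float) -> dict:
    """Binary mask keeping the largest-magnitude weights in each layer."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:  # prune weight tensors; leave biases alone
            keep = int(p.numel() * (1 - sparsity))
            # threshold = the (numel - keep)-th smallest |w|; above it = top-keep
            thresh = p.detach().abs().flatten().kthvalue(p.numel() - keep).values
            masks[name] = (p.detach().abs() > thresh).float()
    return masks

def colt_round(model_init: nn.Module, data_splits, train_fn, sparsity: float):
    """One cyclic round: train a copy on each split, prune by magnitude,
    then intersect the masks so only overlapping weights survive."""
    overlap = None
    for split in data_splits:
        model = copy.deepcopy(model_init)  # retrain from the same init
        train_fn(model, split)             # assumed user-supplied training loop
        masks = magnitude_mask(model, sparsity)
        if overlap is None:
            overlap = masks
        else:
            # elementwise AND: a weight survives only if it survives everywhere
            overlap = {n: overlap[n] * masks[n] for n in overlap}
    # apply `overlap` to model_init and repeat the cycle for higher sparsity
    return overlap
```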
LumiNet: The Bright Side of Perceptual Knowledge Distillation
In the knowledge distillation literature, feature-based methods have dominated
due to their ability to effectively tap into extensive teacher models. In
contrast, logit-based approaches, which aim to distill 'dark knowledge' from
teachers, typically exhibit inferior performance compared to feature-based
methods. To bridge this gap, we present LumiNet, a novel knowledge distillation
algorithm designed to enhance logit-based distillation. We introduce the
concept of 'perception', which calibrates logits based on the model's
representation capability. This concept addresses the overconfidence issue in
logit-based distillation methods while also introducing a novel way to distill
knowledge from the teacher: it reconstructs the logits of an instance by
considering its relationships with the other samples in the batch. LumiNet
excels on benchmarks such as CIFAR-100, ImageNet, and MSCOCO, outperforming
leading feature-based methods; for example, compared to KD with ResNet18 and
MobileNetV2 on ImageNet, it shows improvements of 1.5% and 2.05%,
respectively.
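As a rough illustration of the 'perception' idea, the sketch below
recalibrates each instance's logits against per-class batch statistics before
the usual temperature-scaled KL distillation loss. The specific normalization
(a per-class z-score across the batch) and all function names are assumptions
made for this sketch, not the paper's exact formulation.

```python
# Hedged sketch: batch-calibrated ("perception") logit distillation.
import torch
import torch.nn.functional as F

def perception(logits: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Standardize each class's logit across the batch, so an instance's
    logits reflect its relationship to the other samples in the batch."""
    mean = logits.mean(dim=0, keepdim=True)  # per-class batch mean
    std = logits.std(dim=0, keepdim=True)    # per-class batch spread
    return (logits - mean) / (std + eps)

def perception_kd_loss(student_logits, teacher_logits, T: float = 4.0):
    """Temperature-scaled KL divergence on perception-calibrated logits."""
    s = F.log_softmax(perception(student_logits) / T, dim=1)
    t = F.softmax(perception(teacher_logits) / T, dim=1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)
```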