3 research outputs found

    NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation

    Full text link
    Boosting the task accuracy of tiny neural networks (TNNs) has become a fundamental challenge for enabling the deployment of TNNs on edge devices, which are constrained by strict limitations on memory, computation, bandwidth, and power supply. To this end, we propose a framework called NetDistiller to boost the achievable accuracy of TNNs by treating them as sub-networks of a weight-sharing teacher constructed by expanding the number of channels of the TNN. Specifically, the target TNN model is jointly trained with the weight-sharing teacher model via (1) gradient surgery to tackle the gradient conflicts between them and (2) uncertainty-aware distillation to mitigate the overfitting of the teacher model. Extensive experiments across diverse tasks validate NetDistiller's effectiveness in boosting TNNs' achievable accuracy over state-of-the-art methods. Our code is available at https://github.com/GATECH-EIC/NetDistiller.
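
    The abstract names three ingredients: a weight-sharing teacher obtained by widening the TNN's channels, gradient surgery between the teacher and student objectives, and uncertainty-aware distillation. The PyTorch sketch below only illustrates those ideas; the two-layer model, the PCGrad-style gradient projection, and the entropy-based confidence weighting are assumptions made for illustration and are not taken from the NetDistiller code base.

    # Minimal, illustrative sketch (assumed details, not the paper's implementation):
    # a tiny "student" shares weights with a wider "teacher" (the student uses a
    # slice of the teacher's hidden units), the two losses are reconciled with a
    # PCGrad-style projection, and the distillation term is down-weighted where
    # the teacher is uncertain.
    import torch
    import torch.nn.functional as F
    from torch import nn

    class SharedLinear(nn.Module):
        """One set of weights; the student path uses only the first `sub_dim` hidden units."""
        def __init__(self, in_dim, hidden_dim, out_dim, sub_dim):
            super().__init__()
            self.w1 = nn.Parameter(torch.randn(hidden_dim, in_dim) * 0.02)
            self.w2 = nn.Parameter(torch.randn(out_dim, hidden_dim) * 0.02)
            self.sub_dim = sub_dim

        def forward(self, x, teacher=True):
            if teacher:
                h = F.relu(x @ self.w1.t())
                return h @ self.w2.t()
            # Student path: slice the shared weights down to the tiny width.
            h = F.relu(x @ self.w1[: self.sub_dim].t())
            return h @ self.w2[:, : self.sub_dim].t()

    def surgery(g_student, g_teacher):
        """PCGrad-style projection (assumed variant): if the gradients conflict,
        remove the conflicting component from the student gradient."""
        dot = sum((gs * gt).sum() for gs, gt in zip(g_student, g_teacher))
        if dot < 0:
            sq = sum((gt * gt).sum() for gt in g_teacher) + 1e-12
            g_student = [gs - (dot / sq) * gt for gs, gt in zip(g_student, g_teacher)]
        return g_student

    model = SharedLinear(in_dim=32, hidden_dim=64, out_dim=10, sub_dim=16)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(8, 32)
    y = torch.randint(0, 10, (8,))

    for _ in range(3):
        logits_t = model(x, teacher=True)
        logits_s = model(x, teacher=False)

        # Stand-in for uncertainty-aware distillation: down-weight samples where
        # the teacher's softmax entropy is high (i.e. the teacher is uncertain).
        p_t = logits_t.softmax(dim=-1)
        entropy = -(p_t * p_t.clamp_min(1e-12).log()).sum(-1)
        conf = (1.0 - entropy / torch.log(torch.tensor(10.0))).detach()

        loss_t = F.cross_entropy(logits_t, y)
        kd = F.kl_div(logits_s.log_softmax(-1), p_t.detach(), reduction="none").sum(-1)
        loss_s = F.cross_entropy(logits_s, y) + (conf * kd).mean()

        params = list(model.parameters())
        g_t = torch.autograd.grad(loss_t, params, retain_graph=True)
        g_s = torch.autograd.grad(loss_s, params)
        g_s = surgery(g_s, g_t)

        opt.zero_grad()
        for p, gt, gs in zip(params, g_t, g_s):
            p.grad = gt + gs  # the shared weights receive both objectives
        opt.step()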

    Efficient and Scalable Machine Learning for Distributed Edge Intelligence

    No full text
    In the era of big data and IoT, devices at the edge are becoming increasingly intelligent, and processing data close to its sources is paramount. However, conventional machine learning relies on a centralized framework that collects data from various edge sources and stores it on high-performance cloud servers to support computationally intensive iterative algorithms. As a result, such a framework is infeasible for training on embedded edge devices with limited resources and a tight power budget. This dissertation proposes integrating ideas from machine learning and systems to design efficient and scalable algorithms for distributed training of machine learning models amenable to edge computing with limited hardware and computing resources. The resulting decentralized machine learning framework aims to keep data private, reduce latency, save communication bandwidth, be energy-efficient, and handle streaming data.
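
    As a rough illustration of the decentralized setting the abstract describes, the sketch below trains a linear model on several edge nodes that keep their data local and exchange only model parameters with ring neighbours. The ring topology, mixing rule, learning rate, and toy regression task are assumptions made for illustration; they do not reproduce the dissertation's algorithms.

    # Minimal decentralized-training sketch (assumed setup): local SGD on private
    # data, followed by gossip averaging of parameters with ring neighbours, so no
    # raw data ever leaves a node.
    import numpy as np

    rng = np.random.default_rng(0)
    n_nodes, dim = 4, 5
    true_w = rng.normal(size=dim)

    # Each node holds a private local dataset (linear regression for simplicity).
    local_X = [rng.normal(size=(20, dim)) for _ in range(n_nodes)]
    local_y = [X @ true_w + 0.1 * rng.normal(size=20) for X in local_X]
    w = [np.zeros(dim) for _ in range(n_nodes)]

    lr = 0.05
    for step in range(200):
        # Local gradient step on each node's own data.
        for i in range(n_nodes):
            grad = local_X[i].T @ (local_X[i] @ w[i] - local_y[i]) / len(local_y[i])
            w[i] = w[i] - lr * grad
        # Gossip averaging with ring neighbours: only parameters are communicated.
        w = [(w[i] + w[(i - 1) % n_nodes] + w[(i + 1) % n_nodes]) / 3
             for i in range(n_nodes)]

    print("max deviation from true weights:",
          max(np.linalg.norm(wi - true_w) for wi in w))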

    Fast and Communication-Efficient Algorithm for Distributed Support Vector Machine Training

    No full text