Machine learning (ML) software has become integral to various aspects of daily life, leveraging
complex models such as deep neural networks (DNNs) that entail significant computational
costs, especially during inference. This poses a challenge for the deployment of ML software
on resource-limited embedded devices and raises environmental concerns due to high energy
consumption. Addressing these challenges requires improving the efficiency of the ML
software.
ML software consists of three core components: data, model, and program. This dissertation
investigates efficiency optimization for these components. Existing research primarily focuses
on model and program optimization but overlooks the critical role of data. Additionally, the
robustness of model-level optimizations and the compatibility of program-level optimizations
with model-level ones remain underexplored.
This dissertation aims to address these gaps. At the data level, it examines how training
data impacts ML software efficiency and proposes techniques to mitigate efficiency vulnerabilities introduced by adversarial data. At the model level, it explores the robustness of
dynamic neural networks (DyNNs) to efficiency degradation and presents methods to enhance
their inference efficiency. At the program level, it introduces a novel approach to bridge
program-level and model-level optimizations, ensuring comprehensive efficiency improvements.
Moreover, it analyzes model leakage in the model acceleration process.
The contributions of this dissertation are threefold:

Data-level: This research evaluates the impact of training data on ML model efficiency, identifying efficiency backdoor vulnerabilities in DyNNs and proposing strategies to defend against them.

Model-level: It examines computational efficiency vulnerabilities in DyNN architectures, developing tools such as NMTSloth and DeepPerform to test and mitigate these vulnerabilities.

Program-level: This dissertation introduces a program rewriting approach, DyCL, designed to adapt existing DL compilers to DyNNs, significantly enhancing inference speed. Additionally, it proposes an automatic method, NNReverse, which infers the semantics of the compiled binary to reconstruct the DNN model, thereby quantifying model leakage risks.
Overall, this dissertation provides a comprehensive framework for optimizing ML software efficiency, integrating data-, model-, and program-level approaches.