Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks
In this paper, we present a fine-grained, machine learning-based method,
PerfNetV2, which improves the accuracy of our previous work for modeling the
neural network performance on a variety of GPU accelerators. Given an
application, the proposed method can be used to predict the inference time and
training time of the convolutional neural networks used in the application,
which enables the system developer to optimize the performance by choosing the
neural networks and/or incorporating the hardware accelerators to deliver
satisfactory results in time. Furthermore, the proposed method is capable of
predicting the performance of an unseen or non-existing device, e.g., a new GPU
with a higher operating frequency and fewer processor cores but more
memory capacity. This allows a system developer to quickly search the hardware
design space and/or fine-tune the system configuration. Compared to
previous work, PerfNetV2 delivers more accurate results by modeling detailed
host-accelerator interactions in executing the full neural networks and
improving the architecture of the machine learning model used in the predictor.
Our case studies show that PerfNetV2 yields a mean absolute percentage error
within 13.1% on LeNet, AlexNet, and VGG16 on NVIDIA GTX-1080Ti, whereas the error
rate of a previous work published in ICBD 2018 could be as large as 200%.
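
For readers unfamiliar with the metric, the following is a minimal sketch of how a mean absolute percentage error (MAPE) between measured and predicted execution times is commonly computed. The function name `mape` and the sample timings are hypothetical and are not taken from the paper; the abstract does not state the exact evaluation procedure, so this only illustrates the standard definition.

```python
def mape(measured, predicted):
    """Return the mean absolute percentage error, in percent.

    Standard definition: the mean of |predicted - measured| / |measured|,
    multiplied by 100. Measured values must be non-zero.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical execution times in milliseconds (illustrative only).
measured_ms = [10.0, 20.0, 40.0]
predicted_ms = [11.0, 18.0, 40.0]

error = mape(measured_ms, predicted_ms)
print(f"MAPE: {error:.2f}%")
```

Under this definition, a MAPE of 13.1% means the predicted times deviate from the measured times by 13.1% on average, while an error of 200% means a prediction differs from the measurement by twice the measured value.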