Inference Time Optimization Using BranchyNet Partitioning
Deep Neural Network (DNN) applications with edge computing present a
trade-off between responsiveness and computational resources. On one hand, edge
computing can provide high responsiveness by deploying computational resources
close to end devices, something that may be prohibitive for most cloud
computing services. On the other hand, DNN inference requires computational
power to be executed, which may not be available on edge devices, but a cloud
server can provide it. To resolve this trade-off, we partition a DNN
between the edge device and the cloud server: the first DNN layers are
processed at the edge and the remaining layers in the cloud. This paper proposes an
optimal partitioning of the DNN according to network bandwidth, the computational
resources of the edge and the cloud, and parameters inherent to the data. Our proposal aims
to minimize inference time in order to support highly responsive applications. To
this end, we show the equivalence between the DNN partitioning problem and the shortest
path problem, which lets us find an optimal solution using Dijkstra's algorithm.
Comment: 8 pages, 11 figures, IEEE Symposium on Computers and Communications
202
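
The reduction to a shortest path problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: it assumes a plain chain of layers, ignores BranchyNet side branches and the cost of returning the result, and every name in it (build_graph, optimal_partition, edge_times, cloud_times, transfer_sizes, bandwidth) is hypothetical. Each path from the source to the sink corresponds to one choice of cut point, so the shortest path gives the partition with the lowest inference time under these assumptions.

```python
import heapq

SINK = ("sink", 0)  # hypothetical terminal node shared by both tiers


def build_graph(edge_times, cloud_times, transfer_sizes, bandwidth):
    """Build a weighted DAG whose source-to-sink paths are partitions.

    Assumptions (not from the paper): edge_times[i] and cloud_times[i] are
    the processing times of layer i on each tier, and transfer_sizes[i] is
    the amount of data uploaded when the cut is placed just before layer i.
    """
    n = len(edge_times)
    graph = {}
    for i in range(n):
        # Run layer i on the edge device.
        graph.setdefault(("edge", i), []).append((("edge", i + 1), edge_times[i]))
        # Upload the input of layer i and continue in the cloud.
        upload = transfer_sizes[i] / bandwidth
        graph.setdefault(("edge", i), []).append((("cloud", i), upload))
        # Run layer i in the cloud.
        graph.setdefault(("cloud", i), []).append((("cloud", i + 1), cloud_times[i]))
    graph[("edge", n)] = [(SINK, 0.0)]
    graph[("cloud", n)] = [(SINK, 0.0)]
    return graph


def dijkstra(graph, source, target):
    """Standard Dijkstra with a binary heap; returns (distance, predecessors)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return dist[target], prev


def optimal_partition(edge_times, cloud_times, transfer_sizes, bandwidth):
    """Return (cut, time): the first `cut` layers run at the edge."""
    graph = build_graph(edge_times, cloud_times, transfer_sizes, bandwidth)
    total, prev = dijkstra(graph, ("edge", 0), SINK)
    # Walk the path backwards; the largest edge index visited is the cut.
    cut, node = 0, SINK
    while node in prev:
        node = prev[node]
        if node[0] == "edge":
            cut = max(cut, node[1])
    return cut, total
```

For example, optimal_partition([4, 4, 4], [1, 1, 1], [10, 2, 8], 1) returns (1, 8.0): running only the first layer at the edge is cheapest because its output is much smaller than the raw input. With a very low bandwidth the same call keeps all three layers at the edge.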