Serving deep learning models in a serverless platform

Ishakian, Vatche; Muthusamy, Vinod; Slominski, Aleksander

Authors: Vatche Ishakian, Vinod Muthusamy, Aleksander Slominski
Publication date: 9 February 2018

Abstract

Serverless computing has emerged as a compelling paradigm for the development and deployment of a wide range of event-based cloud applications. At the same time, cloud providers and enterprises are heavily adopting machine learning and artificial intelligence, either to differentiate themselves or to provide their customers with value-added services. In this work we evaluate the suitability of a serverless computing environment for inference with large neural network models. Our experiments are run on AWS Lambda using the MXNet deep learning framework. Our results show that while the inference latency can be within an acceptable range, longer delays due to cold starts can skew the latency distribution and hence risk violating more stringent SLAs.
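
The effect of cold starts on the latency distribution can be illustrated with a small sketch. The numbers below (200 ms warm latency, 5000 ms cold latency, 2% of requests hitting a cold start, a 1000 ms SLA threshold) are hypothetical values chosen for illustration and are not measurements from the paper; the helper names `percentile` and `summarize` are likewise assumptions.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of a list of numbers, for integer p in 1..100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(warm_ms, cold_ms, n_requests, cold_fraction):
    """Build a synthetic latency sample and report mean, median, and p99.

    All inputs are hypothetical; a real evaluation would use measured latencies.
    """
    n_cold = round(n_requests * cold_fraction)
    samples = [cold_ms] * n_cold + [warm_ms] * (n_requests - n_cold)
    return {
        "mean": sum(samples) / len(samples),
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
    }


# Hypothetical workload: 2% of 1000 requests incur a cold start.
stats = summarize(warm_ms=200.0, cold_ms=5000.0, n_requests=1000, cold_fraction=0.02)

SLA_MS = 1000.0  # hypothetical SLA threshold
median_ok = stats["p50"] <= SLA_MS  # the typical request meets the threshold
tail_ok = stats["p99"] <= SLA_MS    # the 99th percentile does not
print(stats, median_ok, tail_ok)
```

Under these assumed numbers the median stays at the warm latency, while the 99th percentile lands on the cold-start latency. An SLA stated on the median would be met, whereas one stated on the tail would not, which is the sense in which a small share of cold starts can violate a stricter SLA.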

Crossref: info:doi/10.1109%2Fic2e.2018.0...

Last time updated on 10/08/2021