The Design and Implementation of a Scalable DL Benchmarking Platform
The current Deep Learning (DL) landscape is fast-paced and rife with
non-uniform models and hardware/software (HW/SW) stacks, but it lacks a DL
benchmarking platform to facilitate the evaluation and comparison of DL
innovations, whether models, frameworks, libraries, or hardware. Without such a
platform, the current practice of evaluating the benefits of proposed DL
innovations is both arduous and error-prone, stifling the adoption of these
innovations.
In this work, we first identify design features that are desirable
in a DL benchmarking platform. These features include performing
evaluations in a consistent, reproducible, and scalable manner; being framework
and hardware agnostic; supporting real-world benchmarking workloads; and providing
in-depth model execution inspection across the HW/SW stack levels. We then
propose MLModelScope, a DL benchmarking platform design that realizes these
objectives. MLModelScope introduces a specification to define DL model
evaluations, along with techniques to provision the evaluation workflow using the
user-specified HW/SW stack. MLModelScope defines abstractions for frameworks
and supports a broad range of DL models and evaluation scenarios. We implement
MLModelScope as an open-source project with support for all major frameworks
and hardware architectures. Through MLModelScope's evaluation and automated
analysis workflows, we perform case-study analyses of models across
systems and show how model, hardware, and framework selection affects model
accuracy and performance under different benchmarking scenarios. We further
demonstrate how MLModelScope's tracing capability gives a holistic view of
model execution and helps pinpoint bottlenecks.
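
The abstract does not show the specification format itself; as a minimal sketch, the snippet below illustrates the kind of information such an evaluation manifest might capture (model identity, pre/post-processing steps, and the target HW/SW stack). All class and field names here are illustrative assumptions, not MLModelScope's actual schema.

    # Hypothetical sketch of a DL model evaluation specification in the
    # spirit of MLModelScope's manifest idea. Field names are assumptions,
    # not the platform's actual schema.
    from dataclasses import dataclass, field, asdict
    import json

    @dataclass
    class SoftwareStack:
        framework: str          # e.g. "TensorFlow"
        framework_version: str  # pinned for reproducibility
        container: str          # image used to provision the SW stack

    @dataclass
    class HardwareStack:
        architecture: str       # e.g. "x86_64"
        accelerator: str        # e.g. "GPU" or "CPU"

    @dataclass
    class ModelEvaluation:
        name: str
        version: str
        task: str                          # e.g. "image_classification"
        model_url: str                     # where the trained weights live
        preprocessing: list = field(default_factory=list)
        postprocessing: list = field(default_factory=list)
        software: SoftwareStack = None
        hardware: HardwareStack = None

        def to_manifest(self) -> str:
            """Serialize the spec so an evaluation can be shared and re-run."""
            return json.dumps(asdict(self), indent=2)

    if __name__ == "__main__":
        spec = ModelEvaluation(
            name="AlexNet",
            version="1.0",
            task="image_classification",
            model_url="https://example.com/alexnet.pb",  # placeholder URL
            preprocessing=["resize(227,227)", "normalize(mean,std)"],
            postprocessing=["argmax", "top_k(5)"],
            software=SoftwareStack("TensorFlow", "1.15",
                                   "tensorflow/tensorflow:1.15.0-gpu"),
            hardware=HardwareStack("x86_64", "GPU"),
        )
        print(spec.to_manifest())

A real platform would additionally resolve the container image and execute the pre/post-processing pipeline as part of a provisioned workflow; the point of the sketch is only that a machine-readable specification is what makes an evaluation consistent and reproducible.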
DLSpec: A Deep Learning Task Exchange Specification
Deep Learning (DL) innovations are being introduced at a rapid pace. However,
the current lack of a standard specification for DL tasks makes sharing, running,
reproducing, and comparing these innovations difficult. To address this
problem, we propose DLSpec, a model-, dataset-, software-, and
hardware-agnostic DL specification that captures the different aspects of DL
tasks. DLSpec has been tested by specifying and running hundreds of DL tasks.
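
The abstract names four aspects (model, dataset, software, hardware) but does not show the format; as a hypothetical sketch, the snippet below illustrates how keeping the aspects in separate, reference-able specifications could support the agnosticism described. All keys and IDs are illustrative assumptions, not DLSpec's actual format.

    # Hypothetical sketch of a DLSpec-style task: each aspect is specified
    # independently and composed by reference, so the same model spec can be
    # paired with different datasets or HW/SW stacks. Keys are assumptions.
    import json

    hardware_spec = {"id": "hw/gpu-node-1", "cpu": "x86_64", "gpu": "V100"}

    software_spec = {
        "id": "sw/tf-1.15",
        "framework": "TensorFlow==1.15",
        "container": "tensorflow/tensorflow:1.15.0-gpu",
    }

    dataset_spec = {
        "id": "data/imagenet-val",
        "source": "https://example.com/imagenet-val.tar",  # placeholder URL
        "preprocessing": ["resize(224,224)", "normalize"],
    }

    model_spec = {
        "id": "model/resnet50-v1",
        "weights": "https://example.com/resnet50.pb",  # placeholder URL
        "inputs": ["image"],
        "outputs": ["probabilities"],
    }

    # A task binds one spec of each kind; swapping any single reference
    # changes the run without touching the other three aspects.
    task = {
        "hardware": hardware_spec["id"],
        "software": software_spec["id"],
        "dataset": dataset_spec["id"],
        "model": model_spec["id"],
    }

    if __name__ == "__main__":
        print(json.dumps(task, indent=2))

Decoupling the four aspects in this way is one plausible reading of "model-, dataset-, software-, and hardware-agnostic": no single specification hard-codes the others, which is what makes a task easy to share, reproduce, and compare.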