Hardware-Aware Machine Learning: Modeling and Optimization
Recent breakthroughs in Deep Learning (DL) applications have made DL models a
key component in almost every modern computing system. The increased popularity
of DL applications deployed across a wide spectrum of platforms has resulted in a
plethora of design challenges related to the constraints introduced by the
hardware itself. What is the latency or energy cost for an inference made by a
Deep Neural Network (DNN)? Is it possible to predict this latency or energy
consumption before a model is trained? If yes, how can machine learners take
advantage of these models to design the hardware-optimal DNN for deployment?
From lengthening battery life of mobile devices to reducing the runtime
requirements of DL models executing in the cloud, the answers to these
questions have drawn significant attention.
One cannot optimize what isn't properly modeled. Therefore, it is important
to understand the hardware efficiency of DL models when serving inferences,
before even training the model. This key observation has motivated
the use of predictive models to capture the hardware performance or energy
efficiency of DL applications. Furthermore, DL practitioners are challenged
with the task of designing the DNN model, i.e., of tuning the hyper-parameters
of the DNN architecture, while optimizing for both accuracy of the DL model and
its hardware efficiency. Therefore, state-of-the-art methodologies have
proposed hardware-aware hyper-parameter optimization techniques. In this paper,
we provide a comprehensive assessment of state-of-the-art work and selected
results on the hardware-aware modeling and optimization for DL applications. We
also highlight several open questions that are poised to give rise to novel
hardware-aware designs in the next few years, as DL applications continue to
significantly impact associated hardware systems and platforms.
Comment: ICCAD'18 Invited Paper
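The latency-prediction question raised above can be illustrated with a minimal sketch: a linear cost model over per-layer FLOP counts, of the kind fit offline from profiled measurements, estimates inference latency before the model is ever trained. All layer shapes and coefficients below are hypothetical stand-ins, not values from the paper.

```python
# Minimal sketch of a hardware-aware latency predictor for a DNN:
# a linear model over per-layer FLOP counts. The per-platform
# coefficients are hypothetical stand-ins for values that would be
# fit from profiled measurements on real hardware.

def conv_flops(c_in, c_out, k, h, w):
    """Multiply-accumulate FLOPs for one k x k conv producing an h x w output."""
    return 2 * c_in * c_out * k * k * h * w

# Hypothetical cost model per platform: latency_ms ~ a * GFLOPs + b.
PLATFORMS = {
    "mobile_cpu": (45.0, 8.0),  # slope (ms/GFLOP), intercept (ms)
    "edge_gpu":   (3.5, 1.2),
}

def predict_latency_ms(layers, platform):
    """Predict inference latency from layer shapes, before training."""
    gflops = sum(conv_flops(*layer) for layer in layers) / 1e9
    a, b = PLATFORMS[platform]
    return a * gflops + b

# Example: three conv layers of a small CNN (shapes are illustrative).
layers = [(3, 32, 3, 112, 112), (32, 64, 3, 56, 56), (64, 128, 3, 28, 28)]
for platform in PLATFORMS:
    print(platform, round(predict_latency_ms(layers, platform), 2))
```

The same template extends to energy by fitting a second set of coefficients against measured joules rather than milliseconds.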
A Fog Robotics Approach to Deep Robot Learning: Application to Object Recognition and Grasp Planning in Surface Decluttering
The growing demand for industrial, automotive and service robots presents a
challenge to the centralized Cloud Robotics model in terms of privacy,
security, latency, bandwidth, and reliability. In this paper, we present a `Fog
Robotics' approach to deep robot learning that distributes compute, storage and
networking resources between the Cloud and the Edge in a federated manner. Deep
models are trained on non-private (public) synthetic images in the Cloud; the
models are adapted to the private real images of the environment at the Edge
within a trusted network and subsequently, deployed as a service for
low-latency and secure inference/prediction for other robots in the network. We
apply this approach to surface decluttering, where a mobile robot picks and
sorts objects from a cluttered floor by learning a deep object recognition and
a grasp planning model. Experiments suggest that Fog Robotics can improve
performance by sim-to-real domain adaptation in comparison to exclusively using
Cloud or Edge resources, while reducing the inference cycle time by 4x to
successfully declutter 86% of objects over 213 attempts.
Comment: IEEE International Conference on Robotics and Automation, ICRA, 201
Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing
With the breakthroughs in deep learning, the recent years have witnessed a
booming of artificial intelligence (AI) applications and services, spanning
from personal assistant to recommendation systems to video/audio surveillance.
More recently, with the proliferation of mobile computing and
Internet-of-Things (IoT), billions of mobile and IoT devices are connected to
the Internet, generating zillions of bytes of data at the network edge. Driven by
this trend, there is an urgent need to push the AI frontiers to the network
edge so as to fully unleash the potential of the edge big data. To meet this
demand, edge computing, an emerging paradigm that pushes computing tasks and
services from the network core to the network edge, has been widely recognized
as a promising solution. The resulting new interdisciplinary field, edge AI or
edge intelligence, is beginning to receive a tremendous amount of interest. However,
research on edge intelligence is still in its infancy stage, and a dedicated
venue for exchanging the recent advances of edge intelligence is highly desired
by both the computer system and artificial intelligence communities. To this
end, we conduct a comprehensive survey of the recent research efforts on edge
intelligence. Specifically, we first review the background and motivation for
artificial intelligence running at the network edge. We then provide an
overview of the overarching architectures, frameworks and emerging key
technologies for deep learning models, towards training and inference at the
network edge. Finally, we discuss future research opportunities on edge intelligence.
We believe that this survey will attract escalating attention, stimulate
fruitful discussions and inspire further research ideas on edge intelligence.
Comment: Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang,
"Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge
Computing," Proceedings of the IEEE
Serving deep learning models in a serverless platform
Serverless computing has emerged as a compelling paradigm for the development
and deployment of a wide range of event-based cloud applications. At the same
time, cloud providers and enterprise companies are heavily adopting machine
learning and Artificial Intelligence to either differentiate themselves, or
provide their customers with value added services. In this work we evaluate the
suitability of a serverless computing environment for the inferencing of large
neural network models. Our experimental evaluations are executed on the AWS
Lambda environment using the MXNet deep learning framework. Our experimental
results show that while the inferencing latency can be within an acceptable
range, longer delays due to cold starts can skew the latency distribution and
hence risk violating more stringent SLAs.
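The cold-start effect described above can be sketched with a toy simulation: even a small fraction of cold invocations leaves the median latency untouched while inflating the tail far past a plausible SLA. All latency figures here are hypothetical, not AWS Lambda measurements.

```python
import random

random.seed(0)

# Hypothetical latencies (ms): a warm inference call vs. a cold start
# that must also load the runtime container and the model into memory.
WARM_MS, COLD_MS, COLD_RATE = 120.0, 3500.0, 0.03

def invoke():
    """Simulate one serverless inference invocation."""
    return COLD_MS if random.random() < COLD_RATE else WARM_MS

def percentile(xs, p):
    """Simple nearest-rank percentile over a list of samples."""
    xs = sorted(xs)
    return xs[min(len(xs) - 1, int(p / 100 * len(xs)))]

samples = [invoke() for _ in range(10_000)]
print("p50:", percentile(samples, 50))  # dominated by warm calls
print("p99:", percentile(samples, 99))  # skewed by cold starts
```

With a 3% cold-start rate, the p50 reflects only warm calls while the p99 is set entirely by cold starts, which is the distribution skew the abstract warns about.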
EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices
In recent years, advances in deep learning have resulted in unprecedented
leaps in diverse tasks spanning from speech and object recognition to context
awareness and health monitoring. As a result, an increasing number of
AI-enabled applications are being developed targeting ubiquitous and mobile
devices. While deep neural networks (DNNs) are getting bigger and more complex,
they also impose a heavy computational and energy burden on the host devices,
which has led to the integration of various specialized processors in commodity
devices. Given the broad range of competing DNN architectures and the
heterogeneity of the target hardware, there is an emerging need to understand
the compatibility between DNN-platform pairs and the expected performance
benefits on each platform. This work attempts to demystify this landscape by
systematically evaluating a collection of state-of-the-art DNNs on a wide
variety of commodity devices. In this respect, we identify potential
bottlenecks in each architecture and provide important guidelines that can
assist the community in the co-design of more efficient DNNs and accelerators.
Comment: Accepted at MobiSys 2019: 3rd International Workshop on Embedded and
Mobile Deep Learning (EMDL), 2019
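A minimal version of the per-device measurement methodology such a benchmark implies can be sketched as follows: exclude warm-up iterations (which absorb caching and lazy-initialization effects), then report mean and spread over repeated runs. The workload function is a purely illustrative stand-in for a DNN forward pass.

```python
import statistics
import time

def benchmark(fn, warmup=5, runs=30):
    """Time fn() over repeated runs, with warm-up iterations excluded,
    as is standard when profiling DNN inference on heterogeneous devices."""
    for _ in range(warmup):
        fn()
    times_ms = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    return {"mean_ms": statistics.mean(times_ms),
            "stdev_ms": statistics.stdev(times_ms)}

# Stand-in workload for a DNN forward pass (purely illustrative).
def fake_inference():
    sum(i * i for i in range(50_000))

stats = benchmark(fake_inference)
print(stats)
```

Reporting the standard deviation alongside the mean matters on commodity devices, where DVFS and thermal throttling make single-shot timings unrepresentative.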
A Berkeley View of Systems Challenges for AI
With the increasing commoditization of computer vision, speech recognition
and machine translation systems and the widespread deployment of learning-based
back-end technologies such as digital advertising and intelligent
infrastructures, AI (Artificial Intelligence) has moved from research labs to
production. These changes have been made possible by unprecedented levels of
data and computation, by methodological advances in machine learning, by
innovations in systems software and architectures, and by the broad
accessibility of these technologies.
The next generation of AI systems promises to accelerate these developments
and increasingly impact our lives through frequent interactions and by making
mission-critical) decisions on our behalf, often in highly personalized
contexts. Realizing this promise, however, raises daunting challenges. In
particular, we need AI systems that make timely and safe decisions in
unpredictable environments, that are robust against sophisticated adversaries,
and that can process ever increasing amounts of data across organizations and
individuals without compromising confidentiality. These challenges will be
exacerbated by the end of Moore's Law, which will constrain the amount of
data these technologies can store and process. In this paper, we propose
several open research directions in systems, architectures, and security that
can address these challenges and help unlock AI's potential to improve lives
and society.
Comment: Berkeley Technical Report
Intelligent networking with Mobile Edge Computing: Vision and Challenges for Dynamic Network Scheduling
Mobile edge computing (MEC) has been considered as a promising technique for
internet of things (IoT). By deploying edge servers at the proximity of
devices, it is expected to provide services and process data at a relatively
low delay through intelligent networking. However, the vast number of edge
servers may face great challenges in terms of cooperation and resource
allocation. Furthermore, intelligent networking requires online implementation
in a distributed mode. In such systems, network scheduling cannot follow any
previously known rule, owing to the complicated application environment.
Statistical learning has therefore risen as a promising technique for network
scheduling, whereby edges dynamically and cooperatively learn environmental
elements. Such learning-based methods are expected to relieve the deficiencies
of model-based approaches, enhancing their practical use in dynamic network
scheduling. In this paper, we investigate the vision and challenges of
intelligent IoT networking with mobile edge computing. From a systematic
viewpoint, some major research opportunities are enumerated with respect to
statistical learning.
SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud
Despite the soaring use of convolutional neural networks (CNNs) in mobile
applications, uniformly sustaining high-performance inference on mobile has
been elusive due to the excessive computational demands of modern CNNs and the
increasing diversity of deployed devices. A popular alternative comprises
offloading CNN processing to powerful cloud-based servers. Nevertheless, by
relying on the cloud to produce outputs, emerging mission-critical and
high-mobility applications, such as drone obstacle avoidance or interactive
applications, can suffer from the dynamic connectivity conditions and the
uncertain availability of the cloud. In this paper, we propose SPINN, a
distributed inference system that employs synergistic device-cloud computation
together with a progressive inference method to deliver fast and robust CNN
inference across diverse settings. The proposed system introduces a novel
scheduler that co-optimises the early-exit policy and the CNN splitting at run
time, in order to adapt to dynamic conditions and meet user-defined
service-level requirements. Quantitative evaluation illustrates that SPINN
outperforms its state-of-the-art collaborative inference counterparts by up to
2x in achieved throughput under varying network conditions, reduces the server
cost by up to 6.8x and improves accuracy by 20.7% under latency constraints,
while providing robust operation under uncertain connectivity conditions and
significant energy savings compared to cloud-centric execution.
Comment: Accepted at the 26th Annual International Conference on Mobile
Computing and Networking (MobiCom), 2020
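The progressive-inference idea behind this system can be sketched in a few lines: run the network stage by stage, exit at an intermediate classifier once its softmax confidence clears a threshold, and only fall through to the later (e.g. cloud-hosted) stages when the early exits are uncertain. The stages and logits below are toy placeholders, not SPINN's actual scheduler or split points.

```python
import math

def softmax_confidence(logits):
    """Top-1 softmax probability of a vector of raw logits."""
    exps = [math.exp(x - max(logits)) for x in logits]
    return max(exps) / sum(exps)

def progressive_infer(stages, threshold=0.8):
    """stages: list of (name, fn) where fn(x) -> (features, exit_logits).
    Returns (predicted_class, name_of_exit_taken)."""
    x = None
    for name, fn in stages:
        x, logits = fn(x)
        if softmax_confidence(logits) >= threshold:
            return max(range(len(logits)), key=logits.__getitem__), name
    # No exit was confident enough: use the final stage's output.
    return max(range(len(logits)), key=logits.__getitem__), "final"

# Two toy stages: the early on-device exit is uncertain,
# the later stage is confident.
stages = [
    ("device_exit_1", lambda x: (x, [0.1, 0.2, 0.15])),
    ("cloud_final",   lambda x: (x, [0.1, 5.0, 0.2])),
]
pred, where = progressive_infer(stages)
print(pred, where)
```

In the full system the threshold and the device/cloud split point are co-optimised at run time against network conditions and service-level requirements, rather than fixed as here.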
Edge Intelligence: Architectures, Challenges, and Applications
Edge intelligence refers to a set of connected systems and devices that use
artificial intelligence for data collection, caching, processing, and analysis
in locations close to where the data is captured. The aim of edge intelligence is
to enhance the quality and speed of data processing and protect the privacy and
security of the data. Although it has emerged only recently, spanning the period
from 2011 to now, this field of research has shown explosive growth over the
past five years. In this paper, we present a thorough and comprehensive survey on the
literature surrounding edge intelligence. We first identify four fundamental
components of edge intelligence, namely edge caching, edge training, edge
inference, and edge offloading, based on theoretical and practical results
pertaining to proposed and deployed systems. We then aim for a systematic
classification of the state of the solutions by examining research results and
observations for each of the four components and present a taxonomy that
includes practical problems, adopted techniques, and application goals. For
each category, we elaborate, compare and analyse the literature from the
perspectives of adopted techniques, objectives, performance, advantages and
drawbacks, etc. This survey article provides a comprehensive introduction to
edge intelligence and its application areas. In addition, we summarise the
development of the emerging research field and the current state-of-the-art and
discuss the important open issues and possible theoretical and technical
solutions.
Comment: 53 pages, 37 figures, survey
Machine Learning Systems for Intelligent Services in the IoT: A Survey
Machine learning (ML) technologies are emerging in the Internet of Things
(IoT) to provision intelligent services. This survey moves beyond existing ML
algorithms and cloud-driven design to investigate the less-explored systems,
scaling and socio-technical aspects for consolidating ML and IoT. It covers the
latest developments (up to 2020) on scaling and distributing ML across cloud,
edge, and IoT devices. With a multi-layered framework to classify and
illuminate system design choices, this survey exposes fundamental concerns of
developing and deploying ML systems in the rising cloud-edge-device continuum
in terms of functionality, stakeholder alignment and trustworthiness.
Comment: Requires rework