Latency Optimization in Large-Scale Cloud-Sensor Systems
With the advent of the Internet of Things and smart city applications, massive cyber-physical interaction between applications hosted in the cloud and a huge number of external physical sensors and devices is inevitable. This raises two main challenges: keeping cloud costs affordable as the smart city grows (referred to as economical cloud scalability) and operating sensor hardware energy-efficiently. We have developed Cloud-Edge-Beneath (CEB), a multi-tier architecture for large-scale IoT deployments that embodies distributed optimizations addressing these two challenges. In this article, we summarize our prior work on CEB to set the context for a third major challenge for cloud-sensor systems: latency. Prolonged latency can arise in servicing requests from cloud applications, especially given our primary focus on optimizing energy and cloud scalability. Latency, however, is an important factor to optimize for real-time and cyber-physical applications with limited tolerance to delays. Moreover, improving the responsiveness of any IoT application is bound to improve the user experience and hence the acceptability and adoption of smart city solutions by citizens. In this article, we give a formal definition and formulation of the latency optimization problem under CEB. We propose a Prioritized Application Fragment Caching Algorithm (PAFCA), which selectively caches application fragments from the cloud to lower layers of CEB as a key measure to optimize latency. PAFCA extends an existing CEB optimization algorithm (AFCA-1) and, as will be shown, takes into account cloud applications' expectations on the real-timeliness of responses. Through experiments, we measure and validate the effect of PAFCA on latency and cloud scalability. We also introduce and discuss the trade-off between latency and sensor energy in this context.
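As a rough illustration of the priority-driven caching that PAFCA performs, the Python sketch below ranks application fragments by a score combining request frequency and the owning application's latency tolerance, then caches the highest-priority fragments at an edge layer until its capacity is exhausted. The Fragment fields, the scoring rule, and the capacity model are illustrative assumptions, not the paper's actual formulation of PAFCA or AFCA-1.

    from dataclasses import dataclass

    @dataclass
    class Fragment:
        """A cloud application fragment that could be pushed toward the edge (hypothetical model)."""
        name: str
        size: int                 # storage cost at the edge layer, in abstract units
        request_rate: float       # how often the fragment is invoked
        latency_tolerance: float  # owning application's delay tolerance; smaller = more real-time

    def prioritized_fragment_caching(fragments, edge_capacity):
        """Greedy sketch: cache the fragments that matter most for latency first."""
        def priority(f: Fragment) -> float:
            # Frequently requested fragments of latency-sensitive applications score highest.
            return f.request_rate / max(f.latency_tolerance, 1e-9)

        cached, used = [], 0
        for f in sorted(fragments, key=priority, reverse=True):
            if used + f.size <= edge_capacity:
                cached.append(f.name)
                used += f.size
        return cached

    # Made-up fragments for a quick check:
    frags = [
        Fragment("traffic-alert", size=2, request_rate=50.0, latency_tolerance=0.1),
        Fragment("daily-report", size=5, request_rate=1.0, latency_tolerance=60.0),
        Fragment("air-quality", size=3, request_rate=20.0, latency_tolerance=1.0),
    ]
    print(prioritized_fragment_caching(frags, edge_capacity=6))  # ['traffic-alert', 'air-quality']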
Adaptive Energy-aware Scheduling of Dynamic Event Analytics across Edge and Cloud Resources
The growing deployment of sensors as part of the Internet of Things (IoT) is
generating thousands of event streams. Complex Event Processing (CEP) queries
offer a useful paradigm for rapid decision-making over such data sources. While
such analytics are often centralized in the Cloud, the availability of capable
edge devices in the field motivates cooperative event analytics that span Edge
and Cloud computing. Here, we identify a novel problem of query placement on edge
and Cloud resources for dynamically arriving and departing analytic dataflows.
We define this as an optimization problem to minimize the total makespan for
all event analytics, while meeting energy and compute constraints of the
resources. We propose 4 adaptive heuristics and 3 rebalancing strategies for
such dynamic dataflows, and validate them using detailed simulations for
100-1000 edge devices and VMs. The results show that our heuristics offer
O(seconds) planning time, give a valid and high quality solution in all cases,
and reduce the number of query migrations. Furthermore, the rebalancing
strategies, when applied within these heuristics, significantly reduce the
makespan by around 20-25%.
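The placement problem above lends itself to a simple greedy baseline: send each arriving query to the edge device or cloud VM that can finish it earliest while respecting compute and energy limits, and track the resulting makespan. The sketch below does exactly that under an assumed dictionary-based description of queries and resources; it is not one of the paper's four heuristics or three rebalancing strategies.

    def greedy_query_placement(queries, resources):
        """Latency/energy-aware greedy placement sketch for arriving CEP queries.

        queries:   list of {"id", "load", "energy"}
        resources: list of {"id", "capacity", "energy_budget", "speed", "busy_until"}
        """
        placement = {}
        for q in queries:
            best, best_finish = None, float("inf")
            for r in resources:
                fits = r["capacity"] >= q["load"] and r["energy_budget"] >= q["energy"]
                finish = r["busy_until"] + q["load"] / r["speed"]
                if fits and finish < best_finish:
                    best, best_finish = r, finish
            if best is None:
                raise RuntimeError(f"no feasible resource for query {q['id']}")
            # Commit the query to the chosen resource and update its state.
            best["capacity"] -= q["load"]
            best["energy_budget"] -= q["energy"]
            best["busy_until"] = best_finish
            placement[q["id"]] = best["id"]
        makespan = max(r["busy_until"] for r in resources)
        return placement, makespan

A dynamic setting would also have to revisit earlier placements as dataflows depart, which is roughly the role the abstract assigns to its rebalancing strategies.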
JALAD: Joint Accuracy- and Latency-Aware Deep Structure Decoupling for Edge-Cloud Execution
Recent years have witnessed a rapid growth of deep-network based services and
applications. A practical and critical problem thus has emerged: how to
effectively deploy deep neural network models such that they can be
executed efficiently. Conventional cloud-based approaches usually run the deep
models on data center servers, causing large latency because a significant
amount of data has to be transferred from the network edge to the data
center. In this paper, we propose JALAD, a joint accuracy- and latency-aware
execution framework, which decouples a deep neural network so that a part of it
will run at edge devices and the other part inside the conventional cloud,
while only a minimal amount of data has to be transferred between them. Though
the idea seems straightforward, we face several challenges: i) how to
find the best partition of a deep structure; ii) how to deploy the component at
an edge device that only has limited computation power; and iii) how to
minimize the overall execution latency. Our answers to these questions are a
set of strategies in JALAD, including 1) A normalization-based in-layer data
compression strategy that jointly considers compression rate and model
accuracy; 2) A latency-aware deep decoupling strategy to minimize the overall
execution latency; and 3) An edge-cloud structure adaptation strategy that
dynamically changes the decoupling for different network conditions.
Experiments demonstrate that our solution can significantly reduce the
execution latency: it speeds up the overall inference execution while keeping
the model accuracy loss within a guaranteed bound.
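To make the latency-aware decoupling idea concrete, the sketch below exhaustively scores every candidate split point of a hypothetical per-layer profile: edge compute up to the split, transfer of the (compressed) intermediate activations, and cloud compute for the rest, keeping the fastest split whose estimated accuracy loss stays within a budget. The profile fields, the bandwidth model, and the accuracy bookkeeping are assumptions for illustration, not JALAD's actual strategies.

    def choose_split_point(layers, bandwidth_bps, max_accuracy_loss):
        """Pick the edge/cloud split minimizing end-to-end latency (sketch).

        layers: list of dicts with "edge_ms", "cloud_ms", "out_bytes", and
                "acc_loss" (estimated loss from compressing that layer's
                activations); layers[0] also carries "in_bytes", the raw input size.
        """
        best_split, best_latency = None, float("inf")
        for k in range(len(layers) + 1):  # split after layer k; k == 0 means all-cloud
            edge_ms = sum(l["edge_ms"] for l in layers[:k])
            cloud_ms = sum(l["cloud_ms"] for l in layers[k:])
            sent_bytes = layers[k - 1]["out_bytes"] if k > 0 else layers[0]["in_bytes"]
            transfer_ms = sent_bytes * 8 / bandwidth_bps * 1000.0
            acc_loss = layers[k - 1]["acc_loss"] if k > 0 else 0.0
            total = edge_ms + transfer_ms + cloud_ms
            if acc_loss <= max_accuracy_loss and total < best_latency:
                best_split, best_latency = k, total
        return best_split, best_latency

    # Made-up three-layer profile over a 10 Mbit/s link:
    profile = [
        {"edge_ms": 4, "cloud_ms": 1, "out_bytes": 200_000, "acc_loss": 0.2, "in_bytes": 600_000},
        {"edge_ms": 9, "cloud_ms": 2, "out_bytes": 50_000, "acc_loss": 0.5},
        {"edge_ms": 30, "cloud_ms": 6, "out_bytes": 4_000, "acc_loss": 1.0},
    ]
    print(choose_split_point(profile, bandwidth_bps=10_000_000, max_accuracy_loss=0.6))  # (2, 59.0)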