Optimizing Replacement Policies for Content Delivery Network Caching: Beyond Belady to Attain A Seemingly Unattainable Byte Miss Ratio
When facing objects/files of differing sizes in content delivery network
(CDN) caches, pursuing an optimal object miss ratio (OMR) by approximating
Belady no longer ensures an optimal byte miss ratio (BMR), creating confusion
about how to achieve a superior BMR in CDNs. To address this issue, we
experimentally observe that there exists a time window to delay the eviction of
the object with the longest reuse distance to improve BMR without increasing
OMR. As a result, we introduce a deep reinforcement learning (RL) model to
capture this time window by dynamically monitoring the changes in OMR and BMR,
and implementing a BMR-friendly policy in the time window. Based on this
policy, we propose a Belady and Size Eviction (LRU-BaSE) algorithm, reducing
BMR while maintaining OMR. To make LRU-BaSE efficient and practical, we address
the feedback delay problem of RL with a two-pronged approach. On the one hand,
our observation of a rear section of the LRU cache queue containing most of the
eviction candidates allows LRU-BaSE to shorten the decision region. On the
other hand, the request distribution on CDNs makes it feasible to divide the
learning region into multiple sub-regions that are each learned with reduced
time and increased accuracy. In real CDN systems, compared to LRU, LRU-BaSE
can reduce back-to-origin ("backing to OS") traffic and access latency by
30.05% and 17.07%, respectively, on average. The results on the simulator
confirm that LRU-BaSE outperforms the state-of-the-art cache replacement
policies: on average, LRU-BaSE's BMR is 0.63% and 0.33% lower than that of
Belady and Practical Flow-based Offline Optimal (PFOO), respectively. In
addition, compared to Learning Relaxed Belady (LRB), LRU-BaSE yields
relatively stable performance when facing workload drift.
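The core observation above, that the Belady victim's eviction can sometimes be deferred in favor of a larger object without hurting OMR, can be sketched as follows. This is a toy illustration under our own assumptions (the `window`, `next_use`, and `sizes` names are ours), not the paper's LRU-BaSE implementation, which learns the window dynamically with RL:

```python
def choose_victim(cache, next_use, sizes, window=2):
    """Pick an eviction victim from `cache`.

    Plain Belady evicts the object whose next use is furthest in
    the future. This sketch instead considers every object whose
    next use falls within `window` requests of that furthest next
    use, and among those evicts the largest one, freeing more bytes
    while barely disturbing the object miss ratio.
    """
    furthest = max(next_use[obj] for obj in cache)
    # Objects with a next use "almost as far away" as the Belady
    # victim's are fair game for eviction.
    candidates = [obj for obj in cache if next_use[obj] >= furthest - window]
    # Among the near-ties, prefer freeing the most bytes.
    return max(candidates, key=lambda obj: sizes[obj])
```

With `window=0` the rule degenerates to plain Belady; widening the window trades byte savings against OMR risk, which is the trade-off the learned policy manages.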
Deep Learning for Edge Computing Applications: A State-of-the-Art Survey
With the booming development of the Internet of Things (IoT) and communication technologies such as 5G, our future world is envisioned as an interconnected entity in which billions of devices will provide uninterrupted service to our daily lives and to industry. Meanwhile, these devices will generate massive amounts of valuable data at the network edge, calling not only for instant data processing but also for intelligent data analysis in order to fully unleash the potential of edge big data. Neither traditional cloud computing nor on-device computing can sufficiently address this problem, due to high latency and limited computation capacity, respectively. Fortunately, the emerging edge computing paradigm sheds light on the issue by pushing data processing from the remote network core to the local network edge, remarkably reducing latency and improving efficiency. Besides, recent breakthroughs in deep learning have greatly expanded data processing capabilities, enabling a thrilling development of novel applications such as video surveillance and autonomous driving. The convergence of edge computing and deep learning is believed to bring new possibilities to both interdisciplinary research and industrial applications. In this article, we provide a comprehensive survey of the latest efforts on deep-learning-enabled edge computing applications and, in particular, offer insights on how to leverage deep learning advances to facilitate edge applications in four domains, i.e., smart multimedia, smart transportation, smart city, and smart industry. We also highlight the key research challenges and promising research directions therein. We believe this survey will inspire more research and contributions in this promising field.
Edge Replication Strategies for Wide-Area Distributed Processing
The rapid digitalization across industries comes with many challenges. One key problem is how the ever-growing and volatile data generated at distributed locations can be efficiently processed to inform decision making and improve products. Unfortunately, wide-area network capacity cannot cope with the growth of the data at the network edges. Thus, it is imperative to decide which data should be processed in-situ at the edge and which should be transferred and analyzed in data centers.
In this paper, we study two families of proactive online data replication strategies, namely ski-rental and machine learning algorithms, to decide which data is processed at the edge, close to where it is generated, and which is transferred to a data center. Our analysis using real query traces from a Global 2000 company shows that such online replication strategies can significantly reduce data transfer volume, in many cases by up to 50% compared to naive approaches, and achieve close to optimal performance. After analyzing their shortcomings in ease of use and performance, we propose a hybrid strategy that combines the advantages of both competitive and machine learning algorithms.
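The ski-rental family of strategies mentioned above follows the classic break-even rule: keep paying per-query transfer costs until the cumulative cost reaches the one-time replication cost, then replicate. A minimal sketch (class and parameter names are ours, not the paper's):

```python
class SkiRentalReplicator:
    """Break-even ski-rental policy for one edge dataset (illustrative).

    Each remote query pays `transfer_cost` to ship data to the data
    center ("renting"). Once cumulative rent reaches `replicate_cost`
    ("buying the skis"), the dataset is replicated and later queries
    incur no transfer cost. This classic rule is 2-competitive with
    the offline optimum.
    """

    def __init__(self, replicate_cost):
        self.replicate_cost = replicate_cost
        self.spent = 0.0          # cumulative "rent" paid so far
        self.replicated = False

    def on_query(self, transfer_cost):
        """Account for one query; return the transfer cost it paid."""
        if self.replicated:
            return 0.0
        self.spent += transfer_cost
        if self.spent >= self.replicate_cost:
            self.replicated = True  # break-even point reached: replicate
        return transfer_cost
```

Because the break-even point is fixed in hindsight-free fashion, total cost is at most twice the offline optimum regardless of how the query stream continues.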
SoK: Distributed Computing in ICN
Information-Centric Networking (ICN), with its data-oriented operation and
generally more powerful forwarding layer, provides an attractive platform for
distributed computing. This paper provides a systematic overview and
categorization of different distributed computing approaches in ICN
encompassing fundamental design principles, frameworks and orchestration,
protocols, enablers, and applications. We discuss current pain points in legacy
distributed computing, attractive ICN features, and how different systems use
them. This paper also provides a discussion of potential future work for
distributed computing in ICN.
Comment: 10 pages, 3 figures, 1 table. Accepted by ACM ICN 202
Optimized and Automated Machine Learning Techniques Towards IoT Data Analytics and Cybersecurity
Internet-of-Things (IoT) systems have emerged as a prevalent technology in our daily lives. With the proliferation of sensors and smart devices in recent years, the data generation volume and speed of IoT systems have increased dramatically. In most IoT systems, massive volumes of data must be processed, transformed, and analyzed on a frequent basis to enable various IoT services and functionalities. Machine Learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges. The first challenge is to process large amounts of dynamic IoT data to make accurate and informed decisions. The second challenge is to automate and optimize the data analytics process. The third challenge is to protect IoT devices and systems against various cyber threats and attacks. To address these challenges, this thesis proposes various ML-based frameworks and data analytics approaches in several applications.
Specifically, the first part of the thesis provides a comprehensive review of applying Automated Machine Learning (AutoML) techniques to IoT data analytics tasks, discussing all procedures of the general ML pipeline. The second part of the thesis proposes several novel supervised ML-based Intrusion Detection Systems (IDSs) to improve the security of Internet of Vehicles (IoV) systems and connected vehicles; optimization techniques are used to obtain optimized ML models with high attack detection accuracy. The third part of the thesis develops unsupervised ML algorithms to identify network anomalies and malicious network entities (e.g., attacker IPs, compromised machines, and polluted files/content) in order to protect Content Delivery Networks (CDNs) from service-targeting attacks, including distributed denial-of-service and cache pollution attacks; the proposed framework is evaluated on real-world CDN access log data to illustrate its effectiveness. The fourth part of the thesis proposes adaptive online learning algorithms for addressing concept drift (i.e., data distribution changes) and effectively handling dynamic IoT data streams in order to provide reliable IoT services. The developed drift-adaptive learning methods can effectively adapt to data distribution changes and avoid degradation of data analytics model performance.
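The general shape of drift-adaptive online learning can be illustrated with a simple error-rate monitor that retrains the model from recent samples when accuracy collapses. This is a sketch under our own assumptions (class names, the window/threshold scheme, and the toy majority-vote learner are ours), not the thesis's method:

```python
from collections import deque


class MajorityModel:
    """Toy incremental learner: predicts the majority label seen so far."""

    def __init__(self):
        self.counts = {}

    def fit(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None


class DriftAdaptiveClassifier:
    """Wrap a base learner with a simple drift monitor.

    If the error rate over the last `window` samples exceeds
    `threshold`, the model is discarded and retrained from recent
    data only, adapting to the new distribution.
    """

    def __init__(self, make_model, window=50, threshold=0.3):
        self.make_model = make_model
        self.window = window
        self.threshold = threshold
        self.errors = deque(maxlen=window)   # sliding 0/1 error window
        self.recent = deque(maxlen=window)   # sliding sample buffer
        self.model = make_model()
        self.drifts = 0

    def update(self, x, y):
        """Predict, record the error, learn incrementally, check drift."""
        self.errors.append(int(self.model.predict(x) != y))
        self.recent.append((x, y))
        self.model.fit(x, y)
        if (len(self.errors) == self.window
                and sum(self.errors) / self.window > self.threshold):
            self.model = self.make_model()   # drift detected: start fresh
            for xs, ys in self.recent:
                self.model.fit(xs, ys)       # retrain on recent data only
            self.errors.clear()
            self.drifts += 1
```

On a stream whose label distribution flips abruptly, the monitor fires once and the retrained model recovers; production drift detectors (e.g., ADWIN or DDM) refine the same window-and-threshold idea.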
Towards Optimized Traffic Provisioning and Adaptive Cache Management for Content Delivery
Content delivery networks (CDNs) deploy hundreds of thousands of servers around the world to cache and serve trillions of user requests every day for a diverse set of content such as web pages, videos, software downloads and images. In this dissertation, we propose algorithms to provision traffic across cache servers and manage the content they host to achieve performance objectives such as maximizing the cache hit rate, minimizing the bandwidth cost of the network and minimizing the energy consumption of the servers.
Traffic provisioning is the process of determining the set of content domains hosted on the servers. We propose footprint descriptors that effectively capture the popularity characteristics and caching performance of different content classes. We also propose a footprint descriptor calculus that can be used to decide how content should be mixed or partitioned to efficiently provision traffic. To automate traffic provisioning, we propose optimization models to provision traffic such that the cache miss traffic from the network is minimized without overloading the servers. We find that such optimization models produce significant reductions in the cache miss traffic when compared with traffic provisioning algorithms in use today.
Cache management is the process of deciding how content is cached in the servers of a CDN. We propose TTL-based caching algorithms that provably achieve performance targets specified by a CDN operator. We show that the proposed algorithms converge to the target hit rate and target cache size with low error. Finally, we propose cache management algorithms that make the servers energy-efficient using disk shutdown. We find that disk shutdown is well suited for CDN servers and provides energy savings without significantly impacting cache hit rates.
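A TTL-based cache that converges to an operator-specified target hit rate can be illustrated with a simple stochastic-approximation feedback rule: shrink the TTL slightly on each hit, grow it on each miss, so the expected update vanishes exactly when the hit rate equals the target. The update rule and parameter names below are ours, not the dissertation's exact algorithm:

```python
class AdaptiveTTLCache:
    """TTL cache that nudges its TTL toward a target hit rate (sketch).

    Each request updates ttl by step * (target - hit_indicator):
    hits pull the TTL down, misses push it up, so in steady state
    the measured hit rate drifts toward `target`.
    """

    def __init__(self, target=0.8, ttl=10.0, step=0.5):
        self.target = target
        self.ttl = ttl
        self.step = step
        self.expiry = {}                 # object id -> expiration time
        self.hits = self.requests = 0

    def request(self, obj, now):
        """Serve one request at time `now`; return True on a hit."""
        self.requests += 1
        hit = self.expiry.get(obj, -1.0) > now
        if hit:
            self.hits += 1
        # Robbins-Monro-style update: zero mean drift when hit rate == target.
        self.ttl = max(0.0, self.ttl + self.step * (self.target - float(hit)))
        self.expiry[obj] = now + self.ttl    # (re)set this object's expiry
        return hit

    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0
```

On a round-robin trace over a small object set, the measured hit rate settles near the target, which is the convergence behavior the proposed algorithms establish rigorously.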
Responsible, Automated Data Gathering for Timely Citizen Dashboard Provision During a Global Pandemic (COVID-19)
Creating a public understanding of the dynamics of a pandemic, such as COVID-19, is vital for introducing restrictive regulations. Gathering diverse data responsibly and sharing it with experts and citizens in a timely manner is challenging. This article reviews methodologies of COVID-19 dashboard design and discusses the associated technical and non-technical challenges. Advice and lessons learned from building a citizen-focused, automated, county-precision dashboard for Germany are shared. Within four months, the web-based tool had 5 million unique visitors and 70 million sessions. Three developers set up the basic version in less than one week. Early on, data was screen-scraped; an iterative process improved timeliness by adding more fine-grained data sources. A collaborative online table editor enabled near-real-time corrections. Alerting was set up for errors, and statistics were applied for sanity checking. Static site generation and a content delivery network help to serve large user loads in a timely manner. The flexible design allowed us to iteratively integrate more complex statistics based on expert knowledge, built on top of the collected data and secondary data sources such as ICU beds and citizen movement.