398 research outputs found

    Edge enhanced deep learning system for large-scale video stream analytics.

    Applying deep learning models to large-scale IoT data is a compute-intensive task that needs significant computational resources. Existing approaches transfer this big data from IoT devices to a central cloud, where inference is performed using a machine learning model. However, the network connecting the data capture source and the cloud platform can become a bottleneck. We address this problem by distributing the deep learning pipeline across edge and cloudlet/fog resources. The basic processing stages and trained models are distributed towards the edge of the network and onto in-transit and cloud resources. The proposed approach performs initial processing of the data close to the data source, at edge and fog nodes, resulting in a significant reduction in the data that is transferred and stored in the cloud. Results on an object recognition scenario show a 71% efficiency gain in system throughput when a combination of edge, in-transit and cloud resources is used, compared to a cloud-only approach.
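The idea of processing data near its source before forwarding it can be sketched in a few lines. This is a minimal illustration, not the authors' system: the `Frame` type, the `motion_score` feature, the threshold value, and both stage functions are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Frame:
    frame_id: int
    motion_score: float  # hypothetical cheap feature computed at the edge
    payload: bytes


def edge_filter(frames: Iterable[Frame], threshold: float) -> Iterator[Frame]:
    """Edge stage: forward only frames whose motion score reaches the threshold."""
    for frame in frames:
        if frame.motion_score >= threshold:
            yield frame


def cloud_inference(frames: Iterable[Frame]) -> List[int]:
    """Cloud stage: stand-in for the heavy model; returns the ids it processed."""
    return [frame.frame_id for frame in frames]


def bytes_sent(frames: Iterable[Frame]) -> int:
    """Total payload size, used to compare traffic with and without filtering."""
    return sum(len(frame.payload) for frame in frames)


# Five synthetic frames of 1000 bytes each (made-up scores, not measured data).
frames = [
    Frame(i, score, b"x" * 1000)
    for i, score in enumerate([0.1, 0.9, 0.05, 0.7, 0.2])
]
forwarded = list(edge_filter(frames, threshold=0.5))
processed = cloud_inference(forwarded)
reduction = 1 - bytes_sent(forwarded) / bytes_sent(frames)
print(processed, f"{reduction:.0%} less data sent to the cloud")
```

The numbers here are synthetic and say nothing about the 71% figure reported in the abstract; the sketch only shows where an edge-side filter sits relative to cloud-side inference.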

    Orchestrating Service Migration for Low Power MEC-Enabled IoT Devices

    Multi-Access Edge Computing (MEC) is a key enabling technology for Fifth Generation (5G) mobile networks. MEC provides distributed cloud computing capabilities and an information technology service environment for applications and services at the edges of mobile networks. This architectural modification serves to reduce congestion and latency and to improve the performance of such edge-colocated applications and devices. In this paper, we demonstrate how reactive service migration can be orchestrated for low-power MEC-enabled Internet of Things (IoT) devices. Here, we use open-source Kubernetes as the container orchestration system. Our demo is based on a traditional client-server system from user equipment (UE) over Long Term Evolution (LTE) to the MEC server. As the use case scenario, we post-process live video received over web real-time communication (WebRTC). Next, we integrate orchestration by Kubernetes with S1 handovers, demonstrating MEC-based software-defined networking (SDN). Edge applications may now reactively follow the UE within the radio access network (RAN), keeping latency low. The collected data is used to analyze the benefits of the low-power MEC-enabled IoT device scheme, in which the end-to-end (E2E) latency and power requirements of the UE are improved. We further discuss the challenges of implementing such schemes and future research directions therein.

    Demo: An experimental environment based on mini-PCs for federated learning research

    There is growing research interest in Federated Learning (FL), a promising approach for preserving data privacy and for bringing training closer to the network edge, where data is generated. Resource consumption for Machine Learning (ML) training and inference is important for edge nodes, but most of the proposed protocols and algorithms for FL are evaluated by simulation. In this demo paper, we present an environment based on distributed mini-PCs to enable experimental study of FL protocols and algorithms. We have installed low-capacity mini-PCs within a wireless city-level mesh network and deployed container-based FL components on these nodes. We show the deployed FL clients and server at different nodes in the city and demonstrate how an FL experiment can be set up and run in a real environment. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 871582 (NGIatlantic.eu) and was partially supported by the Spanish Government under contracts PID2019-106774RB-C21, PCI2019-111851-2 (LeadingEdge CHIST-ERA), PCI2019-111850-2 (DiPET CHIST-ERA). The work of C.-H. Liu was supported in part by the U.S. National Science Foundation (NSF) under Award CNS-2006453 and in part by Mississippi State University under Grant ORED 253551-060702. The work of L. Wei is supported in part by the U.S. National Science Foundation (#2150486 and #2006612). I. Koutsopoulos acknowledges support from the CHIST-ERA grant CHIST-ERA-18-SDCDN-004 (GSRI grant number T11EPA4-00056). Peer Reviewed. Postprint (author's final draft)

    BePOCH: Improving federated learning performance in resource-constrained computing devices

    Inference with trained machine learning models is now possible on small computing devices, whereas only a few years ago it was run mostly in the cloud. The recent technique of Federated Learning now also offers a way to train machine learning models on small devices, by distributing the computing effort needed for training over many machines. However, training on these low-capacity devices takes a long time and often consumes all the available CPU resources of the device. Therefore, for Federated Learning to be carried out by low-capacity devices in practical environments, the training process must target not only the highest accuracy but also a reduction in training time and resource consumption. In this paper, we present an approach which uses a dynamic epoch parameter in the model training. We propose the BePOCH (Best Epoch) algorithm to identify the best number of epochs per training round in Federated Learning. We show in experiments with medical datasets how, with the number of epochs suggested by BePOCH, the training time and resource consumption decrease while the level of accuracy is maintained. Thus, BePOCH makes machine learning model training on low-capacity devices more feasible and, furthermore, decreases the overall resource consumption of the training process, which is an important aspect of greener machine learning techniques. This work was partially funded by the Spanish Government under contracts PID2019-106774RB-C21, PCI2019-111850-2 (DiPET CHIST-ERA), PCI2019-111851-2 (LeadingEdge CHIST-ERA), and the Generalitat de Catalunya as Consolidated Research Group 2017-SGR-990. Support was also given by the Agency for Electronic Communications (AEK) of North Macedonia. Peer Reviewed. Postprint (author's final draft)
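The abstract does not describe how BePOCH selects the epoch count, so the following is not that algorithm. It is a simple illustrative heuristic under stated assumptions: given hypothetical per-round measurements of accuracy and training time for several candidate epoch counts, pick the cheapest candidate whose accuracy is within a tolerance of the best. The function name `pick_epochs`, the tolerance, and all numbers are invented for the example.

```python
from typing import Dict, Tuple


def pick_epochs(
    measurements: Dict[int, Tuple[float, float]], tolerance: float = 0.01
) -> int:
    """Choose an epoch count from {epochs: (accuracy, training_seconds)}.

    Keeps every candidate whose accuracy is within `tolerance` of the best
    observed accuracy, then returns the one with the shortest training time.
    """
    best_accuracy = max(acc for acc, _ in measurements.values())
    acceptable = [
        epochs
        for epochs, (acc, _) in measurements.items()
        if acc >= best_accuracy - tolerance
    ]
    return min(acceptable, key=lambda epochs: measurements[epochs][1])


# Made-up measurements for illustration only (not results from the paper).
measurements = {
    1: (0.810, 12.0),
    3: (0.902, 35.0),
    5: (0.905, 58.0),
    10: (0.910, 118.0),
}
chosen = pick_epochs(measurements)
print(f"epochs per round: {chosen}")
```

With these synthetic values the heuristic trades less than one point of accuracy for a much shorter training round, which is the kind of trade-off the abstract describes.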

    Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform

    Advances in detectors and computational technologies provide new opportunities for applied research and the fundamental sciences. Concurrently, dramatic increases in the three Vs (Volume, Velocity, and Variety) of experimental data and in the scale of computational tasks have produced demand for new real-time processing systems at experimental facilities. Recently, this demand was addressed by the Spark-MPI approach, which connects the Spark data-intensive platform with the MPI high-performance framework. In contrast with existing data management and analytics systems, Spark introduced a new middleware based on resilient distributed datasets (RDDs), which decoupled various data sources from high-level processing algorithms. The RDD middleware significantly broadened the scope of data-intensive applications, from SQL queries to machine learning to graph processing. Spark-MPI further extended the Spark ecosystem with MPI applications using the Process Management Interface. The paper explores this integrated platform within the context of online ptychographic and tomographic reconstruction pipelines. Comment: New York Scientific Data Summit, August 6-9, 201

    Hyper: Distributed Cloud Processing for Large-Scale Deep Learning Tasks

    Training and deploying deep learning models in real-world applications requires processing large amounts of data. This is a challenging task when the amount of data grows to a hundred terabytes or even to petabyte scale. We introduce a hybrid distributed cloud framework with a unified view of multiple clouds and an on-premise infrastructure for processing tasks using both CPU and GPU compute instances at scale. The system implements a distributed file system and a failure-tolerant task processing scheduler, independent of the language and deep learning framework used. It makes it possible to use cheap, unstable cloud resources to significantly reduce costs. We demonstrate the scalability of the framework by running pre-processing, distributed training, hyperparameter search and large-scale inference tasks on 10,000 CPU cores and 300 GPU instances with an overall processing power of 30 petaflops.
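One common way to tolerate unstable, cheap instances is to requeue a task when its worker is reclaimed. The sketch below illustrates that general pattern only; it is not the scheduler described in the abstract. `Preempted`, `run_with_retries`, `flaky`, and the task names are hypothetical, and preemption is simulated deterministically rather than coming from a real cloud.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple


class Preempted(Exception):
    """Raised when a (simulated) cheap instance is reclaimed mid-task."""


def run_with_retries(
    tasks: List[Tuple[str, Callable[[], str]]], max_attempts: int = 3
) -> Tuple[Dict[str, str], List[str]]:
    """Run tasks in FIFO order, requeueing preempted ones up to max_attempts."""
    queue = deque((name, fn, 1) for name, fn in tasks)
    results: Dict[str, str] = {}
    failed: List[str] = []
    while queue:
        name, fn, attempt = queue.popleft()
        try:
            results[name] = fn()
        except Preempted:
            if attempt < max_attempts:
                queue.append((name, fn, attempt + 1))  # retry later
            else:
                failed.append(name)  # give up after the last allowed attempt
    return results, failed


def flaky(fail_times: int, value: str) -> Callable[[], str]:
    """Build a task that is preempted `fail_times` times before succeeding."""
    state = {"left": fail_times}

    def task() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise Preempted()
        return value

    return task


tasks = [
    ("preprocess", flaky(0, "ok")),
    ("train", flaky(2, "ok")),
    ("infer", flaky(5, "ok")),
]
results, failed = run_with_retries(tasks, max_attempts=3)
print(results, failed)
```

A production scheduler would also need checkpointing, so that a retried task resumes instead of restarting, and some form of backoff; both are omitted here to keep the retry logic visible.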