Distributed Machine Learning through Heterogeneous Edge Systems
Many emerging AI applications request distributed machine learning (ML) among
edge systems (e.g., IoT devices and PCs at the edge of the Internet), where
data cannot be uploaded to a central venue for model training, due to their
large volumes and/or security/privacy concerns. Edge devices are intrinsically
heterogeneous in computing capacity, posing significant challenges to parameter
synchronization for parallel training with the parameter server (PS)
architecture. This paper proposes ADSP, a parameter synchronization scheme for
distributed machine learning (ML) with heterogeneous edge systems. Eliminating
the significant waiting time occurring with existing parameter synchronization
models, the core idea of ADSP is to let faster edge devices continue training,
while committing their model updates at strategically decided intervals. We
design algorithms that decide time points for each worker to commit its model
update, and ensure not only global model convergence but also faster
convergence. Our testbed implementation and experiments show that ADSP
outperforms existing parameter synchronization models significantly in terms of
ML model convergence time, scalability and adaptability to large heterogeneity.
Comment: Copyright 2020, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
ARES: Adaptive Resource-Aware Split Learning for Internet of Things
Distributed training of Machine Learning models in edge Internet of Things (IoT) environments is challenging because of three main points. First, resource-constrained devices have large training times and limited energy budget. Second, resource heterogeneity of IoT devices slows down the training of the global model due to the presence of slower devices (stragglers). Finally, varying operational conditions, such as network bandwidth and computing resources, significantly affect training time and energy consumption. Recent studies have proposed Split Learning (SL) for distributed model training with limited resources but its efficient implementation on the resource-constrained and decentralized heterogeneous IoT devices remains minimally explored. We propose Adaptive REsource-aware Splitlearning (ARES), a scheme for efficient model training in IoT systems. ARES accelerates local training in resource-constrained devices and minimizes the effect of stragglers on the training through device-targeted split points while accounting for time-varying network throughput and computing resources. ARES takes into account application constraints to mitigate training optimization tradeoffs in terms of energy consumption and training time. We evaluate an ARES prototype on a real testbed comprising heterogeneous IoT devices running a widely-adopted deep neural network and dataset. Results show that ARES accelerates model training on IoT devices by up to 48% and minimizes the energy consumption by up to 61.4% compared to Federated Learning (FL) and classic SL, without sacrificing the model convergence and accuracy.
Internet of robotic things : converging sensing/actuating, hypoconnectivity, artificial intelligence and IoT Platforms
The Internet of Things (IoT) concept is evolving rapidly and influencing new developments in various application domains, such as the Internet of Mobile Things (IoMT), Autonomous Internet of Things (A-IoT), Autonomous System of Things (ASoT), Internet of Autonomous Things (IoAT), Internet of Things Clouds (IoT-C) and the Internet of Robotic Things (IoRT) etc. that are progressing/advancing by using IoT technology. The IoT influence represents new development and deployment challenges in different areas such as seamless platform integration, context based cognitive network integration, new mobile sensor/actuator network paradigms, things identification (addressing, naming in IoT) and dynamic things discoverability and many others. The IoRT represents new convergence challenges and their need to be addressed, in one side the programmability and the communication of multiple heterogeneous mobile/autonomous/robotic things for cooperating, their coordination, configuration, exchange of information, security, safety and protection. Developments in IoT heterogeneous parallel processing/communication and dynamic systems based on parallelism and concurrency require new ideas for integrating the intelligent “devices”, collaborative robots (COBOTS), into IoT applications. Dynamic maintainability, self-healing, self-repair of resources, changing resource state, (re-)configuration and context based IoT systems for service implementation and integration with IoT network service composition are of paramount importance when new “cognitive devices” are becoming active participants in IoT applications. This chapter aims to be an overview of the IoRT concept, technologies, architectures and applications and to provide a comprehensive coverage of future challenges, developments and applications.
Next Generation Cloud Computing: New Trends and Research Directions
The landscape of cloud computing has significantly changed over the last
decade. Not only have more providers and service offerings crowded the space,
but also cloud infrastructure that was traditionally limited to single provider
data centers is now evolving. In this paper, we firstly discuss the changing
cloud infrastructure and consider the use of infrastructure from multiple
providers and the benefit of decentralising computing away from data centers.
These trends have resulted in the need for a variety of new computing
architectures that will be offered by future cloud infrastructure. These
architectures are anticipated to impact areas, such as connecting people and
devices, data-intensive computing, the service space and self-learning systems.
Finally, we lay out a roadmap of challenges that will need to be addressed for
realising the potential of next generation cloud systems.
Comment: Accepted to Future Generation Computer Systems, 07 September 201
Sparse Allreduce: Efficient Scalable Communication for Power-Law Data
Many large datasets exhibit power-law statistics: The web graph, social
networks, text data, click through data etc. Their adjacency graphs are termed
natural graphs, and are known to be difficult to partition. As a consequence
most distributed algorithms on these graphs are communication intensive. Many
algorithms on natural graphs involve an Allreduce: a sum or average of
partitioned data which is then shared back to the cluster nodes. Examples
include PageRank, spectral partitioning, and many machine learning algorithms
including regression, factor (topic) models, and clustering. In this paper we
describe an efficient and scalable Allreduce primitive for power-law data. We
point out scaling problems with existing butterfly and round-robin networks for
Sparse Allreduce, and show that a hybrid approach improves on both.
Furthermore, we show that Sparse Allreduce stages should be nested instead of
cascaded (as in the dense case), and that the optimum throughput Allreduce
network should be a butterfly of heterogeneous degree where degree decreases
with depth into the network. Finally, a simple replication scheme is introduced
to deal with node failures. We present experiments showing significant
improvements over existing systems such as PowerGraph and Hadoop.
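
The Allreduce operation this abstract describes (a sum of partitioned data that is then shared back to every node) can be illustrated with a minimal single-process sketch. This only shows the semantics on sparse vectors; it is not the paper's butterfly or hybrid network, and the function name `sparse_allreduce` and the dict-based vector representation are assumptions made for illustration.

```python
from collections import defaultdict


def sparse_allreduce(node_vectors):
    """Sum sparse vectors and hand every node a copy of the result.

    Each element of node_vectors is a dict mapping index -> value,
    standing in for one node's partition (hypothetical representation).
    """
    total = defaultdict(float)
    # Reduce step: accumulate only the indices that actually occur.
    for vec in node_vectors:
        for idx, val in vec.items():
            total[idx] += val
    reduced = dict(total)
    # Share-back step: every node receives its own copy of the sum.
    return [dict(reduced) for _ in node_vectors]


# Three simulated nodes with mostly disjoint non-zero indices.
nodes = [{0: 1.0, 5: 2.0}, {5: 3.0, 9: 4.0}, {0: 0.5}]
result = sparse_allreduce(nodes)
```

A real implementation would exchange only the index ranges each node needs over a network; the sketch above collapses that communication into one loop to keep the reduce and share-back steps visible.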