Decentralized Machine Learning based Energy Efficient Routing and Intrusion Detection in Unmanned Aerial Network (UAV)
Federated learning (FL) is a decentralized machine learning paradigm: without disclosing locally stored sensitive information, FL enables multiple clients to collaboratively solve conventional distributed ML problems under the coordination of a central server. This research relies heavily on machine learning and deep learning techniques to classify FL approaches. The next generation of wireless networks is anticipated to incorporate unmanned aerial vehicles (UAVs), such as drones, into both civilian and military applications. For these applications, using artificial intelligence (AI), and more specifically machine learning (ML) methods, to enhance the intelligence of UAV networks is both desirable and necessary. Unfortunately, most existing FL paradigms are still centralized, with a single entity responsible for network-wide ML model aggregation and fusion. This is inappropriate for UAV networks, which frequently feature unreliable nodes and links, and it introduces a potential single point of failure. The high mobility of UAVs causes frequent packet loss and weak links between UAVs, which reduce the reliability of data delivery. Unbalanced energy consumption leads to premature UAV failures and shortens the lifetime of the network, degrading the overall network in turn. In this paper, we focus mainly on securing a UAV network in a surveillance context, where information is collected from many kinds of sources. The trust policies are based on peer-to-peer information that is confirmed by the UAV network. The proposed system uses a pre-shared UAV list or asymmetric-key encryption for security, and it can identify false information when a UAV in the network is physically hijacked. A Secure Location with Intrusion Detection System (SLIDS) provides secure routing paths, while location-based energy-efficient routing (LEER) conserves energy by predicting link breakage and discovering paths based on degree of connectivity. Thus, the proposed novel architecture, named Decentralized Federated Learning with Secure Location and Intrusion Detection System (DFL-SLIDS), achieves 98% routing overhead, 93% end-to-end delay, 92% energy efficiency, 86.4% packet delivery ratio (PDR), and 97% throughput.
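The abstract's trust policy, peer-to-peer reports checked against a pre-shared UAV list using asymmetric cryptography, can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation: TrustRegistry and verify_report are hypothetical names, and Ed25519 signatures stand in for the unspecified asymmetric scheme.

```python
# Hedged sketch: verifying peer reports against a pre-shared UAV key list,
# in the spirit of the paper's trust policy. All names here are illustrative.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

class TrustRegistry:
    """Holds the pre-shared public keys of known UAVs (hypothetical helper)."""
    def __init__(self) -> None:
        self._keys: dict[str, Ed25519PublicKey] = {}

    def register(self, uav_id: str, public_key: Ed25519PublicKey) -> None:
        self._keys[uav_id] = public_key

    def verify_report(self, uav_id: str, payload: bytes, signature: bytes) -> bool:
        """Accept a report only if signed by the claimed UAV's pre-shared key."""
        key = self._keys.get(uav_id)
        if key is None:            # unknown sender -> untrusted
            return False
        try:
            key.verify(signature, payload)
            return True
        except InvalidSignature:   # forged or tampered report -> flagged as wrong
            return False

# Usage: each UAV signs its surveillance report before broadcasting it.
sk = Ed25519PrivateKey.generate()
registry = TrustRegistry()
registry.register("uav-7", sk.public_key())

report = b"pos=(12.1,48.3,120m);target=none"
assert registry.verify_report("uav-7", report, sk.sign(report))
assert not registry.verify_report("uav-7", b"tampered", sk.sign(report))
```

Under this sketch's assumptions, a report injected by a physically hijacked or unknown node fails verification against the pre-shared key list and can be discarded.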
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging
Deep learning at scale is dominated by communication time. Distributing samples across nodes usually yields the best performance, but poses scaling challenges due to global information dissemination and load imbalance across uneven sample lengths. State-of-the-art decentralized optimizers mitigate the problem, but require more iterations to achieve the same accuracy as their globally-communicating counterparts. We present Wait-Avoiding Group Model Averaging (WAGMA) SGD, a wait-avoiding stochastic optimizer that reduces global communication via subgroup weight exchange. The key insight is a combination of algorithmic changes to the averaging scheme and the use of a group allreduce operation. We prove the convergence of WAGMA-SGD, and empirically show that it retains convergence rates similar to Allreduce-SGD. For evaluation, we train ResNet-50 on ImageNet; Transformer for machine translation; and deep reinforcement learning for navigation at scale. Compared with state-of-the-art decentralized SGD variants, WAGMA-SGD significantly improves training throughput (e.g., 2.1x on 1,024 GPUs for reinforcement learning), and achieves the fastest time-to-solution (e.g., the highest score using the shortest training time for Transformer).

Comment: Published in IEEE Transactions on Parallel and Distributed Systems (IEEE TPDS), vol. 32, no. 7, pp. 1725-1739, 1 July 2021.
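As a rough sketch of the group-averaging idea, the single-process NumPy simulation below has each worker take a local SGD step and then average weights only within a small, rotating group; rotation mixes information across all workers over time, approximating a global allreduce at a fraction of the communication. This is an illustration of group averaging only; it does not reproduce the paper's wait-avoiding, asynchronous execution or its group allreduce implementation, and the objective is a stand-in.

```python
# Hedged sketch of subgroup model averaging (not the authors' implementation):
# local SGD steps followed by averaging within small, rotating groups.
import numpy as np

rng = np.random.default_rng(0)
n_workers, group_size, dim, steps, lr = 8, 4, 10, 100, 0.1

# Each worker holds its own replica of the model weights.
w = rng.normal(size=(n_workers, dim))

def local_gradient(weights: np.ndarray) -> np.ndarray:
    # Stand-in for a stochastic gradient of the quadratic objective ||w||^2 / 2.
    return weights + 0.01 * rng.normal(size=weights.shape)

for step in range(steps):
    # Local SGD step on every worker.
    w -= lr * local_gradient(w)
    # Rotate group membership so averages eventually span all workers.
    order = np.roll(np.arange(n_workers), step)
    for g in range(0, n_workers, group_size):
        members = order[g:g + group_size]
        w[members] = w[members].mean(axis=0)   # group "allreduce" (average)

print("max disagreement between replicas:", np.abs(w - w.mean(axis=0)).max())
```

Only group_size workers exchange weights per step, so per-step communication volume is independent of the total worker count; the rotation is what lets information still propagate globally over successive steps.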
Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Distributed deep learning has become common practice for reducing overall training time by exploiting multiple computing devices (e.g., GPUs/TPUs) as the sizes of deep models and data sets increase. However, data communication between computing devices can become a bottleneck that limits system scalability. How to address the communication problem in distributed deep learning has recently become a hot research topic. In this paper, we provide a comprehensive survey of communication-efficient distributed training algorithms, covering both system-level and algorithmic-level optimizations. At the system level, we demystify the system design and implementation choices that reduce communication cost. At the algorithmic level, we compare different algorithms in terms of theoretical convergence bounds and communication complexity. Specifically, we first propose a taxonomy of data-parallel distributed training algorithms with four main dimensions: communication synchronization, system architectures, compression techniques, and parallelism of communication and computing. We then discuss studies addressing problems along these four dimensions and compare their communication costs. We further compare the convergence rates of the different algorithms, which indicates how fast they converge to a solution in terms of iterations. Based on the system-level communication cost analysis and the theoretical convergence speed comparison, we help readers understand which algorithms are more efficient in specific distributed environments, and we extrapolate potential directions for further optimization.
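To make the compression dimension of the taxonomy concrete, here is a hedged sketch of top-k gradient sparsification with local error feedback, one representative technique such surveys cover; the helper names are illustrative, not from the paper.

```python
# Hedged sketch of top-k gradient sparsification with error feedback:
# only the k largest-magnitude gradient entries are communicated, and the
# residual is carried over into the next step's gradient locally.
import numpy as np

def topk_compress(grad: np.ndarray, k: int):
    """Keep the k largest-magnitude entries; return (indices, values, residual)."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    values = grad[idx]
    residual = grad.copy()
    residual[idx] = 0.0          # error kept locally, added to the next gradient
    return idx, values, residual

def topk_decompress(idx: np.ndarray, values: np.ndarray, dim: int) -> np.ndarray:
    """Rebuild the dense (sparse-valued) gradient a receiver would apply."""
    dense = np.zeros(dim)
    dense[idx] = values
    return dense

rng = np.random.default_rng(1)
g = rng.normal(size=1000)
idx, vals, res = topk_compress(g, k=10)      # ~1% of entries sent over the wire
dense = topk_decompress(idx, vals, g.size)   # what a receiver reconstructs
print("entries sent:", len(vals), "| residual norm:", float(np.linalg.norm(res)))
```

The design trade-off this illustrates is exactly the survey's framing: compression cuts per-iteration communication volume, while the accumulated residual (error feedback) is what preserves the convergence guarantees that the algorithmic-level comparison analyzes.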