Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing
With the breakthroughs in deep learning, recent years have witnessed a boom in artificial intelligence (AI) applications and services, spanning from personal assistants to recommendation systems to video/audio surveillance. More recently, with the proliferation of mobile computing and the Internet-of-Things (IoT), billions of mobile and IoT devices are connected to the Internet, generating zillions of bytes of data at the network edge. Driven by this trend, there is an urgent need to push the AI frontier to the network edge so as to fully unleash the potential of edge big data. To meet this demand, edge computing, an emerging paradigm that pushes computing tasks and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting interdiscipline, edge AI or edge intelligence, is beginning to receive a tremendous amount of interest. However, research on edge intelligence is still in its infancy, and a dedicated venue for exchanging its recent advances is highly desired by both the computer systems and artificial intelligence communities. To this end, we conduct a comprehensive survey of recent research efforts on edge intelligence. Specifically, we first review the background and motivation for running artificial intelligence at the network edge. We then provide an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning model training and inference at the network edge. Finally, we discuss future research opportunities on edge intelligence. We believe that this survey will attract increasing attention, stimulate fruitful discussions, and inspire further research ideas on edge intelligence.
Comment: Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang, "Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing," Proceedings of the IEEE
Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools
Deep Learning (DL) has had immense success in the recent past, leading to state-of-the-art results in various domains such as image recognition and natural language processing. One of the reasons for this success is the increasing size of DL models and the availability of vast amounts of training data. To keep improving the performance of DL, increasing the scalability of DL systems is necessary. In this survey, we perform a broad and thorough investigation of challenges, techniques, and tools for scalable DL on distributed infrastructures. This covers infrastructures for DL, methods for parallel DL training, multi-tenant resource scheduling, and the management of training and model data. Further, we analyze and compare 11 current open-source DL frameworks and tools and investigate which of the techniques are commonly implemented in practice. Finally, we highlight future trends in DL systems that deserve further research.
Comment: accepted at ACM Computing Surveys, to appear
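The most widely implemented of these parallel-training techniques is synchronous data parallelism: each worker computes gradients on its own shard of a mini-batch, the gradients are averaged across workers, and every replica applies the same update. A self-contained NumPy simulation of that pattern (the linear model and worker count are illustrative assumptions):

# Synchronous data-parallel SGD, simulated on one machine with NumPy.
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(10)                            # shared model replica
X, y = rng.normal(size=(512, 10)), rng.normal(size=512)
num_workers, lr = 4, 0.1

for step in range(100):
    shards = zip(np.array_split(X, num_workers), np.array_split(y, num_workers))
    # Each worker computes the gradient of squared error on its own shard.
    grads = [2 * Xs.T @ (Xs @ w - ys) / len(ys) for Xs, ys in shards]
    w -= lr * np.mean(grads, axis=0)        # all-reduce: average, then update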
Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training
Distributed training of deep nets is an important technique for addressing present-day computing challenges like memory consumption and computational demand. Classical distributed approaches, synchronous or asynchronous, are based on the parameter-server architecture: worker nodes compute gradients that are communicated to the parameter server, which returns the updated parameters. Recently, distributed training with AllReduce operations has gained popularity as well. While many of these approaches seem appealing, little is reported about wall-clock training time improvements. In this paper, we carefully analyze the AllReduce-based setup, propose timing models which include network latency, bandwidth, cluster size, and compute time, and demonstrate that pipelined training with a width of two combines the best of both synchronous and asynchronous training. Specifically, on a four-node GPU cluster we show wall-clock training time improvements of up to 5.4x compared to conventional approaches.
Comment: Accepted at NeurIPS 2018
Knowledge Transferring via Model Aggregation for Online Social Care
The Internet and the Web are increasingly used in proactive social care to provide people, especially the vulnerable, with better lives and services, and the social services derived from them generate enormous amounts of data. However, strict privacy protection turns each user's data into an isolated island and limits the predictive performance of standalone clients. To enable effective proactive social care and knowledge sharing among intelligent agents, this paper develops a knowledge-transfer framework via model aggregation. Under this framework, distributed clients perform on-device training, and a third-party server integrates multiple clients' models and redistributes the aggregated model to the clients for knowledge transfer among users. To improve the generalizability of the shared knowledge, we further propose a novel model aggregation algorithm, namely average difference descent aggregation (AvgDiffAgg for short). To evaluate the effectiveness of the learning algorithm, we use a case study on the early detection and prevention of suicidal ideation, and experimental results on four datasets derived from social communities demonstrate the effectiveness of the proposed learning method.
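The abstract leaves the AvgDiffAgg update unspecified; the sketch below therefore shows only the generic aggregate-and-redistribute loop it plugs into, with a difference-descent style server step whose form and step size eta are assumptions, not the paper's definition:

# Server-side aggregate-and-redistribute loop (schematic; the exact
# AvgDiffAgg update is not given in the abstract, so the step is assumed).
import numpy as np

rng = np.random.default_rng(1)

def local_train(w, data, lr=0.1, steps=5):
    # On-device training: a few SGD steps toward the client's local optimum.
    for _ in range(steps):
        w = w - lr * 2 * (w - data)     # gradient of ||w - data||^2
    return w

def server_round(global_w, client_weights, eta=1.0):
    # Descend along the average difference between the global model and the
    # clients' locally trained models; eta=1.0 reduces to plain averaging.
    avg_diff = np.mean([global_w - w_k for w_k in client_weights], axis=0)
    return global_w - eta * avg_diff

client_data = [rng.normal(size=8) for _ in range(4)]   # private, never shared
global_w = np.zeros(8)
for rnd in range(20):
    local_models = [local_train(global_w.copy(), d) for d in client_data]
    global_w = server_round(global_w, local_models, eta=0.5)
print(global_w.round(2))   # approaches the mean of the clients' optima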
An AI Based Super Nodes Selection Algorithm in BlockChain Networks
In blockchain systems, especially cryptographic currencies such as Bitcoin, the double-spending and Byzantine-generals-like problems are solved by reaching consensus among all nodes. The state-of-the-art protocols include Proof-of-Work, Proof-of-Stake, and Delegated-Proof-of-Stake. Proof-of-Work urges nodes to prove their computing power, measured in hash rate, in a crypto-puzzle-solving competition. The other two take into account the amount of stake held by each node, and Delegated-Proof-of-Stake additionally introduces a vote. However, these frameworks have several drawbacks, such as consuming large amounts of electricity and driving the whole blockchain toward a centralized system. In this paper, we propose a conceptual framework, fundamental theory, and research methodology, based on artificial intelligence technology, that exploits the nearly complementary information of individual nodes. We design a particular convolutional neural network and a dynamic threshold, which select the super nodes and the random nodes, to reach consensus. Experimental results demonstrate that our framework combines the advantages of Proof-of-Work, Proof-of-Stake, and Delegated-Proof-of-Stake by avoiding complicated hash operations and monopoly. Furthermore, it compares favorably to the three state-of-the-art consensus frameworks in terms of security and the speed of transaction confirmation.
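The abstract gives neither the CNN architecture nor the threshold rule, so the following is only a schematic of the selection step: nodes are scored by some learned model, a dynamic threshold (here mean plus one standard deviation, an assumption) picks the super nodes, and a few random nodes are added to discourage monopoly:

# Schematic super-node selection with a dynamic threshold (details assumed;
# placeholder scores stand in for the paper's CNN outputs).
import numpy as np

rng = np.random.default_rng(7)

def select_nodes(scores, num_random=3):
    # Dynamic threshold: one standard deviation above the mean score.
    threshold = scores.mean() + scores.std()
    super_nodes = np.flatnonzero(scores > threshold)
    # Add randomly sampled non-super nodes to resist centralization.
    rest = np.setdiff1d(np.arange(len(scores)), super_nodes)
    random_nodes = rng.choice(rest, size=min(num_random, len(rest)),
                              replace=False)
    return super_nodes, random_nodes

scores = rng.normal(size=100)       # stand-in for per-node CNN scores
supers, randoms = select_nodes(scores)
print(len(supers), "super nodes;", len(randoms), "random nodes")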
Learning Vision-based Cohesive Flight in Drone Swarms
This paper presents a data-driven approach to learning vision-based collective behavior from a simple flocking algorithm. We simulate a swarm of quadrotor drones and formulate the controller as a regression problem in which we generate 3D velocity commands directly from raw camera images. The dataset is created by simultaneously acquiring omnidirectional images and computing the corresponding control commands from the flocking algorithm. We show that a convolutional neural network trained on the visual inputs of the drone can learn not only robust collision avoidance but also coherence of the flock in a sample-efficient manner. The neural controller effectively learns to localize other agents in the visual input, which we show by visualizing the regions with the most influence on the motion of an agent. This weakly supervised saliency map can be computed efficiently and may be used as a prior for subsequent detection and relative localization of other agents. We remove the dependence on sharing positions among flock members by taking only local visual information into account for control. Our work can therefore be seen as a first step towards a fully decentralized, vision-based flock without the need for communication or visual markers to aid the detection of other agents.
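The training targets come from a classical flocking rule; a Reynolds-style command combining separation, cohesion, and alignment is one plausible form (the exact algorithm and gains below are assumptions, not taken from the paper):

# Reynolds-style flocking command used to label images (gains are assumed).
import numpy as np

def flocking_command(pos, vel, neighbors_pos, neighbors_vel,
                     k_sep=1.5, k_coh=1.0, k_ali=0.5, r_sep=1.0):
    sep = np.zeros(3)
    for p in neighbors_pos:
        d = pos - p
        if np.linalg.norm(d) < r_sep:            # push away from close agents
            sep += d / (np.linalg.norm(d) ** 2 + 1e-6)
    coh = np.mean(neighbors_pos, axis=0) - pos   # pull toward flock center
    ali = np.mean(neighbors_vel, axis=0) - vel   # match neighbors' velocity
    return k_sep * sep + k_coh * coh + k_ali * ali   # 3D velocity command

# A CNN is then trained to regress this command from the drone's raw images,
# removing the need for shared positions at deployment time.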
Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios
In this paper, we present a decentralized sensor-level collision avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent's steering commands in terms of movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy's robustness and effectiveness. We validate the learned sensor-level collision avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios, including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots whose shapes and dynamics differ from those of the simulated agents, in order to demonstrate the controller's robustness against the sim-to-real modeling error. Finally, we show that the collision avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation of a single robot working in a dense, real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. Videos are available at https://sites.google.com/view/hybridmrc
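A sensor-level policy of this kind is typically a small network that consumes a laser scan (plus goal and current velocity) and emits a velocity command; the PyTorch architecture below is a hypothetical stand-in, not the paper's network:

# Hypothetical sensor-level policy: laser scan + goal + velocity -> command.
import torch
import torch.nn as nn

class SensorPolicy(nn.Module):
    def __init__(self, scan_size=512):
        super().__init__()
        self.scan_net = nn.Sequential(           # 1D conv over the laser scan
            nn.Conv1d(1, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten())
        feat = self.scan_net(torch.zeros(1, 1, scan_size)).shape[1]
        self.head = nn.Sequential(
            nn.Linear(feat + 4, 128), nn.ReLU(),
            nn.Linear(128, 2))                   # linear and angular velocity

    def forward(self, scan, goal, vel):
        z = self.scan_net(scan.unsqueeze(1))
        return self.head(torch.cat([z, goal, vel], dim=1))

policy = SensorPolicy()
cmd = policy(torch.randn(8, 512), torch.randn(8, 2), torch.randn(8, 2))
print(cmd.shape)   # torch.Size([8, 2]); trained with a policy-gradient method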
Federated Learning with Cooperating Devices: A Consensus Approach for Massive IoT Networks
Federated learning (FL) is emerging as a new paradigm for training machine learning models in distributed systems. Rather than sharing, and disclosing, the training dataset with the server, the model parameters (e.g., neural network weights and biases) are optimized collectively by large populations of interconnected devices acting as local learners. FL can be applied to power-constrained IoT devices with slow and sporadic connections. In addition, it does not require data to be exported to third parties, preserving privacy. Despite these benefits, a main limitation of existing approaches is centralized optimization, which relies on a server for the aggregation and fusion of local parameters; this entails a single point of failure and scaling issues as the network grows. This paper proposes a fully distributed (or server-less) learning approach: the proposed FL algorithms leverage the cooperation of devices that perform data operations inside the network by iterating local computations and mutual interactions via consensus-based methods. The approach lays the groundwork for the integration of FL within 5G and beyond networks characterized by decentralized connectivity and computing, with intelligence distributed over the end-devices. The proposed methodology is verified on experimental datasets collected inside an industrial IoT environment.
Comment: This work received support from the CHIST-ERA III Grant RadioSense (Big Data and Process Modelling for the Smart Industry - BDSI). The paper has been accepted for publication in the IEEE Internet of Things Journal. The current arXiv version contains an additional Appendix C that describes the database and the Python scripts. Published version: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8950073&isnumber=670252
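In a server-less, consensus-based scheme, each device alternates a local gradient step with a mixing step over its neighbors. The paper's algorithms refine this pattern, so the sketch below only shows the generic local-step plus consensus-step skeleton, with a ring topology and quadratic local losses as assumptions:

# Generic consensus-based FL skeleton (assumed ring topology; the paper's
# actual algorithms elaborate on this local-step + mixing-step pattern).
import numpy as np

rng = np.random.default_rng(3)
n_devices, dim, lr, eps = 8, 5, 0.05, 0.3
targets = [rng.normal(size=dim) for _ in range(n_devices)]  # private local data
w = [np.zeros(dim) for _ in range(n_devices)]

for it in range(200):
    # Local computation: one gradient step on each device's own loss.
    w = [wi - lr * 2 * (wi - ti) for wi, ti in zip(w, targets)]
    # Mutual interaction: consensus averaging with ring neighbors only.
    w = [wi + eps * ((w[(i - 1) % n_devices] - wi) +
                     (w[(i + 1) % n_devices] - wi))
         for i, wi in enumerate(w)]

print(np.std(w, axis=0).round(3))   # devices approach a common model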
Efficient Decentralized Deep Learning by Dynamic Model Averaging
We propose an efficient protocol for the decentralized training of deep neural networks from distributed data sources. The proposed protocol handles different phases of model training equally well and quickly adapts to concept drift. This leads to a reduction in communication by an order of magnitude compared to periodically communicating state-of-the-art approaches. Moreover, we derive a communication bound that scales well with the hardness of the serialized learning problem. The reduction in communication comes at almost no cost, as the predictive performance remains virtually unchanged. Indeed, the proposed protocol retains the loss bounds of periodic averaging schemes. An extensive empirical evaluation validates a major improvement in the trade-off between model performance and communication, which could benefit numerous decentralized learning applications, such as autonomous driving, or voice recognition and image classification on mobile phones.
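Dynamic averaging of this kind typically replaces a fixed communication period with a divergence test: a learner stays silent while its model remains close to the last synchronized reference, and triggers a global average only when it drifts too far. A minimal sketch of that trigger, with the threshold delta and quadratic local losses as assumptions:

# Dynamic model averaging: synchronize only when local drift exceeds delta.
import numpy as np

rng = np.random.default_rng(5)
n, dim, lr, delta = 6, 4, 0.1, 0.5
targets = [rng.normal(size=dim) for _ in range(n)]
w = [np.zeros(dim) for _ in range(n)]
w_ref = np.zeros(dim)                       # last synchronized model
syncs = 0

for step in range(300):
    w = [wi - lr * 2 * (wi - ti) for wi, ti in zip(w, targets)]  # local SGD
    # Trigger condition (assumed form): average squared drift from reference.
    if np.mean([np.sum((wi - w_ref) ** 2) for wi in w]) > delta:
        w_ref = np.mean(w, axis=0)          # one round of global averaging
        w = [w_ref.copy() for _ in range(n)]
        syncs += 1

print(f"{syncs} synchronizations in 300 steps")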
Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition
We propose a novel decentralized feature extraction approach in federated learning to address privacy-preservation issues in speech recognition. It is built upon a quantum convolutional neural network (QCNN) composed of a quantum circuit encoder for feature extraction and a recurrent neural network (RNN) based end-to-end acoustic model (AM). To enhance model parameter protection in a decentralized architecture, the input speech is first up-streamed to a quantum computing server, which extracts the Mel-spectrogram, and the corresponding convolutional features are encoded using a quantum circuit algorithm with random parameters. The encoded features are then down-streamed to the local RNN model for the final recognition. The proposed decentralized framework takes advantage of the quantum learning process to secure models and to avoid privacy leakage attacks. Tested on the Google Speech Commands Dataset, the proposed QCNN encoder attains a competitive accuracy of 95.12% in a decentralized model, which is better than previous architectures using centralized RNN models with convolutional features. We also conduct an in-depth study of different quantum circuit encoder architectures to provide insights into designing QCNN-based feature extractors. Neural saliency analyses demonstrate a correlation between the proposed QCNN features, class activation maps, and input spectrograms. We provide an implementation for future studies.
Comment: Accepted to IEEE ICASSP 2021. Code is available: https://github.com/huckiyang/QuantumSpeech-QCN
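A quantum circuit encoder of this flavor is often implemented as a "quanvolution": each 2x2 patch of the spectrogram is angle-encoded into qubits, passed through a randomly parameterized circuit, and measured to produce feature channels. A minimal PennyLane sketch of that pattern follows; the 4-qubit circuit, the encoding, and the random layer are generic assumptions, not the paper's exact encoder:

# Minimal quanvolution-style encoder sketch (generic; not the paper's circuit).
import numpy as np
import pennylane as qml

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)
rand_weights = np.random.uniform(0, 2 * np.pi, size=(1, n_qubits))  # random, fixed

@qml.qnode(dev)
def patch_encoder(patch):
    # Angle-encode a flattened 2x2 spectrogram patch into 4 qubits.
    for i, x in enumerate(patch):
        qml.RY(np.pi * x, wires=i)
    # Random (untrained) entangling layer, as in quanvolutional encoders.
    qml.RandomLayers(rand_weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

spec = np.random.rand(8, 8)                 # stand-in Mel-spectrogram
features = np.array([patch_encoder(spec[r:r+2, c:c+2].ravel())
                     for r in range(0, 8, 2) for c in range(0, 8, 2)])
print(features.shape)                       # (16, 4): 4 channels per patch
# These encoded features would be down-streamed to the local RNN model.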