
    Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing

    With the breakthroughs in deep learning, recent years have witnessed a boom in artificial intelligence (AI) applications and services, spanning from personal assistants to recommendation systems to video/audio surveillance. More recently, with the proliferation of mobile computing and the Internet of Things (IoT), billions of mobile and IoT devices have been connected to the Internet, generating zillions of bytes of data at the network edge. Driven by this trend, there is an urgent need to push the AI frontier to the network edge so as to fully unleash the potential of edge big data. To meet this demand, edge computing, an emerging paradigm that pushes computing tasks and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new interdisciplinary field, edge AI or edge intelligence, is beginning to receive a tremendous amount of interest. However, research on edge intelligence is still in its infancy, and a dedicated venue for exchanging recent advances in edge intelligence is highly desired by both the computer systems and artificial intelligence communities. To this end, we conduct a comprehensive survey of recent research efforts on edge intelligence. Specifically, we first review the background and motivation for running artificial intelligence at the network edge. We then provide an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning model training and inference at the network edge. Finally, we discuss future research opportunities in edge intelligence. We believe that this survey will attract growing attention, stimulate fruitful discussions, and inspire further research ideas on edge intelligence. Comment: Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang, "Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing," Proceedings of the IEE

    Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools

    Deep Learning (DL) has had immense success in the recent past, leading to state-of-the-art results in various domains such as image recognition and natural language processing. One of the reasons for this success is the increasing size of DL models and the availability of vast amounts of training data. To keep improving the performance of DL, increasing the scalability of DL systems is necessary. In this survey, we perform a broad and thorough investigation of challenges, techniques, and tools for scalable DL on distributed infrastructures. This includes infrastructures for DL, methods for parallel DL training, multi-tenant resource scheduling, and the management of training and model data. Further, we analyze and compare 11 current open-source DL frameworks and tools and investigate which of the techniques are commonly implemented in practice. Finally, we highlight future research trends in DL systems that deserve further study. Comment: accepted at ACM Computing Surveys, to appea

    Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training

    Distributed training of deep nets is an important technique to address some of the present-day computing challenges like memory consumption and computational demands. Classical distributed approaches, synchronous or asynchronous, are based on the parameter server architecture, i.e., worker nodes compute gradients which are communicated to the parameter server while updated parameters are returned. Recently, distributed training with AllReduce operations gained popularity as well. While many of those operations seem appealing, little is reported about wall-clock training time improvements. In this paper, we carefully analyze the AllReduce-based setup, propose timing models which include network latency, bandwidth, cluster size, and compute time, and demonstrate that pipelined training with a width of two combines the best of both synchronous and asynchronous training. Specifically, for a setup consisting of a four-node GPU cluster, we show wall-clock training time improvements of up to 5.4x compared to conventional approaches. Comment: Accepted at NeurIPS 201
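
To make the idea of a timing model concrete, here is a minimal sketch using the common latency-bandwidth (alpha-beta) cost model for a ring AllReduce. It is an illustration under stated assumptions, not the timing model proposed in the paper: the function names, the ring topology, and the idealization that a pipelined iteration costs the maximum of compute and communication time are all assumptions made here.

```python
def ring_allreduce_time(num_bytes, num_workers, latency_s, bandwidth_bytes_per_s):
    """Estimate one ring AllReduce under an alpha-beta cost model (assumed here).

    A ring AllReduce takes 2 * (p - 1) communication steps, and each step
    moves a chunk of num_bytes / p bytes between neighbouring workers.
    """
    if num_workers < 2:
        return 0.0  # a single worker has nothing to exchange
    steps = 2 * (num_workers - 1)
    chunk_bytes = num_bytes / num_workers
    return steps * (latency_s + chunk_bytes / bandwidth_bytes_per_s)


def iteration_time(compute_s, comm_s, pipelined):
    """Idealized steady-state time per training iteration.

    Without pipelining, communication waits for compute to finish.
    With pipelining (assumed perfect overlap), the slower stage dominates.
    """
    if pipelined:
        return max(compute_s, comm_s)
    return compute_s + comm_s


if __name__ == "__main__":
    # Hypothetical numbers: 400 MB of gradients, 4 workers,
    # 0.1 ms latency, 1 GB/s links, 0.5 s of compute per iteration.
    comm = ring_allreduce_time(4e8, 4, 1e-4, 1e9)
    print("communication:", comm)
    print("sequential:", iteration_time(0.5, comm, pipelined=False))
    print("pipelined:", iteration_time(0.5, comm, pipelined=True))
```

Under these assumptions, the benefit of overlapping is largest when compute and communication times are comparable, and it shrinks when one of them dominates.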

    Knowledge Transferring via Model Aggregation for Online Social Care

    The Internet and the Web are being increasingly used in proactive social care to provide people, especially the vulnerable, with a better life and services, and the derived social services generate enormous amounts of data. However, strict privacy protection turns users' data into isolated islands and limits the predictive performance of standalone clients. To enable effective proactive social care and knowledge sharing among intelligent agents, this paper develops a knowledge-transferring framework via model aggregation. Under this framework, distributed clients perform on-device training, and a third-party server integrates multiple clients' models and redistributes the result to clients for knowledge transfer among users. To improve the generalizability of the knowledge sharing, we further propose a novel model aggregation algorithm, namely average difference descent aggregation (AvgDiffAgg for short). In particular, to evaluate the effectiveness of the learning algorithm, we use a case study on the early detection and prevention of suicidal ideation, and the experimental results on four datasets derived from social communities demonstrate the effectiveness of the proposed learning method.

    An AI Based Super Nodes Selection Algorithm in BlockChain Networks

    In blockchain systems, especially cryptographic currencies such as Bitcoin, the double-spending and Byzantine-generals-like problems are solved by consensus protocols among all nodes. The state-of-the-art protocols include Proof-of-Work, Proof-of-Stake, and Delegated-Proof-of-Stake. Proof-of-Work requires nodes to prove their computing power, measured in hash rate, in a crypto-puzzle-solving competition. The other two take into account the amount of stake of each node, and Delegated-Proof-of-Stake additionally introduces a vote. However, these frameworks have several drawbacks, such as consuming a large amount of electricity and pushing the whole blockchain toward a centralized system. In this paper, we propose a conceptual framework, fundamental theory, and research methodology based on artificial intelligence technology that exploits nearly complementary information of each node. We also design a particular convolutional neural network and a dynamic threshold, which select the super nodes and the random nodes, to reach consensus. Experimental results demonstrate that our framework combines the advantages of Proof-of-Work, Proof-of-Stake, and Delegated-Proof-of-Stake by avoiding complicated hash operations and monopoly. Furthermore, it compares favorably to the three state-of-the-art consensus frameworks in terms of security and the speed of transaction confirmation.

    Learning Vision-based Cohesive Flight in Drone Swarms

    This paper presents a data-driven approach to learning vision-based collective behavior from a simple flocking algorithm. We simulate a swarm of quadrotor drones and formulate the controller as a regression problem in which we generate 3D velocity commands directly from raw camera images. The dataset is created by simultaneously acquiring omnidirectional images and computing the corresponding control command from the flocking algorithm. We show that a convolutional neural network trained on the visual inputs of the drone can learn not only robust collision avoidance but also coherence of the flock in a sample-efficient manner. The neural controller effectively learns to localize other agents in the visual input, which we show by visualizing the regions with the most influence on the motion of an agent. This weakly supervised saliency map can be computed efficiently and may be used as a prior for subsequent detection and relative localization of other agents. We remove the dependence on sharing positions among flock members by taking only local visual information into account for control. Our work can therefore be seen as the first step towards a fully decentralized, vision-based flock without the need for communication or visual markers to aid detection of other agents

    Fully Distributed Multi-Robot Collision Avoidance via Deep Reinforcement Learning for Safe and Efficient Navigation in Complex Scenarios

    In this paper, we present a decentralized sensor-level collision avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent's steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy gradient based reinforcement learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy's robustness and effectiveness. We validate the learned sensor-level collision avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots with shapes and dynamics characteristics that are different from the simulated agents, in order to demonstrate the controller's robustness against the sim-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution to the safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. Videos are available at https://sites.google.com/view/hybridmrc

    Federated Learning with Cooperating Devices: A Consensus Approach for Massive IoT Networks

    Federated learning (FL) is emerging as a new paradigm for training machine learning models in distributed systems. Rather than sharing, and disclosing, the training dataset with the server, the model parameters (e.g., neural network weights and biases) are optimized collectively by large populations of interconnected devices, acting as local learners. FL can be applied to power-constrained IoT devices with slow and sporadic connections. In addition, it does not need data to be exported to third parties, preserving privacy. Despite these benefits, a main limitation of existing approaches is the centralized optimization, which relies on a server for aggregation and fusion of local parameters; this has the drawback of a single point of failure and scaling issues for increasing network size. The paper proposes a fully distributed (or server-less) learning approach: the proposed FL algorithms leverage the cooperation of devices that perform data operations inside the network by iterating local computations and mutual interactions via consensus-based methods. The approach lays the groundwork for integration of FL within 5G and beyond networks characterized by decentralized connectivity and computing, with intelligence distributed over the end devices. The proposed methodology is verified on experimental datasets collected inside an industrial IoT environment. Comment: This work received support from the CHIST-ERA III Grant RadioSense (Big Data and Process Modelling for the Smart Industry - BDSI). The paper has been accepted for publication in the IEEE Internet of Things Journal. The current arXiv contains an additional Appendix C that describes the database and the Python scripts. Published version: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8950073&isnumber=670252
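
The server-less idea of devices agreeing on parameters through mutual interactions can be illustrated with a plain consensus-averaging step. This is a minimal sketch of generic average consensus, not the algorithms proposed in the paper: the function name `consensus_step`, the `step` mixing weight, and the ring topology are assumptions chosen for illustration, and the local training that a real system would interleave between exchanges is omitted.

```python
def consensus_step(params, neighbors, step=0.5):
    """One round of neighbour averaging (generic average consensus).

    params:    list of parameter vectors, one per device
    neighbors: dict mapping device index -> list of neighbour indices
    step:      mixing weight in (0, 1]; a hypothetical tuning knob

    Each device moves toward the mean of its neighbours' parameters:
        x_i <- x_i + step * mean_j(x_j - x_i)
    No central server is involved; only neighbour-to-neighbour exchanges.
    """
    updated = []
    for i, x in enumerate(params):
        nbrs = neighbors[i]
        new_x = []
        for k in range(len(x)):
            drift = sum(params[j][k] - x[k] for j in nbrs) / len(nbrs)
            new_x.append(x[k] + step * drift)
        updated.append(new_x)
    return updated


if __name__ == "__main__":
    # Four devices on a ring, each starting from a different local model.
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    models = [[0.0], [4.0], [8.0], [12.0]]
    for _ in range(60):
        models = consensus_step(models, ring)
    print(models)
```

When every device has the same number of neighbours, as on the ring above, this update preserves the network-wide average, so all devices converge to the mean of the initial models. On irregular graphs the same rule still reaches agreement, but on a degree-weighted average rather than the plain mean.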

    Efficient Decentralized Deep Learning by Dynamic Model Averaging

    We propose an efficient protocol for decentralized training of deep neural networks from distributed data sources. The proposed protocol handles different phases of model training equally well and quickly adapts to concept drifts. This leads to a reduction of communication by an order of magnitude compared to periodically communicating state-of-the-art approaches. Moreover, we derive a communication bound that scales well with the hardness of the serialized learning problem. The reduction in communication comes at almost no cost, as the predictive performance remains virtually unchanged. Indeed, the proposed protocol retains the loss bounds of periodically averaging schemes. An extensive empirical evaluation validates a major improvement of the trade-off between model performance and communication, which could be beneficial for numerous decentralized learning applications, such as autonomous driving, or voice recognition and image classification on mobile phones.
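
The contrast with periodic averaging can be sketched as follows. The abstract does not spell out the trigger, so this is a simplified illustration under assumptions: each learner compares its model to a shared reference (the last averaged model), and a full synchronization happens only when some learner's squared distance to that reference exceeds a threshold. The function names, the `threshold` parameter, and the choice to average all learners at once are hypothetical and may differ from the actual protocol.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average(models):
    """Element-wise mean of a list of parameter vectors."""
    n = len(models)
    return [sum(values) / n for values in zip(*models)]


def dynamic_averaging_round(models, reference, threshold):
    """Synchronize only if some learner has drifted too far (assumed rule).

    Returns (models, reference, synced). When no learner violates the
    local condition, nothing is communicated and the inputs are returned
    unchanged, which is where the communication savings come from.
    """
    if any(squared_distance(m, reference) > threshold for m in models):
        avg = average(models)
        return [list(avg) for _ in models], avg, True
    return models, reference, False


if __name__ == "__main__":
    local_models = [[1.0, 0.0], [0.0, 1.0]]
    shared_reference = [0.0, 0.0]
    # Loose threshold: learners are close enough, so no communication.
    print(dynamic_averaging_round(local_models, shared_reference, 2.0))
    # Tight threshold: drift is detected and the models are averaged.
    print(dynamic_averaging_round(local_models, shared_reference, 0.5))
```

In this sketch the threshold trades communication against model divergence: a larger value means fewer synchronizations but lets local models drift further apart before they are averaged.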

    Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition

    We propose a novel decentralized feature extraction approach in federated learning to address privacy-preservation issues for speech recognition. It is built upon a quantum convolutional neural network (QCNN) composed of a quantum circuit encoder for feature extraction, and a recurrent neural network (RNN) based end-to-end acoustic model (AM). To enhance model parameter protection in a decentralized architecture, input speech is first up-streamed to a quantum computing server to extract a Mel-spectrogram, and the corresponding convolutional features are encoded using a quantum circuit algorithm with random parameters. The encoded features are then down-streamed to the local RNN model for the final recognition. The proposed decentralized framework takes advantage of the quantum learning progress to secure models and to avoid privacy leakage attacks. Testing on the Google Speech Commands Dataset, the proposed QCNN encoder attains a competitive accuracy of 95.12% in a decentralized model, which is better than the previous architectures using centralized RNN models with convolutional features. We also conduct an in-depth study of different quantum circuit encoder architectures to provide insights into designing QCNN-based feature extractors. Neural saliency analyses demonstrate a correlation between the proposed QCNN features, class activation maps, and input spectrograms. We provide an implementation for future studies. Comment: Accepted to IEEE ICASSP 2021. Code is available: https://github.com/huckiyang/QuantumSpeech-QCN