50 research outputs found

    Edge-Cloud Polarization and Collaboration: A Comprehensive Survey for AI

    Full text link
    Influenced by the great success of deep learning via cloud computing and the rapid development of edge chips, research in artificial intelligence (AI) has shifted to both computing paradigms, i.e., cloud computing and edge computing. In recent years, we have witnessed significant progress in developing more advanced AI models on cloud servers that surpass traditional deep learning models, owing to model innovations (e.g., Transformers, pretrained model families), the explosion of training data and soaring computing capabilities. However, edge computing, and especially edge-cloud collaborative computing, is still in its infancy, since resource-constrained IoT scenarios allow only very limited algorithms to be deployed. In this survey, we conduct a systematic review of both cloud and edge AI. Specifically, we are the first to set up the collaborative learning mechanism for cloud and edge modeling, with a thorough review of the architectures that enable such a mechanism. We also discuss the potential and practical experiences of some ongoing advanced edge AI topics, including pretraining models, graph neural networks and reinforcement learning. Finally, we discuss the promising directions and challenges in this field. Comment: 20 pages, Transactions on Knowledge and Data Engineering

    Dual Correction Strategy for Ranking Distillation in Top-N Recommender System

    Full text link
    Knowledge Distillation (KD), which transfers the knowledge of a well-trained large model (teacher) to a small model (student), has become an important area of research for the practical deployment of recommender systems. Recently, Relaxed Ranking Distillation (RRD) has shown that distilling the ranking information in the recommendation list significantly improves performance. However, the method still has limitations in that 1) it does not fully utilize the prediction errors of the student model, which makes training inefficient, and 2) it only distills the user-side ranking information, which provides an insufficient view under sparse implicit feedback. This paper presents the Dual Correction strategy for Distillation (DCD), which transfers ranking information from the teacher model to the student model in a more efficient manner. Most importantly, DCD uses the discrepancy between the teacher model's and the student model's predictions to decide which knowledge to distill. By doing so, DCD essentially provides learning guidance tailored to "correcting" what the student model has failed to predict accurately. This process is applied to transfer ranking information from both the user side and the item side, addressing sparse implicit user feedback. Our experiments show that the proposed method outperforms the state-of-the-art baselines, and ablation studies validate the effectiveness of each component.
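
    The key idea above is that the teacher-student prediction discrepancy decides what to distill. Below is a minimal, hypothetical PyTorch sketch of that selection step; the ranking-based discrepancy measure and all names are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: discrepancy-driven selection of items to distill, loosely
# following the idea described above. Names and the measure are hypothetical.
import torch

def select_discrepant_items(teacher_scores, student_scores, k):
    """Pick the k items per user where teacher and student rankings disagree most.

    teacher_scores, student_scores: (num_users, num_items) predicted relevance.
    Returns indices of shape (num_users, k).
    """
    # Rank positions (0 = highest score) under each model.
    teacher_rank = torch.argsort(torch.argsort(teacher_scores, dim=1, descending=True), dim=1)
    student_rank = torch.argsort(torch.argsort(student_scores, dim=1, descending=True), dim=1)
    # Discrepancy: how far the student's ranking deviates from the teacher's.
    discrepancy = (teacher_rank - student_rank).abs().float()
    return discrepancy.topk(k, dim=1).indices

# Toy usage with random scores for 4 users and 10 items.
teacher = torch.rand(4, 10)
student = torch.rand(4, 10)
targets = select_discrepant_items(teacher, student, k=3)
print(targets.shape)  # torch.Size([4, 3])
```

    In DCD-style training, the selected indices would then feed a ranking distillation loss applied from both the user side and the item side, per the abstract.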

    Hierarchical Information and Data Modeling for Neural-based Recommender Systems

    Full text link
    In the era of information flooding, efficient information retrieval has become a problem that cannot be ignored. A recommender system, an information filtering system that provides customized suggestions of items most likely to interest users, has been applied to many customer-oriented services. Recently, more and more neural-based and graph-based models have been studied and adopted in recommender systems, owing to their strength in dealing with fundamental machine learning problems. However, most existing approaches merely focus on accuracy improvements and ignore that higher recommendation accuracy does not directly imply better recommendations. This thesis aims to propose novel methods for recommender systems that enable higher recommendation accuracy and higher recommendation satisfaction simultaneously. Three approaches are proposed from two perspectives: (1) generating more personalized individual embeddings and (2) reducing inference latency. To improve the embedding learning process, the valuable information stored in pairwise preference differences and the hierarchical structures exhibited in user (item) latent relationships are explored. First, a novel and straightforward pointwise training strategy, named Difference Embedding (DifE), is proposed to capture the valuable information retained in pairwise preference differences. More specifically, by using a novel projection on the designed pairwise difference function, the final derived pointwise loss function allows recommendation models to encode valuable personalized information and achieve more customized predictions. Moreover, a U-shaped graph convolutional network-based recommender system, named UGCN, is proposed to explore the implicit and inherent hierarchies of users and items. Concretely, with its hierarchical encoding-decoding process, the UGCN model captures user-item relationships at various resolution scales, resulting in better preference customization. To reduce inference latency, two knowledge distillation methods are also proposed for the model construction and training process. By training with the valuable information distilled from sophisticated teacher recommenders, the compact student model can achieve strong recommendation performance and a lightweight architecture simultaneously.
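
    To make the pairwise-difference idea behind DifE more concrete, here is a minimal sketch in which a simple matrix-factorization scorer is trained on the score difference between a preferred and a less-preferred item; the scorer, the loss, and all names are illustrative assumptions rather than the thesis's actual DifE formulation.

```python
# Hedged sketch: folding a pairwise preference difference into a single scalar
# loss, in the spirit of the DifE idea above. Purely illustrative.
import torch
import torch.nn as nn

class MFScorer(nn.Module):
    def __init__(self, n_users, n_items, dim=32):
        super().__init__()
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)

    def forward(self, u, i):
        # Dot-product preference score for each (user, item) pair.
        return (self.user(u) * self.item(i)).sum(-1)

def difference_loss(model, u, pos_i, neg_i):
    # Score difference between a preferred and a less-preferred item; pushing
    # it to be positive encodes the preference difference in the embeddings.
    diff = model(u, pos_i) - model(u, neg_i)
    return nn.functional.softplus(-diff).mean()

# Toy usage with random user/item indices.
model = MFScorer(n_users=100, n_items=500)
u = torch.randint(0, 100, (8,))
pos, neg = torch.randint(0, 500, (8,)), torch.randint(0, 500, (8,))
loss = difference_loss(model, u, pos, neg)
loss.backward()
```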

    Context-sensitive graph representation learning

    Get PDF
    Graph Convolutional Networks (GCNs) are a powerful emerging deep learning technique for learning on graph data. However, GCNs still face challenges: for example, the models are shallow, and performance is poor when labelled nodes are severely scarce. In this paper, we propose a Multi-Semantic Aligned Graph Convolutional Network (MSAGCN), which contains two fundamental operations, multi-angle aggregation and semantic alignment, to address both challenges simultaneously. The core of MSAGCN is aggregating nodes that belong to the same class from three perspectives, namely nodes, features, and graph structure, with the expectation that the obtained node features are mapped close together. Specifically, multi-angle aggregation extracts features from three angles of the labelled nodes, and semantic alignment aligns the semantics of the extracted features to reinforce the content shared across angles. In this way, the over-smoothing and over-fitting problems of GCNs can be alleviated. We perform the node clustering task on three citation datasets, and the experimental results demonstrate that our method outperforms the state-of-the-art (SOTA) baselines.
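
    As a rough illustration of aggregating same-class nodes from several angles and aligning their semantics, the sketch below averages labelled node features per class in each view and penalizes disagreement between the per-view class prototypes; the prototype-based alignment and all names are assumptions, not MSAGCN's exact operations.

```python
# Hedged sketch: per-class aggregation in multiple views plus a simple
# cross-view alignment penalty. Illustrative only.
import torch
import torch.nn.functional as F

def class_prototypes(features, labels, num_classes):
    """Mean feature per class: one simple form of same-class aggregation."""
    protos = torch.zeros(num_classes, features.size(1))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(0)
    return protos

def alignment_loss(views, labels, num_classes):
    """Encourage class prototypes computed from different views to coincide."""
    protos = [F.normalize(class_prototypes(v, labels, num_classes), dim=1) for v in views]
    pair_losses = []
    for i in range(len(protos)):
        for j in range(i + 1, len(protos)):
            # Cosine distance between the same class's prototypes in two views.
            pair_losses.append((1 - (protos[i] * protos[j]).sum(1)).mean())
    return torch.stack(pair_losses).mean()

# Toy usage: three views (e.g., node-, feature-, and structure-level embeddings).
views = [torch.randn(50, 16) for _ in range(3)]
labels = torch.randint(0, 4, (50,))
print(alignment_loss(views, labels, num_classes=4))
```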

    3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching

    Full text link
    We tackle the essential task of finding dense visual correspondences between a pair of images. This is a challenging problem due to factors such as poor texture, repetitive patterns, illumination variation, and motion blur in practical scenarios. In contrast to methods that use dense correspondence ground truths as direct supervision for local feature matching training, we train 3DG-STFM, a multi-modal matching model (Teacher), to enforce depth consistency under 3D dense correspondence supervision and transfer the knowledge to a 2D unimodal matching model (Student). Both teacher and student models consist of two transformer-based matching modules that obtain dense correspondences in a coarse-to-fine manner. The teacher model guides the student model to learn RGB-induced depth information for matching on both the coarse and fine branches. We also evaluate 3DG-STFM on a model compression task. To the best of our knowledge, 3DG-STFM is the first student-teacher learning method for the local feature matching task. The experiments show that our method outperforms state-of-the-art methods on indoor and outdoor camera pose estimation and on homography estimation. Code is available at: https://github.com/Ryan-prime/3DG-STFM
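
    A common way to transfer matching knowledge from a teacher to a student is to align their correspondence distributions. The sketch below shows one such distillation loss over dense matching scores; the KL formulation, temperature, and tensor shapes are illustrative assumptions, not the paper's exact objective.

```python
# Hedged sketch: distilling a teacher's dense-matching confidences into a
# student. Generic illustration, not 3DG-STFM's actual loss.
import torch
import torch.nn.functional as F

def matching_distill_loss(student_logits, teacher_logits, tau=0.1):
    """KL divergence between row-wise matching distributions.

    Both tensors have shape (num_query_feats, num_ref_feats): a score for every
    candidate correspondence between the two images.
    """
    t = F.softmax(teacher_logits / tau, dim=-1)       # teacher's soft matches
    s = F.log_softmax(student_logits / tau, dim=-1)   # student's log-probs
    return F.kl_div(s, t, reduction="batchmean")

# Toy usage with random coarse-level matching scores.
student = torch.randn(64, 64)
teacher = torch.randn(64, 64)
print(matching_distill_loss(student, teacher))
```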

    Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices

    Get PDF
    Running machine learning (ML) algorithms on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, the resources available on the embedded platform, and the application budget (i.e., real-time requirements, power constraints, etc.). This has required the development of dedicated solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking into consideration the aforementioned challenges, especially memory requirements. The dissertation contributes to the construction of the Edge Learning Machine (ELM) environment, a platform-independent open-source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, the work includes the following steps, which are reflected in the thesis structure. First, we present a performance analysis of state-of-the-art shallow ML algorithms, including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, preprocessing techniques, and configurations shows performance comparable to a desktop machine and highlights the impact of these factors on overall performance. Second, despite the common assumption that the scarcity of resources limits TinyML to model inference, we go a step further and enable self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves accuracy comparable to the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present a memory allocation strategy for convolutional neural network (CNN) layers that optimizes memory requirements. This approach reduces the memory footprint without affecting accuracy or latency. Moreover, e-skin systems share the main requirements of the TinyML field: enabling intelligence with low memory, low power consumption, and low latency. We therefore design an efficient Tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions. A major contribution of the thesis is CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state-of-the-art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, whose results are reported and discussed. The ELM framework is open source and is becoming a useful, versatile toolkit for the IoT and TinyML research and development community.
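
    The efficiency of binary networks such as those targeted by CBin-NN comes from replacing multiply-accumulate operations with XNOR and popcount over bit-packed sign vectors. The sketch below illustrates that generic trick in NumPy; it is not CBin-NN's API or kernel implementation.

```python
# Hedged sketch: the XNOR + popcount dot product at the heart of binary NNs.
# Generic illustration only; real kernels operate on machine words in C.
import numpy as np

def binarize_pack(x):
    """Map real values to {+1, -1} by sign and pack the sign bits into bytes."""
    bits = (x >= 0).astype(np.uint8)          # 1 encodes +1, 0 encodes -1
    return np.packbits(bits), x.size

def xnor_popcount_dot(packed_a, packed_b, n):
    """Dot product of two {+1, -1} vectors from their packed sign bits."""
    xnor = np.unpackbits(~(packed_a ^ packed_b))[:n]   # 1 where signs agree
    matches = int(xnor.sum())
    return 2 * matches - n                             # agreements minus disagreements

# Toy usage: compare against the straightforward sign-vector dot product.
a, b = np.random.randn(64), np.random.randn(64)
pa, n = binarize_pack(a)
pb, _ = binarize_pack(b)
print(xnor_popcount_dot(pa, pb, n),
      int(np.sign(a).astype(int) @ np.sign(b).astype(int)))  # the two agree
```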

    Vertical Federated Graph Neural Network for Recommender System

    Full text link
    Conventional recommender systems are required to train the recommendation model using a centralized database. However, due to data privacy concerns, this is often impractical when multiple parties are involved in recommender system training. Federated learning appears to be an excellent solution to the data isolation and privacy problem. Recently, graph neural networks (GNNs) have become a promising approach for federated recommender systems. However, a key challenge is to conduct embedding propagation while preserving the privacy of the graph structure, and few studies have investigated federated GNN-based recommender systems. Our study proposes the first vertical federated GNN-based recommender system, called VerFedGNN. We design a framework to transmit (i) the summation of neighbor embeddings using random projection, and (ii) gradients of the public parameters perturbed by a ternary quantization mechanism. Empirical studies show that VerFedGNN achieves prediction accuracy competitive with existing privacy-preserving GNN frameworks while providing enhanced privacy protection for users' interaction information. Comment: 17 pages, 9 figures
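
    The two quantities the framework transmits, randomly projected neighbor-embedding sums and ternary-quantized gradients, can be sketched as follows; the projection dimensions and the stochastic quantization scheme are illustrative assumptions, not VerFedGNN's exact protocol.

```python
# Hedged sketch: (i) compress a neighbor-embedding sum with a random projection,
# (ii) perturb a gradient with a simple unbiased ternary quantizer. Illustrative.
import numpy as np

rng = np.random.default_rng(0)

def project_neighbor_sum(neighbor_embeddings, proj_dim):
    """Sum a node's neighbor embeddings, then random-project to proj_dim."""
    summed = neighbor_embeddings.sum(axis=0)                    # (d,)
    P = rng.normal(scale=1.0 / np.sqrt(proj_dim),
                   size=(neighbor_embeddings.shape[1], proj_dim))
    return summed @ P                                           # (proj_dim,)

def ternary_quantize(grad):
    """Stochastically map each entry to {-s, 0, +s}, unbiased in expectation."""
    s = np.abs(grad).max()
    if s == 0:
        return np.zeros_like(grad)
    p = np.abs(grad) / s                      # probability of keeping the entry
    keep = rng.random(grad.shape) < p
    return np.sign(grad) * s * keep

# Toy usage: 5 neighbors with 16-dim embeddings, and a 16-dim gradient.
neighbors = rng.normal(size=(5, 16))
grad = rng.normal(size=16)
print(project_neighbor_sum(neighbors, proj_dim=8).shape)
print(np.unique(np.sign(ternary_quantize(grad))))  # values drawn from {-1, 0, +1}
```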