50 research outputs found
Edge-Cloud Polarization and Collaboration: A Comprehensive Survey for AI
Influenced by the great success of deep learning via cloud computing and the
rapid development of edge chips, research in artificial intelligence (AI) has
shifted to both of the computing paradigms, i.e., cloud computing and edge
computing. In recent years, we have witnessed significant progress in
developing more advanced AI models on cloud servers that surpass traditional
deep learning models owing to model innovations (e.g., Transformers, Pretrained
families), explosion of training data and soaring computing capabilities.
However, edge computing, especially edge-cloud collaborative computing, is
still in its infancy, since resource-constrained IoT scenarios allow only a
very limited set of algorithms to be deployed. In this survey, we conduct a
systematic review of both cloud and edge AI. Specifically, we are the first to
formulate a collaborative learning mechanism between cloud and edge modeling,
together with a thorough review of the architectures that enable such a
mechanism. We also discuss the potential of, and practical experience with,
several ongoing advanced edge AI topics, including pretrained models, graph
neural networks, and reinforcement learning. Finally, we discuss the promising
directions and challenges in this field.
Comment: 20 pages, Transactions on Knowledge and Data Engineering
Dual Correction Strategy for Ranking Distillation in Top-N Recommender System
Knowledge Distillation (KD), which transfers the knowledge of a well-trained
large model (teacher) to a small model (student), has become an important area
of research for practical deployment of recommender systems. Recently, Relaxed
Ranking Distillation (RRD) has shown that distilling the ranking information in
the recommendation list significantly improves the performance. However, the
method still has limitations: 1) it does not fully utilize the prediction
errors of the student model, which makes training inefficient, and 2) it
distills only the user-side ranking information, which provides an
insufficient view under sparse implicit feedback. This paper
presents Dual Correction strategy for Distillation (DCD), which transfers the
ranking information from the teacher model to the student model in a more
efficient manner. Most importantly, DCD uses the discrepancy between the
teacher and student model predictions to decide which knowledge to distill.
By doing so, DCD essentially provides learning guidance tailored
to "correcting" what the student model has failed to accurately predict. This
process is applied to transfer the ranking information from the user side as
well as the item side, addressing sparse implicit user feedback. Our
experiments show that the proposed method outperforms state-of-the-art
baselines, and ablation studies validate the effectiveness of each component.
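The selection step at the heart of DCD, distilling only where the student disagrees with the teacher, can be sketched as follows. This is a minimal illustration of the idea; the function and variable names are ours, not the paper's, and the actual method operates on ranked lists rather than raw scores.

```python
import numpy as np

def select_correction_targets(teacher_scores, student_scores, k):
    """Pick the k items where the student's predictions deviate most from
    the teacher's, i.e., the knowledge the student most needs 'corrected'.
    (Illustrative sketch; DCD's real criterion uses ranking discrepancies.)"""
    discrepancy = np.abs(teacher_scores - student_scores)
    return np.argsort(-discrepancy)[:k]  # indices of largest discrepancies

teacher = np.array([0.9, 0.1, 0.8, 0.4])
student = np.array([0.2, 0.1, 0.7, 0.9])
targets = select_correction_targets(teacher, student, k=2)
```

Distillation losses would then be applied only on the selected items, concentrating the student's capacity on its own errors.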
Hierarchical Information and Data Modeling for Neural-based Recommender Systems
In the era of information flooding, efficient information retrieval has become a pressing problem. A recommender system, an information filtering system that provides customized suggestions of the items most likely to interest a user, has been applied to many customer-oriented services. Recently, more and more neural-based and graph-based models have been studied and adopted in recommender systems, owing to their superiority in dealing with fundamental machine learning problems. However, most existing approaches merely focus on accuracy improvements and ignore the fact that higher recommendation accuracy does not directly imply better recommendations. This thesis aims to propose novel methods for recommender systems that enable higher recommendation accuracy and higher recommendation satisfaction simultaneously. Three approaches are proposed from two perspectives: (1) generating more personalized individual embeddings and (2) reducing inference latency.
To improve the embedding learning process, the valuable information stored in pairwise preference differences and the hierarchical structures exhibited in user (item) latent relationships are explored. First, a novel and straightforward pointwise training strategy, named Difference Embedding (DifE), is proposed to capture the valuable information retained in pairwise preference differences. More specifically, by applying a novel projection to the designed pairwise difference function, the derived pointwise loss function allows recommendation models to encode valuable personalized information and achieve more customized predictions. Moreover, a U-shaped graph convolutional network-based recommender system, named UGCN, is proposed to explore the implicit and inherent hierarchies of users and items. Concretely, through a hierarchical encoding-decoding process, the UGCN model captures user-item relationships at various resolution scales, resulting in better preference customization.
To reduce inference latency, two knowledge distillation methods are also proposed for the model construction and training process. By training with the valuable information distilled from sophisticated teacher recommenders, the compact student model can achieve strong recommendation performance and a lightweight model architecture simultaneously.
Context-sensitive graph representation learning
Graph Convolutional Networks (GCNs) are a powerful emerging deep learning technique for learning from graph data. However, GCNs still face challenges: the models are shallow, and performance is poor when labelled nodes are severely scarce. In this paper, we propose a Multi-Semantic Aligned Graph Convolutional Network (MSAGCN), which contains two fundamental operations, multi-angle aggregation and semantic alignment, to resolve both challenges simultaneously. The core of MSAGCN is to aggregate nodes that belong to the same class from three perspectives (nodes, features, and graph structure) and to map the obtained node features close together. Specifically, multi-angle aggregation is applied to extract features from the three angles of the labelled nodes, and semantic alignment is utilised to align the semantics in the extracted features and enhance the similar content from different angles. In this way, the over-smoothing and over-fitting problems of GCNs can be alleviated. We perform the node clustering task on three citation datasets, and the experimental results demonstrate that our method outperforms state-of-the-art (SOTA) baselines.
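For context, the basic neighborhood aggregation that such GCN variants build on is the symmetric-normalized propagation H' = D^{-1/2}(A + I)D^{-1/2}H. The sketch below shows only this standard step, not MSAGCN's multi-angle or alignment operators:

```python
import numpy as np

def gcn_aggregate(adj, features):
    """One symmetric-normalized GCN propagation step:
    add self-loops, normalize by degree, then mix neighbor features.
    (Standard GCN aggregation; MSAGCN layers extend this.)"""
    a_hat = adj + np.eye(adj.shape[0])               # A + I (self-loops)
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))    # D^{-1/2} diagonal
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return norm @ features

adj = np.array([[0., 1.], [1., 0.]])                 # two connected nodes
feats = np.array([[1., 0.], [0., 1.]])
out = gcn_aggregate(adj, feats)
```

Stacking many such steps is what causes the over-smoothing that MSAGCN targets: repeated mixing drives all node features toward the same value.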
3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching
We tackle the essential task of finding dense visual correspondences between
a pair of images. This is a challenging problem due to various factors such as
poor texture, repetitive patterns, illumination variation, and motion blur in
practical scenarios. In contrast to methods that use dense correspondence
ground truths as direct supervision for local feature matching training, we
train 3DG-STFM, a multi-modal matching model (Teacher), to enforce depth
consistency under 3D dense correspondence supervision and transfer the
knowledge to a 2D unimodal matching model (Student). Both teacher and student
models consist of two transformer-based matching modules that obtain dense
correspondences in a coarse-to-fine manner. The teacher model guides the
student model to learn RGB-induced depth information for matching on both the
coarse and fine branches. We also evaluate 3DG-STFM on a model
compression task. To the best of our knowledge, 3DG-STFM is the first
student-teacher learning method for the local feature matching task. The
experiments show that our method outperforms state-of-the-art methods on
indoor and outdoor camera pose estimation and on homography estimation. Code
is available at: https://github.com/Ryan-prime/3DG-STFM
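A generic way to distill a matching model's output is a KL loss between the teacher's and student's per-row match distributions. The sketch below illustrates only this principle; 3DG-STFM's actual coarse and fine losses differ and should be taken from the paper and repository:

```python
import numpy as np

def matching_distill_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) summed over match distributions: the student
    is pushed toward the teacher's (depth-informed) matching probabilities.
    (Generic distillation sketch, not 3DG-STFM's exact formulation.)"""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    t = softmax(teacher_logits / temperature)
    s = softmax(student_logits / temperature)
    return float(np.sum(t * (np.log(t) - np.log(s))))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.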
Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices
Running machine learning (ML) algorithms on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, the resources available on the embedded platform, and the application budget (e.g., real-time requirements and power constraints). This has required the development of specific solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking the aforementioned challenges, especially memory requirements, into consideration.
This dissertation contributes to the construction of the Edge Learning Machine (ELM) environment, a platform-independent open-source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, the work includes the following steps, which are reflected in the thesis structure. First, we present a performance analysis of state-of-the-art shallow ML algorithms, including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, preprocessing techniques, and configurations shows performance similar to that of a desktop machine and highlights the impact of these factors on overall performance. Second, despite the common assumption that, given the scarcity of resources, TinyML permits only model inference, we have gone a step further and enabled self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves accuracy comparable to the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present the development of a memory allocation strategy for convolutional neural network (CNN) layers that optimizes memory requirements. This approach reduces the memory footprint without affecting accuracy or latency. Moreover, e-skin systems share the main requirements of the TinyML field: enabling intelligence with low memory, low power consumption, and low latency. Therefore, we designed an efficient tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions.
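As a toy illustration of the kind of analysis behind such a memory allocation strategy (our simplification, not the thesis's actual algorithm): when a sequential CNN runs layer by layer, the input and output activations of the current layer must be resident simultaneously, so peak activation memory is the maximum over adjacent layer pairs.

```python
def peak_activation_bytes(layer_sizes, bytes_per_elem=1):
    """Peak activation memory of a layer-by-layer sequential CNN:
    the max over each (input, output) activation pair that must
    coexist while a layer executes. (Toy model for illustration.)"""
    return max(
        (a + b) * bytes_per_elem
        for a, b in zip(layer_sizes, layer_sizes[1:])
    )

# hypothetical 8-bit network: 32x32x3 input -> 16x16x8 -> 8x8x16 -> 10 logits
sizes = [32 * 32 * 3, 16 * 16 * 8, 8 * 8 * 16, 10]
peak = peak_activation_bytes(sizes)
```

Real allocators go further, e.g. overlapping input and output buffers of a layer in one arena, which is how footprint shrinks without touching accuracy or latency.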
A major contribution of the thesis is CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state-of-the-art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, whose results are reported and discussed. The ELM framework is open source, and it is becoming a useful, versatile toolkit for the IoT and TinyML research and development community.
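Binary layers are cheap on microcontrollers largely because a {-1, +1} dot product reduces to XNOR and popcount on packed bits. The sketch below shows this generic trick, not CBin-NN's actual kernels:

```python
def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element {-1, +1} vectors packed as bitmasks
    (bit 1 encodes +1, bit 0 encodes -1): dot = n - 2 * popcount(a XOR b).
    One XOR plus a popcount replaces n multiply-accumulates."""
    return n - 2 * bin((a_bits ^ b_bits) & ((1 << n) - 1)).count("1")

# [+1, -1, +1, +1] . [+1, +1, +1, -1] = 1 - 1 + 1 - 1 = 0
dot = binary_dot(0b1011, 0b1110, 4)
```

On real hardware the same idea runs over 32-bit words with a hardware popcount instruction, which is where the drastic latency reduction comes from.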
Vertical Federated Graph Neural Network for Recommender System
Conventional recommender systems are required to train the recommendation
model using a centralized database. However, due to data privacy concerns, this
is often impractical when multi-parties are involved in recommender system
training. Federated learning appears to be an excellent solution to the data
isolation and privacy problem. Recently, graph neural networks (GNNs) have
become a promising approach for federated recommender systems. However, a key
challenge is to conduct embedding propagation while preserving the privacy of
the graph structure. Few studies have been conducted on federated GNN-based
recommender systems. Our study proposes the first vertical federated GNN-based
recommender system, called VerFedGNN. We design a framework to transmit: (i)
the summation of neighbor embeddings using random projection, and (ii)
gradients of public parameters perturbed by a ternary quantization mechanism.
Empirical studies show that VerFedGNN achieves prediction accuracy competitive
with existing privacy-preserving GNN frameworks while providing enhanced
privacy protection for users' interaction information.
Comment: 17 pages, 9 figures