203 research outputs found
AdaGMLP: AdaBoosting GNN-to-MLP Knowledge Distillation
Graph Neural Networks (GNNs) have revolutionized graph-based machine
learning, but their heavy computational demands pose challenges for
latency-sensitive edge devices in practical industrial applications. In
response, a new wave of methods, collectively known as GNN-to-MLP Knowledge
Distillation, has emerged. They aim to transfer GNN-learned knowledge to a more
efficient MLP student, which offers faster, resource-efficient inference while
maintaining competitive performance compared to GNNs. However, these methods
face significant challenges in situations with insufficient training data and
incomplete test data, limiting their applicability in real-world scenarios.
To address these challenges, we propose AdaGMLP, an AdaBoosting GNN-to-MLP
Knowledge Distillation framework. It leverages an ensemble of diverse MLP
students trained on different subsets of labeled nodes, addressing the issue of
insufficient training data. Additionally, it incorporates a Node Alignment
technique for robust predictions on test data with missing or incomplete
features. Our experiments on seven benchmark datasets with different settings
demonstrate that AdaGMLP outperforms existing G2M methods, making it suitable
for a wide range of latency-sensitive real-world applications. We have
submitted our code to the GitHub repository
(https://github.com/WeigangLu/AdaGMLP-KDD24).Comment: Accepted by KDD 202
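The boosting idea the abstract describes can be sketched in miniature. The
following is an illustrative AdaBoost loop, not the authors' AdaGMLP
implementation: each "student" here is a trivial 1-D threshold classifier
standing in for a distilled MLP, and the loop reweights training nodes so that
later students focus on nodes earlier students got wrong (AdaGMLP itself
trains MLP students on different subsets of labeled nodes).

```python
import math

def train_stump(xs, ys, weights):
    """Pick the threshold minimising weighted error (stand-in for MLP training)."""
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            pred = [sign if x >= t else -sign for x in xs]
            err = sum(w for w, p, y in zip(weights, pred, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best  # (weighted_error, threshold, sign)

def boost(xs, ys, rounds=3):
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, t, sign = train_stump(xs, ys, w)
        err = max(err, 1e-10)                       # avoid log(0) below
        alpha = 0.5 * math.log((1 - err) / err)     # student's vote weight
        ensemble.append((alpha, t, sign))
        # Reweight: misclassified nodes get larger weight next round.
        w = [wi * math.exp(-alpha * yi * (sign if xi >= t else -sign))
             for wi, xi, yi in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted-majority vote of all students."""
    score = sum(a * (s if x >= t else -s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

xs = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]   # toy 1-D node features
ys = [-1, -1, -1, 1, 1, 1]            # toy binary node labels
ens = boost(xs, ys)
print([predict(ens, x) for x in xs])  # → [-1, -1, -1, 1, 1, 1]
```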
Trusted Multi-view Learning with Label Noise
Multi-view learning methods often focus on improving decision accuracy while
neglecting decision uncertainty, which significantly restricts their
deployment in safety-critical applications. To address this issue,
researchers propose trusted multi-view methods that learn the class
distribution for each instance, enabling the estimation of classification
probabilities and uncertainty. However, these methods heavily rely on
high-quality ground-truth labels. This motivates us to delve into a new
generalized trusted multi-view learning problem: how to develop a reliable
multi-view learning model under the guidance of noisy labels? We propose a
trusted multi-view noise refining (TMNR) method to solve this problem. We first
construct view-opinions using evidential deep neural networks, which consist of
belief mass vectors and uncertainty estimates. Subsequently, we design
view-specific noise correlation matrices that transform the original opinions
into noisy opinions aligned with the noisy labels. Considering label noises
originating from low-quality data features and easily-confused classes, we
ensure that the diagonal elements of these matrices are inversely proportional
to the uncertainty, while incorporating class relations into the off-diagonal
elements. Finally, we aggregate the noisy opinions and employ a generalized
maximum likelihood loss on the aggregated opinion for model training, guided by
the noisy labels. We empirically compare TMNR with state-of-the-art trusted
multi-view learning and label noise learning baselines on 5 publicly available
datasets. Experimental results show that TMNR outperforms the baseline methods
in accuracy, reliability, and robustness. The code and appendix are released at
https://github.com/YilinZhang107/TMNR.
Comment: 12 pages, accepted at IJCAI 2024
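The noise-correlation step above can be illustrated with a small sketch (not
TMNR's exact construction): a clean evidential opinion is transformed into a
"noisy" opinion via a row-stochastic noise matrix T whose diagonal (the
probability of keeping the observed label) shrinks as uncertainty u grows,
while off-diagonal mass is spread according to a hypothetical class-similarity
matrix S.

```python
def noise_matrix(similarity, u):
    """Row-stochastic K x K matrix; diagonal ~ (1 - u), off-diagonal ~ S."""
    k = len(similarity)
    T = []
    for i in range(k):
        row = [0.0] * k
        total_off = sum(similarity[i][j] for j in range(k) if j != i) or 1.0
        row[i] = 1.0 - u                  # keep-label prob drops with uncertainty
        for j in range(k):
            if j != i:                    # easily-confused classes get more mass
                row[j] = u * similarity[i][j] / total_off
        T.append(row)
    return T

def to_noisy_opinion(belief, T):
    """Vector-matrix product: belief over K classes mapped to noisy belief."""
    k = len(belief)
    return [sum(T[j][i] * belief[j] for j in range(k)) for i in range(k)]

S = [[0.0, 0.8, 0.2],     # hypothetical class-similarity scores
     [0.8, 0.0, 0.2],
     [0.2, 0.2, 0.0]]
b = [0.9, 0.05, 0.05]     # confident clean opinion for class 0
T = noise_matrix(S, u=0.3)
noisy = to_noisy_opinion(b, T)
print(noisy)              # mass leaks toward the easily-confused class 1
```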
Enhancing Criminal Case Matching through Diverse Legal Factors
Criminal case matching endeavors to determine the relevance between different
criminal cases. Conventional methods predict the relevance solely based on
instance-level semantic features and neglect the diverse legal factors (LFs),
which are associated with diverse court judgments. Consequently,
comprehensively representing a criminal case remains a challenge for these
approaches. Moreover, extracting and utilizing these LFs for criminal case
matching faces two challenges: (1) manual annotation of LFs relies heavily on
specialized legal knowledge; (2) overlaps among LFs may potentially harm the
model's performance. In this paper, we propose a two-stage framework named
Diverse Legal Factor-enhanced Criminal Case Matching (DLF-CCM). Firstly,
DLF-CCM employs a multi-task learning framework to pre-train an LF extraction
network on a large-scale legal judgment prediction dataset. In stage two,
DLF-CCM introduces an LF de-redundancy module to learn shared LF and exclusive
LFs. Moreover, an entropy-weighted fusion strategy is introduced to dynamically
fuse the multiple relevance scores generated by all LFs. Experimental results
validate
the effectiveness of DLF-CCM and show its significant improvements over
competitive baselines. Code: https://github.com/jiezhao6/DLF-CCM
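The entropy-weighted fusion the abstract names can be sketched as follows;
the abstract does not give its exact form, so this is only one plausible
instantiation: each LF yields a relevance distribution, and lower-entropy
(more confident) distributions receive larger fusion weights.

```python
import math

def entropy(p):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def entropy_weighted_fuse(dists):
    """Fuse relevance distributions; weight_i = exp(-H_i), normalised."""
    ws = [math.exp(-entropy(p)) for p in dists]
    z = sum(ws)
    ws = [w / z for w in ws]
    k = len(dists[0])
    fused = [sum(w * p[i] for w, p in zip(ws, dists)) for i in range(k)]
    return fused, ws

lf_dists = [
    [0.9, 0.1],   # confident LF: low entropy, large weight
    [0.5, 0.5],   # uninformative LF: maximum entropy, small weight
]
fused, weights = entropy_weighted_fuse(lf_dists)
print(weights, fused)  # confident LF dominates the fused relevance
```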
SC2: Towards Enhancing Content Preservation and Style Consistency in Long Text Style Transfer
Text style transfer (TST) aims to vary the style polarity of text while
preserving the semantic content. Although recent advancements have demonstrated
remarkable progress in short TST, it remains a relatively straightforward task
with limited practical applications. The more comprehensive long TST task
presents two challenges: (1) existing methods encounter difficulties in
accurately evaluating content attributes in multiple words, leading to content
degradation; (2) the conventional vanilla style classifier loss encounters
obstacles in maintaining consistent style across multiple generated sentences.
In this paper, we propose a novel method SC2, where a multilayer Joint
Style-Content Weighed (JSCW) module and a Style Consistency loss are designed
to address the two issues. The JSCW simultaneously assesses the amounts of
style and content attributes within a token, aiming to acquire a lossless
content representation and thereby enhance content preservation. The multiple
JSCW layers further progressively refine content representations. We design a
style consistency loss to ensure the generated multiple sentences consistently
reflect the target style polarity. Moreover, we incorporate a denoising
non-autoregressive decoder to accelerate training. We conduct extensive
experiments, and the results show significant improvements of SC2 over
competitive baselines. Our code: https://github.com/jiezhao6/SC2
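The style-consistency idea can be sketched with a hedged illustration (the
paper defines its loss over its own classifier's outputs; this is only one
simple form): beyond pushing each generated sentence toward the target style,
we penalise disagreement among per-sentence style probabilities so the whole
passage stays stylistically uniform.

```python
import math

def style_consistency_loss(probs):
    """probs: per-sentence probability of the target style polarity."""
    n = len(probs)
    # Vanilla style loss: mean negative log-likelihood of the target polarity.
    nll = -sum(math.log(max(p, 1e-12)) for p in probs) / n
    # Consistency term: variance of the per-sentence style probabilities,
    # which grows when one sentence drifts off-style.
    mean = sum(probs) / n
    var = sum((p - mean) ** 2 for p in probs) / n
    return nll + var

consistent = [0.90, 0.88, 0.91]
drifting   = [0.90, 0.55, 0.95]   # middle sentence drifts off-style
print(style_consistency_loss(consistent), style_consistency_loss(drifting))
```

As expected, the drifting passage incurs a larger loss both because its mean
style likelihood is lower and because its sentences disagree.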
Interpretable and Robust AI in EEG Systems: A Survey
The close coupling of artificial intelligence (AI) and electroencephalography
(EEG) has substantially advanced human-computer interaction (HCI) technologies
in the AI era. Different from traditional EEG systems, the interpretability and
robustness of AI-based EEG systems are becoming particularly crucial. The
interpretability clarifies the inner working mechanisms of AI models and thus
can gain the trust of users. The robustness reflects the AI's reliability
against attacks and perturbations, which is essential for sensitive and fragile
EEG signals. Thus, the interpretability and robustness of AI in EEG systems have
attracted increasing attention, and their research has achieved great progress
recently. However, there is still no survey covering recent advances in this
field. In this paper, we present the first comprehensive survey and summarize
the interpretable and robust AI techniques for EEG systems. Specifically, we
first propose a taxonomy of interpretability by characterizing it into three
types: backpropagation, perturbation, and inherently interpretable methods.
Then we classify the robustness mechanisms into four classes: noise and
artifacts, human variability, data acquisition instability, and adversarial
attacks. Finally, we identify several critical and unresolved challenges for
interpretable and robust AI in EEG systems and further discuss their future
directions.
