Relabeling Minimal Training Subset to Flip a Prediction
When facing an unsatisfactory prediction from a machine learning model, it is
crucial to investigate the underlying reasons and explore the potential for
reversing the outcome. We ask: can we flip a test prediction by
relabeling the smallest possible subset of the
training data before the model is trained? We propose an efficient procedure to
identify and relabel such a subset via an extended influence function. We find
that relabeling fewer than 1% of the training points can often flip the model's
prediction. This mechanism can serve multiple purposes: (1) providing an
approach to challenge a model prediction by recovering influential training
subsets; (2) evaluating model robustness via the cardinality of the relabeled
subset; we show that this cardinality is highly related to
the noise ratio in the training set and is correlated with,
but complementary to, predicted probabilities; (3) revealing training points
that lead to group attribution bias. To the best of our knowledge, we are the first
to investigate identifying and relabeling the minimal training subset required
to flip a given prediction.
Comment: Under review
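The idea of tracing a prediction back to a minimal relabeled training subset can be illustrated with a toy stand-in. The sketch below uses closed-form ridge regression on ±1 labels, where the effect of flipping any training label on a test prediction is exact and additive, instead of the paper's extended influence function; the function name and greedy rule are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def relabel_to_flip(X, y, x_test, lam=1e-2):
    """Greedily pick a small training subset whose label flips
    (y_i -> -y_i) reverse the sign of a ridge-regression prediction
    on x_test.  Toy stand-in for the paper's influence-function
    procedure: because the ridge weights are linear in y, the effect
    of each flip on the prediction is exact and additive here."""
    A_inv = np.linalg.inv(X.T @ X + lam * np.eye(X.shape[1]))
    w = A_inv @ X.T @ y
    pred = x_test @ w                          # current prediction (sign = class)
    # exact change in the test prediction caused by flipping each label
    delta = -2 * y * (X @ (A_inv @ x_test))    # shape (n,)
    # try the points that push hardest toward the opposite sign first
    order = np.argsort(np.sign(pred) * delta)
    flipped = []
    for i in order:
        pred += delta[i]
        flipped.append(int(i))
        if np.sign(pred) != np.sign(x_test @ w):
            break                              # prediction has flipped
    return flipped, float(pred)
```

Greedy selection by per-point effect mirrors the spirit of ranking training points by influence; for the closed-form model above the accumulated effect is exact rather than a first-order approximation.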
The SpeakIn System Description for CNSRC2022
This report describes our speaker verification systems for the tasks of the
CN-Celeb Speaker Recognition Challenge 2022 (CNSRC 2022). This challenge
includes two tasks, namely speaker verification (SV) and speaker retrieval (SR).
The SV task involves two tracks: fixed track and open track. In the fixed
track, we only used CN-Celeb.T as the training set. For the open track of the
SV task and SR task, we added our open-source audio data. The ResNet-based,
RepVGG-based, and TDNN-based architectures were developed for this challenge.
Global statistic pooling structure and MQMHA pooling structure were used to
aggregate the frame-level features across time to obtain utterance-level
representation. We adopted AM-Softmax and AAM-Softmax combined with the
Sub-Center method to classify the resulting embeddings. We also used the
Large-Margin Fine-Tuning strategy to further improve the model performance. In
the backend, Sub-Mean and AS-Norm were used. In the SV task fixed track, our
system was a fusion of five models, and two models were fused in the SV task
open track. We used a single system in the SR task. Our approach achieves
superior performance, taking 1st place in the open track of the SV task,
2nd place in the fixed track of the SV task, and 3rd place in the SR task.
Comment: 4 pages
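Two of the building blocks named above are easy to sketch. Global statistics pooling collapses frame-level features into one utterance-level vector, and AM-Softmax turns an embedding into margin-penalized cosine logits; the code below is a minimal NumPy sketch (function names and default margins are assumptions), not the challenge systems' exact heads.

```python
import numpy as np

def statistics_pooling(frames):
    """Global statistics pooling: collapse a (T, D) sequence of
    frame-level features into a single 2D-dim utterance vector by
    concatenating the per-dimension mean and standard deviation."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

def am_softmax_logits(emb, W, target, s=30.0, m=0.2):
    """AM-Softmax logits (sketch): cosine similarity between the
    L2-normalized embedding and class weights, with an additive
    margin m subtracted from the target class and a scale s."""
    emb = emb / np.linalg.norm(emb)
    W = W / np.linalg.norm(W, axis=0, keepdims=True)  # one column per class
    cos = emb @ W
    cos[target] -= m        # penalize the target class by the margin
    return s * cos
```

MQMHA pooling and the Sub-Center extension add learned attention heads and multiple per-class centers on top of these basic forms.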
You've Got Two Teachers: Co-evolutionary Image and Report Distillation for Semi-supervised Anatomical Abnormality Detection in Chest X-ray
Chest X-ray (CXR) anatomical abnormality detection aims at localizing and
characterising cardiopulmonary radiological findings in the radiographs, which
can expedite clinical workflow and reduce observational oversights. Most
existing methods attempt this task in either fully supervised settings, which
demand costly per-abnormality annotations at scale, or weakly supervised settings,
which still lag far behind fully supervised methods in performance. In
this work, we propose a co-evolutionary image and report distillation (CEIRD)
framework, which approaches semi-supervised abnormality detection in CXR by
grounding the visual detection results with text-classified abnormalities from
paired radiology reports, and vice versa. Concretely, based on the classical
teacher-student pseudo label distillation (TSD) paradigm, we additionally
introduce an auxiliary report classification model, whose prediction is used
for report-guided pseudo detection label refinement (RPDLR) in the primary
vision detection task. Inversely, we also use the prediction of the vision
detection model for abnormality-guided pseudo classification label refinement
(APCLR) in the auxiliary report classification task, and propose a co-evolution
strategy where the vision and report models mutually promote each other with
RPDLR and APCLR performed alternately. In this way, we effectively
incorporate the weak supervision from reports into the semi-supervised TSD
pipeline. Besides the cross-modal pseudo label refinement, we further propose
an intra-image-modal self-adaptive non-maximum suppression, where the pseudo
detection labels generated by the teacher vision model are dynamically
rectified by high-confidence predictions from the student. Experimental results
on the public MIMIC-CXR benchmark demonstrate CEIRD's superior performance over
several up-to-date weakly and semi-supervised methods.
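One plausible instantiation of rectifying teacher pseudo boxes with confident student predictions is sketched below. This is an illustrative rule, not the paper's exact self-adaptive NMS: a teacher pseudo label is dropped when a high-confidence student box of the same class overlaps it strongly, on the grounds that the student's confident prediction supersedes the stale teacher label.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def rectify_pseudo_labels(teacher, student, conf_thr=0.9, iou_thr=0.5):
    """Cross-model pseudo-label rectification (hedged sketch in the
    spirit of CEIRD's self-adaptive NMS).  Boxes are tuples
    (x1, y1, x2, y2, class_id, score).  A teacher pseudo box is
    suppressed when a same-class student box with score >= conf_thr
    overlaps it with IoU >= iou_thr."""
    kept = []
    for t in teacher:
        superseded = any(
            s[4] == t[4] and s[5] >= conf_thr and iou(t[:4], s[:4]) >= iou_thr
            for s in student
        )
        if not superseded:
            kept.append(t)
    return kept
```

The thresholds `conf_thr` and `iou_thr` are assumed hyperparameters; in the paper the suppression criterion adapts dynamically rather than using fixed values.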
Genetic Meta-Structure Search for Recommendation on Heterogeneous Information Network
In the past decade, the heterogeneous information network (HIN) has become an
important methodology for modern recommender systems. To fully leverage its
power, manually designed network templates, i.e., meta-structures, are
introduced to filter out semantic-aware information. Hand-crafted
meta-structures rely on intensive expert knowledge, which is both laborious and
data-dependent. On the other hand, the number of meta-structures grows
exponentially with its size and the number of node types, which prohibits
brute-force search. To address these challenges, we propose Genetic
Meta-Structure Search (GEMS) to automatically optimize meta-structure designs
for recommendation on HINs. Specifically, GEMS adopts a parallel genetic
algorithm to search meaningful meta-structures for recommendation, and designs
dedicated rules and a meta-structure predictor to efficiently explore the
search space. Finally, we propose an attention based multi-view graph
convolutional network module to dynamically fuse information from different
meta-structures. Extensive experiments on three real-world datasets suggest the
effectiveness of GEMS, which consistently outperforms all baseline methods in
HIN recommendation. Compared with a simplified GEMS that utilizes hand-crafted
meta-paths, GEMS achieves a performance gain on most evaluation
metrics. More importantly, we conduct an in-depth analysis of the identified
meta-structures, which sheds light on HIN-based recommender system design.
Comment: Published in Proceedings of the 29th ACM International Conference on
Information and Knowledge Management (CIKM '20)
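The genetic-search backbone that GEMS builds on can be sketched generically. Below, each individual encodes a candidate meta-structure as a bit vector over possible typed edges; the encoding, truncation selection, and one-point crossover are simplifying assumptions, and the real system additionally prunes candidates with dedicated rules and a learned meta-structure predictor before full evaluation.

```python
import random

def genetic_search(fitness, length=12, pop_size=20, generations=30,
                   mut_rate=0.1, seed=0):
    """Minimal (sequential) genetic-algorithm skeleton: individuals
    are bit vectors, the top half survives each generation, and the
    rest are refilled by one-point crossover plus bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        elite = scored[: pop_size // 2]             # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, length)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g ^ (rng.random() < mut_rate) for g in child]  # mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)
```

In GEMS the fitness call is the expensive part (training a recommender on the candidate meta-structure), which is why a cheap learned predictor for pre-screening candidates pays off.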
An interpretable imbalanced semi-supervised deep learning framework for improving differential diagnosis of skin diseases
Dermatological diseases are among the most common disorders worldwide. This
paper presents the first study of the interpretability and imbalanced
semi-supervised learning of the multiclass intelligent skin diagnosis framework
(ISDL) using 58,457 skin images with 10,857 unlabeled samples. Pseudo-labelled
samples from minority classes are selected with higher probability at each
iteration of class-rebalancing self-training, thereby promoting the utilization
of unlabeled samples to mitigate the class imbalance problem. Our ISDL achieved a promising
performance with an accuracy of 0.979, sensitivity of 0.975, specificity of
0.973, macro-F1 score of 0.974 and area under the receiver operating
characteristic curve (AUC) of 0.999 for multi-label skin disease
classification. The SHapley Additive exPlanations (SHAP) method is combined with
our ISDL to explain how the deep learning model makes predictions; the resulting
explanations are consistent with clinical diagnosis. We also propose a sampling
distribution optimisation strategy to select pseudo-labelled samples in a more
effective manner using ISDLplus. Furthermore, the framework has the potential to
relieve the pressure placed on professional doctors and to help address the
practical shortage of such doctors in rural areas.
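Class-rebalancing pseudo-label selection can be sketched with per-class confidence thresholds that shrink for rare classes, so minority-class pseudo-labels are accepted with higher probability each round. The square-root scaling rule and function name below are assumptions for illustration, not ISDL's exact criterion.

```python
def rebalanced_pseudo_select(probs, class_counts, base_thr=0.9):
    """Class-rebalancing self-training selection (hedged sketch):
    `probs` is a list of per-class probability vectors for unlabeled
    samples, `class_counts` the labelled-set class frequencies.
    The acceptance threshold for a class shrinks with its rarity,
    so minority-class pseudo-labels are accepted more readily.
    Returns (sample index, pseudo label) pairs."""
    # illustrative scaling: majority class keeps base_thr, rarer
    # classes get proportionally lower thresholds
    thr = [base_thr * (c / max(class_counts)) ** 0.5 for c in class_counts]
    selected = []
    for i, p in enumerate(probs):
        c = max(range(len(p)), key=p.__getitem__)   # argmax class
        if p[c] >= thr[c]:
            selected.append((i, c))
    return selected
```

Each self-training iteration would add the selected pairs to the labelled pool and retrain, recomputing `class_counts` so the thresholds adapt as the imbalance shrinks.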