23 research outputs found
PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
Online class-incremental continual learning is a specific task of continual
learning. It aims to continuously learn new classes from a data stream in
which each sample is seen only once, and it suffers from the catastrophic
forgetting issue, i.e., forgetting historical knowledge of old classes.
Existing replay-based methods effectively alleviate this issue by saving and
replaying part of old data in a proxy-based or contrastive-based replay manner.
Although both replay manners are effective, the former tends to be biased
toward new classes due to class imbalance, while the latter is unstable and
hard to converge because of the limited number of stored samples. In this paper, we conduct
a comprehensive analysis of these two replay manners and find that they can be
complementary. Inspired by this finding, we propose a novel replay-based method
called proxy-based contrastive replay (PCR). The key operation is to replace
the contrastive samples of anchors with corresponding proxies in the
contrastive-based way. This alleviates catastrophic forgetting by effectively
addressing the imbalance issue while also maintaining faster
convergence of the model. We conduct extensive experiments on three real-world
benchmark datasets, and empirical results consistently demonstrate the
superiority of PCR over various state-of-the-art methods.
Comment: To appear in CVPR 2023. 10 pages, 8 figures and 3 tables
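The key operation can be sketched as a contrastive-style cross-entropy in which each anchor's candidates are the proxies (classifier weight vectors) of the classes present in the current batch. This is a hypothetical NumPy re-implementation under that reading, not the authors' code; the temperature value and all names are illustrative:

```python
import numpy as np

def pcr_loss(features, labels, proxies, temperature=0.1):
    """Anchor-to-proxy contrastive loss: candidates are the proxies of
    the classes that appear in the current batch (illustrative sketch)."""
    present = np.unique(labels)                    # classes in this batch (sorted)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = proxies[present]
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    logits = f @ p.T / temperature                 # (B, C_batch) similarities
    targets = np.searchsorted(present, labels)     # index of each anchor's class
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), targets].mean())

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))                   # 4 new + 4 replayed features
labels = np.array([0, 1, 3, 3, 0, 1, 0, 3])        # only classes 0, 1, 3 present
proxies = rng.normal(size=(10, 16))                # one proxy per class seen so far
loss = pcr_loss(feats, labels, proxies)
```

Restricting the softmax to classes in the batch is what distinguishes this from an ordinary classifier head and mitigates the old/new class imbalance.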
HPCR: Holistic Proxy-based Contrastive Replay for Online Continual Learning
Online continual learning (OCL) aims to continuously learn new data from a
single pass over the online data stream. It generally suffers from the
catastrophic forgetting issue. Existing replay-based methods effectively
alleviate this issue by replaying part of old data in a proxy-based or
contrastive-based replay manner. In this paper, we conduct a comprehensive
analysis of these two replay manners and find they can be complementary.
Inspired by this finding, we propose a novel replay-based method called
proxy-based contrastive replay (PCR), which replaces anchor-to-sample pairs
with anchor-to-proxy pairs in the contrastive-based loss to alleviate the
phenomenon of forgetting. Based on PCR, we further develop a more advanced
method named holistic proxy-based contrastive replay (HPCR), which consists of
three components. The first is a contrastive component that conditionally
incorporates anchor-to-sample pairs into PCR, learning more fine-grained semantic information
with a large training batch. The second is a temperature component that
decouples the temperature coefficient into two parts based on their impacts on
the gradient and sets different values for them to learn more novel knowledge.
The third is a distillation component that constrains the learning process to
keep more historical knowledge. Experiments on four datasets consistently
demonstrate the superiority of HPCR over various state-of-the-art methods.
Comment: 18 pages, 11 figures
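Of the three components, the distillation term can be illustrated with a standard temperature-softened KL divergence between the old model's and the new model's predictions; the exact HPCR formulation may differ, and all names below are illustrative assumptions:

```python
import numpy as np

def softmax(logits, t=1.0):
    z = logits / t
    z = z - z.max(axis=1, keepdims=True)     # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distill_loss(new_logits, old_logits, t=2.0):
    """KL(old || new) on temperature-softened predictions, constraining
    the current model to stay close to its past outputs."""
    p_old = softmax(old_logits, t)
    p_new = softmax(new_logits, t)
    kl = (p_old * (np.log(p_old + 1e-12) - np.log(p_new + 1e-12))).sum(axis=1)
    return float(kl.mean())

rng = np.random.default_rng(1)
old = rng.normal(size=(4, 5))
same = distill_loss(old, old)                  # identical outputs -> zero penalty
diff = distill_loss(rng.normal(size=(4, 5)), old)
```

The penalty is zero when the new model reproduces the old outputs and grows as predictions drift, which is how such a term preserves historical knowledge.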
UER: A Heuristic Bias Addressing Approach for Online Continual Learning
Online continual learning aims to continuously train neural networks on a
continuous data stream with a single pass through the data. As the most effective
approach, rehearsal-based methods replay part of the previous data. Commonly
used predictors in existing methods tend to generate biased dot-product logits
that favor the classes of the current data, which is known as the bias issue and
a phenomenon of forgetting. Many approaches have been proposed to overcome the
forgetting problem by correcting the bias; however, they remain inadequate in
the online setting. In this paper, we address the bias issue with a more
straightforward and more efficient method. By decomposing the
dot-product logits into an angle factor and a norm factor, we empirically find
that the bias problem mainly occurs in the angle factor, which can be used to
learn novel knowledge as cosine logits. In contrast, the norm factor, which
existing methods discard, helps retain historical knowledge. Based on this
observation, we propose to leverage the norm factor to balance new and old
knowledge and thereby address the bias. To this end, we develop a
heuristic approach called unbias experience replay (UER). UER learns current
samples only by the angle factor and further replays previous samples by both
the norm and angle factors. Extensive experiments on three datasets show that
UER achieves superior performance over various state-of-the-art methods. The
code is available at https://github.com/FelixHuiweiLin/UER.
Comment: 9 pages, 12 figures, ACM MM202
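The decomposition underlying UER can be sketched directly: every dot-product logit factors exactly into a norm factor times an angle (cosine) factor, so the two can be trained separately. This is a minimal sketch of that factorization, not the released code:

```python
import numpy as np

def decompose_logits(features, weights):
    """Split dot-product logits into a norm factor and an angle (cosine)
    factor; UER-style training would use the cosine alone for current
    samples and both factors for replayed samples (sketch only)."""
    dot = features @ weights.T                               # (B, C) raw logits
    f_norm = np.linalg.norm(features, axis=1, keepdims=True) # (B, 1)
    w_norm = np.linalg.norm(weights, axis=1)                 # (C,)
    norm = f_norm * w_norm                                   # norm factor
    cos = dot / (norm + 1e-12)                               # angle factor
    return cos, norm

rng = np.random.default_rng(0)
f = rng.normal(size=(2, 8))    # two sample features
w = rng.normal(size=(3, 8))    # three class weight vectors
cos, norm = decompose_logits(f, w)
# the product of the two factors reconstructs the original logits
assert np.allclose(cos * norm, f @ w.T)
```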
MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
Equipping a deep model with the ability of few-shot learning, i.e., learning
quickly from only a few examples, is a core challenge for artificial
intelligence. Gradient-based meta-learning approaches effectively address the
challenge by learning how to learn novel tasks. Their key idea is to learn a deep
model in a bi-level optimization manner, where the outer-loop process learns a
shared gradient descent algorithm (i.e., its hyperparameters), while the
inner-loop process leverages it to optimize a task-specific model using only a
few labeled examples. Although these existing methods have shown superior
performance, the outer-loop process requires calculating second-order
derivatives along the inner optimization path, which imposes considerable
memory burdens and the risk of vanishing gradients. Drawing inspiration from
recent progress of diffusion models, we find that the inner-loop gradient
descent process can actually be viewed as the reverse (i.e., denoising) process
of diffusion, where the denoising target is the model weights rather than the
original data. Based on this fact, in this paper, we propose to model the gradient
descent optimizer as a diffusion model and then present a novel
task-conditional diffusion-based meta-learning method, called MetaDiff, that
effectively models the optimization process of model weights from Gaussian
noise to target weights in a denoising manner. Thanks to the training
efficiency of diffusion models, our MetaDiff does not need to differentiate
through the inner-loop path, so the memory burden and the risk of
vanishing gradients can be effectively alleviated. Experimental results show that
our MetaDiff outperforms the state-of-the-art gradient-based meta-learning
family in few-shot learning tasks.
Comment: Accepted by AAAI 202
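The weights-as-denoising-target analogy can be illustrated on a toy quadratic task: starting from Gaussian-noise weights, each gradient step contracts toward the task optimum, mirroring a reverse (denoising) diffusion step. This is purely illustrative of the analogy, not the paper's model; w_star is a hypothetical target:

```python
import numpy as np

# Toy illustration of the analogy (not MetaDiff itself): starting from
# Gaussian-noise weights, each gradient step on a quadratic loss contracts
# toward the task optimum w_star, mirroring a reverse (denoising) step.
rng = np.random.default_rng(0)
w_star = np.array([2.0, -1.0, 0.5])   # hypothetical "clean" target weights
w = rng.normal(size=3)                # initialization = pure Gaussian noise

lr = 0.2
for _ in range(50):
    grad = w - w_star                 # gradient of 0.5 * ||w - w_star||^2
    w = w - lr * grad                 # one denoising-like update

assert np.linalg.norm(w - w_star) < 1e-3
```

MetaDiff's point is that a learned diffusion model can produce this trajectory without backpropagating through it, avoiding the second-order derivatives of classic bi-level meta-learning.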
Combining Context and Knowledge Representations for Chemical-Disease Relation Extraction
Automatically extracting the relationships between chemicals and diseases is
significantly important to various areas of biomedical research and health
care. Biomedical experts have built many large-scale knowledge bases (KBs) to
advance the development of biomedical research. KBs contain huge amounts of
structured information about entities and relationships and therefore play a
pivotal role in chemical-disease relation (CDR) extraction. However, previous
research has paid little attention to the prior knowledge in KBs. This
paper proposes a neural network-based attention model (NAM) for CDR extraction,
which makes full use of context information in documents and prior knowledge in
KBs. For a pair of entities in a document, an attention mechanism is employed
to select important context words with respect to the relation representations
learned from KBs. Experiments on the BioCreative V CDR dataset show that
combining context and knowledge representations through the attention
mechanism can significantly improve CDR extraction performance and achieve
results comparable to state-of-the-art systems.
Comment: Published in IEEE/ACM Transactions on Computational Biology and
Bioinformatics, 11 pages, 5 figures
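The attention step described above can be sketched as scoring each context-word vector against a KB-derived relation representation and pooling with the resulting softmax weights. A minimal illustration; the function and variable names are assumptions, not the NAM implementation:

```python
import numpy as np

def kb_guided_attention(context_vecs, relation_vec):
    """Score each context-word vector against a KB-derived relation
    representation, then pool with softmax weights (illustrative names)."""
    scores = context_vecs @ relation_vec          # (T,) relevance per word
    scores = scores - scores.max()                # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ context_vecs                 # weighted context summary

rng = np.random.default_rng(0)
ctx = rng.normal(size=(6, 4))   # 6 context words, 4-dim embeddings
rel = rng.normal(size=4)        # relation representation learned from the KB
pooled = kb_guided_attention(ctx, rel)
```

The pooled vector emphasizes context words most relevant to the candidate relation, which is the mechanism the abstract credits for the performance gain.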
Metabolic network as an objective biomarker in monitoring deep brain stimulation for Parkinson's disease: a longitudinal study.
BACKGROUND
With the advance of subthalamic nucleus (STN) deep brain stimulation (DBS) in the treatment of Parkinson's disease (PD), it is desirable to identify objective criteria for monitoring the therapy outcome. This paper explores the feasibility of the metabolic network derived from positron emission tomography (PET) with 18F-fluorodeoxyglucose (18F-FDG) in monitoring STN DBS treatment for PD.
METHODS
Thirty-three age-matched PD patients, 33 healthy controls (HCs), 9 PD patients who underwent bilateral DBS surgery, and 9 controls received 18F-FDG PET scans. The DBS patients were followed longitudinally to investigate alterations in PD-related metabolic covariance pattern (PDRP) expression.
RESULTS
The PDRP expression was abnormally elevated in PD patients compared with HCs (P < 0.001). For DBS patients, a significant decrease in the Unified Parkinson's Disease Rating Scale (UPDRS, P = 0.001) and PDRP expression (P = 0.004) was observed 3 months after STN DBS treatment, while a rollback was observed in both UPDRS and PDRP expression (both P < 0.01) 12 months after treatment. The changes in PDRP expression mediated by STN DBS were generally in line with the UPDRS improvement. Graph-based network analysis showed increased connectivity at 3 months and a return toward baseline at 12 months, confirmed by the small-worldness coefficient.
CONCLUSIONS
These preliminary results demonstrate the potential of metabolic network expression as a complementary objective biomarker for the assessment and monitoring of STN DBS treatment in PD patients. Clinical Trial Registration: ChiCTR-DOC-16008645, http://www.chictr.org.cn/showproj.aspx?proj=13865
A Survey on Deep Learning-Based Change Detection from High-Resolution Remote Sensing Images
Change detection based on remote sensing images plays an important role in the field of remote sensing analysis, and it has been widely used in many areas, such as resource monitoring, urban planning, and disaster assessment. In recent years, it has aroused widespread interest due to the explosive development of artificial intelligence (AI) technology, and change detection algorithms based on deep learning frameworks have made it possible to detect more delicate changes (such as the alteration of small buildings) with the help of huge amounts of remote sensing data, especially high-resolution (HR) data. Although many methods exist, we still lack a thorough review of recent progress in the latest deep learning methods for change detection. To this end, the main purpose of this paper is to provide a review of the available deep learning-based change detection algorithms using HR remote sensing images. The paper first describes the change detection framework and classifies the methods from the perspective of the deep network architectures adopted. Then, we review the latest progress in the application of deep learning in various granularity structures for change detection. Further, the paper provides a summary of HR datasets derived from different sensors, along with information related to change detection, for the potential use of researchers. Simultaneously, representative evaluation metrics for this task are investigated. Finally, a conclusion of the challenges for change detection using HR remote sensing images, which must be dealt with in order to improve the model's performance, is presented. In addition, we put forward promising directions for future research in this area.