23 research outputs found
PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
Online class-incremental continual learning is a specific task of continual
learning. It aims to continuously learn new classes from a data stream in
which each sample is seen only once, and it suffers from the catastrophic
forgetting issue, i.e., forgetting historical knowledge of old classes.
Existing replay-based methods effectively alleviate this issue by saving and
replaying part of old data in a proxy-based or contrastive-based replay manner.
Although both replay manners are effective, the former tends to be biased
toward new classes due to class imbalance, while the latter is unstable and
hard to converge because of the limited number of stored samples. In this paper, we conduct
a comprehensive analysis of these two replay manners and find that they can be
complementary. Inspired by this finding, we propose a novel replay-based method
called proxy-based contrastive replay (PCR). The key operation is to replace
the contrastive samples of anchors with corresponding proxies in the
contrastive-based way. This alleviates catastrophic forgetting by effectively
addressing the imbalance issue while also maintaining faster
convergence of the model. We conduct extensive experiments on three real-world
benchmark datasets, and empirical results consistently demonstrate the
superiority of PCR over various state-of-the-art methods.
Comment: To appear in CVPR 2023. 10 pages, 8 figures and 3 tables
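The key operation can be sketched as a contrastive-style cross-entropy in which each anchor's candidates are the proxies (classifier weight vectors) of the classes present in the current batch. This is a hypothetical NumPy re-implementation under that reading, not the authors' code; the temperature value and all names are illustrative:

```python
import numpy as np

def pcr_loss(features, labels, proxies, temperature=0.1):
    """Anchor-to-proxy contrastive loss: candidates are the proxies of
    the classes that appear in the current batch (illustrative sketch)."""
    present = np.unique(labels)                    # classes in this batch (sorted)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = proxies[present]
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    logits = f @ p.T / temperature                 # (B, C_batch) similarities
    targets = np.searchsorted(present, labels)     # index of each anchor's class
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), targets].mean())

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))                   # 4 new + 4 replayed features
labels = np.array([0, 1, 3, 3, 0, 1, 0, 3])        # only classes 0, 1, 3 present
proxies = rng.normal(size=(10, 16))                # one proxy per class seen so far
loss = pcr_loss(feats, labels, proxies)
```

Restricting the softmax to classes in the batch is what distinguishes this from an ordinary classifier head and mitigates the old/new class imbalance.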
HPCR: Holistic Proxy-based Contrastive Replay for Online Continual Learning
Online continual learning (OCL) aims to continuously learn new data from a
single pass over the online data stream. It generally suffers from the
catastrophic forgetting issue. Existing replay-based methods effectively
alleviate this issue by replaying part of old data in a proxy-based or
contrastive-based replay manner. In this paper, we conduct a comprehensive
analysis of these two replay manners and find they can be complementary.
Inspired by this finding, we propose a novel replay-based method called
proxy-based contrastive replay (PCR), which replaces anchor-to-sample pairs
with anchor-to-proxy pairs in the contrastive-based loss to alleviate the
phenomenon of forgetting. Based on PCR, we further develop a more advanced
method named holistic proxy-based contrastive replay (HPCR), which consists of
three components. The first is a contrastive component that conditionally
incorporates anchor-to-sample pairs into PCR, learning more fine-grained semantic information
with a large training batch. The second is a temperature component that
decouples the temperature coefficient into two parts based on their impacts on
the gradient and sets different values for them to learn more novel knowledge.
The third is a distillation component that constrains the learning process to
keep more historical knowledge. Experiments on four datasets consistently
demonstrate the superiority of HPCR over various state-of-the-art methods.
Comment: 18 pages, 11 figures
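Of the three components, the distillation term can be illustrated with a standard temperature-softened KL divergence between the old model's and the new model's predictions; the exact HPCR formulation may differ, and all names below are illustrative assumptions:

```python
import numpy as np

def softmax(logits, t=1.0):
    z = logits / t
    z = z - z.max(axis=1, keepdims=True)     # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distill_loss(new_logits, old_logits, t=2.0):
    """KL(old || new) on temperature-softened predictions, constraining
    the current model to stay close to its past outputs."""
    p_old = softmax(old_logits, t)
    p_new = softmax(new_logits, t)
    kl = (p_old * (np.log(p_old + 1e-12) - np.log(p_new + 1e-12))).sum(axis=1)
    return float(kl.mean())

rng = np.random.default_rng(1)
old = rng.normal(size=(4, 5))
same = distill_loss(old, old)                  # identical outputs -> zero penalty
diff = distill_loss(rng.normal(size=(4, 5)), old)
```

The penalty is zero when the new model reproduces the old outputs and grows as predictions drift, which is how such a term preserves historical knowledge.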
UER: A Heuristic Bias Addressing Approach for Online Continual Learning
Online continual learning aims to continuously train neural networks on a
continuous data stream with a single pass through the data. As the most effective
approach, rehearsal-based methods replay part of the previous data. Commonly
used predictors in existing methods tend to generate biased dot-product logits
that favor the classes of the current data, which is known as the bias issue and
a phenomenon of forgetting. Many approaches have been proposed to overcome the
forgetting problem by correcting the bias; however, they remain inadequate in
the online setting. In this paper, we address the bias issue with a more
straightforward and more efficient method. By decomposing the
dot-product logits into an angle factor and a norm factor, we empirically find
that the bias problem mainly occurs in the angle factor, which can be used to
learn novel knowledge as cosine logits. In contrast, the norm factor, which
existing methods discard, helps retain historical knowledge. Based on this
observation, we propose to leverage the norm factor to balance new and old
knowledge and thereby address the bias. To this end, we develop a
heuristic approach called unbias experience replay (UER). UER learns current
samples only by the angle factor and further replays previous samples by both
the norm and angle factors. Extensive experiments on three datasets show that
UER achieves superior performance over various state-of-the-art methods. The
code is available at https://github.com/FelixHuiweiLin/UER.
Comment: 9 pages, 12 figures, ACM MM202
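The decomposition underlying UER can be sketched directly: every dot-product logit factors exactly into a norm factor times an angle (cosine) factor, so the two can be trained separately. This is a minimal sketch of that factorization, not the released code:

```python
import numpy as np

def decompose_logits(features, weights):
    """Split dot-product logits into a norm factor and an angle (cosine)
    factor; UER-style training would use the cosine alone for current
    samples and both factors for replayed samples (sketch only)."""
    dot = features @ weights.T                               # (B, C) raw logits
    f_norm = np.linalg.norm(features, axis=1, keepdims=True) # (B, 1)
    w_norm = np.linalg.norm(weights, axis=1)                 # (C,)
    norm = f_norm * w_norm                                   # norm factor
    cos = dot / (norm + 1e-12)                               # angle factor
    return cos, norm

rng = np.random.default_rng(0)
f = rng.normal(size=(2, 8))    # two sample features
w = rng.normal(size=(3, 8))    # three class weight vectors
cos, norm = decompose_logits(f, w)
# the product of the two factors reconstructs the original logits
assert np.allclose(cos * norm, f @ w.T)
```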
MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
Equipping a deep model with the ability of few-shot learning, i.e., learning
quickly from only a few examples, is a core challenge for artificial
intelligence. Gradient-based meta-learning approaches effectively address the
challenge by learning how to learn novel tasks. Their key idea is to learn a deep
model in a bi-level optimization manner, where the outer-loop process learns a
shared gradient descent algorithm (i.e., its hyperparameters), while the
inner-loop process leverages it to optimize a task-specific model using only a
few labeled examples. Although these existing methods have shown superior
performance, the outer-loop process requires calculating second-order
derivatives along the inner optimization path, which imposes considerable
memory burdens and the risk of vanishing gradients. Drawing inspiration from
recent progress of diffusion models, we find that the inner-loop gradient
descent process can actually be viewed as the reverse (i.e., denoising) process
of diffusion, where the denoising target is the model weights rather than the
original data. Based on this fact, in this paper, we propose to model the gradient
descent optimizer as a diffusion model and then present a novel
task-conditional diffusion-based meta-learning method, called MetaDiff, that
effectively models the optimization process of model weights from Gaussian
noise to target weights in a denoising manner. Thanks to the training
efficiency of diffusion models, our MetaDiff does not need to differentiate
through the inner-loop path, so the memory burden and the risk of
vanishing gradients can be effectively alleviated. Experimental results show that
our MetaDiff outperforms the state-of-the-art gradient-based meta-learning
family in few-shot learning tasks.
Comment: Accepted by AAAI 202
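The weights-as-denoising-target analogy can be illustrated on a toy quadratic task: starting from Gaussian-noise weights, each gradient step contracts toward the task optimum, mirroring a reverse (denoising) diffusion step. This is purely illustrative of the analogy, not the paper's model; w_star is a hypothetical target:

```python
import numpy as np

# Toy illustration of the analogy (not MetaDiff itself): starting from
# Gaussian-noise weights, each gradient step on a quadratic loss contracts
# toward the task optimum w_star, mirroring a reverse (denoising) step.
rng = np.random.default_rng(0)
w_star = np.array([2.0, -1.0, 0.5])   # hypothetical "clean" target weights
w = rng.normal(size=3)                # initialization = pure Gaussian noise

lr = 0.2
for _ in range(50):
    grad = w - w_star                 # gradient of 0.5 * ||w - w_star||^2
    w = w - lr * grad                 # one denoising-like update

assert np.linalg.norm(w - w_star) < 1e-3
```

MetaDiff's point is that a learned diffusion model can produce this trajectory without backpropagating through it, avoiding the second-order derivatives of classic bi-level meta-learning.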
Combining Context and Knowledge Representations for Chemical-Disease Relation Extraction
Automatically extracting the relationships between chemicals and diseases is
significantly important to various areas of biomedical research and health
care. Biomedical experts have built many large-scale knowledge bases (KBs) to
advance the development of biomedical research. KBs contain huge amounts of
structured information about entities and relationships and therefore play a
pivotal role in chemical-disease relation (CDR) extraction. However, previous
research has paid little attention to the prior knowledge in KBs. This
paper proposes a neural network-based attention model (NAM) for CDR extraction,
which makes full use of context information in documents and prior knowledge in
KBs. For a pair of entities in a document, an attention mechanism is employed
to select important context words with respect to the relation representations
learned from KBs. Experiments on the BioCreative V CDR dataset show that
combining context and knowledge representations through the attention
mechanism can significantly improve CDR extraction performance and achieve
results comparable to state-of-the-art systems.
Comment: Published in IEEE/ACM Transactions on Computational Biology and
Bioinformatics, 11 pages, 5 figures
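The attention step described above can be sketched as scoring each context-word vector against a KB-derived relation representation and pooling with the resulting softmax weights. A minimal illustration; the function and variable names are assumptions, not the NAM implementation:

```python
import numpy as np

def kb_guided_attention(context_vecs, relation_vec):
    """Score each context-word vector against a KB-derived relation
    representation, then pool with softmax weights (illustrative names)."""
    scores = context_vecs @ relation_vec          # (T,) relevance per word
    scores = scores - scores.max()                # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ context_vecs                 # weighted context summary

rng = np.random.default_rng(0)
ctx = rng.normal(size=(6, 4))   # 6 context words, 4-dim embeddings
rel = rng.normal(size=4)        # relation representation learned from the KB
pooled = kb_guided_attention(ctx, rel)
```

The pooled vector emphasizes context words most relevant to the candidate relation, which is the mechanism the abstract credits for the performance gain.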
Metabolic network as an objective biomarker in monitoring deep brain stimulation for Parkinson's disease: a longitudinal study.
BACKGROUND
With the advance of subthalamic nucleus (STN) deep brain stimulation (DBS) in the treatment of Parkinson's disease (PD), it is desirable to identify objective criteria for monitoring the therapy outcome. This paper explores the feasibility of the metabolic network derived from positron emission tomography (PET) with 18F-fluorodeoxyglucose (18F-FDG) in monitoring STN DBS treatment for PD.
METHODS
Thirty-three age-matched PD patients, 33 healthy controls (HCs), 9 PD patients who underwent bilateral DBS surgery, and 9 controls received 18F-FDG PET scans. The DBS patients were followed longitudinally to investigate alterations in PD-related metabolic covariance pattern (PDRP) expression.
RESULTS
The PDRP expression was abnormally elevated in PD patients compared with HCs (P < 0.001). For DBS patients, a significant decrease in the Unified Parkinson's Disease Rating Scale (UPDRS, P = 0.001) and PDRP expression (P = 0.004) was observed 3 months after STN DBS treatment, while a rollback was observed in both UPDRS and PDRP expression (both P < 0.01) 12 months after treatment. The changes in PDRP expression mediated by STN DBS were generally in line with the UPDRS improvement. Graph-based network analysis showed increased connectivity at 3 months and a return toward baseline at 12 months, confirmed by the small-worldness coefficient.
CONCLUSIONS
These preliminary results demonstrate the potential of metabolic network expression as a complementary objective biomarker for the assessment and monitoring of STN DBS treatment in PD patients. Clinical Trial Registration: ChiCTR-DOC-16008645, http://www.chictr.org.cn/showproj.aspx?proj=13865
A Survey on Deep Learning-Based Change Detection from High-Resolution Remote Sensing Images
Change detection based on remote sensing images plays an important role in the field of remote sensing analysis, and it has been widely used in many areas, such as resource monitoring, urban planning, and disaster assessment. In recent years, it has aroused widespread interest due to the explosive development of artificial intelligence (AI) technology, and change detection algorithms based on deep learning frameworks have made it possible to detect more delicate changes (such as the alteration of small buildings) with the help of huge amounts of remote sensing data, especially high-resolution (HR) data. Although many methods exist, we still lack a thorough review of recent progress in the latest deep learning methods for change detection. To this end, the main purpose of this paper is to provide a review of the available deep learning-based change detection algorithms using HR remote sensing images. The paper first describes the change detection framework and classifies the methods from the perspective of the deep network architectures adopted. Then, we review the latest progress in the application of deep learning in various granularity structures for change detection. Further, the paper provides a summary of HR datasets derived from different sensors, along with information related to change detection, for the potential use of researchers. Simultaneously, representative evaluation metrics for this task are investigated. Finally, a conclusion of the challenges for change detection using HR remote sensing images, which must be dealt with in order to improve the model's performance, is presented. In addition, we put forward promising directions for future research in this area.