PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
Online class-incremental continual learning is a specific task of continual
learning. It aims to continuously learn new classes from a data stream whose
samples are seen only once, and it suffers from catastrophic forgetting,
i.e., forgetting historical knowledge of old classes.
Existing replay-based methods effectively alleviate this issue by saving and
replaying part of old data in a proxy-based or contrastive-based replay manner.
Although both replay manners are effective, the former is biased toward
new classes due to class imbalance, and the latter is unstable and hard
to converge because of the limited number of samples. In this paper, we conduct
a comprehensive analysis of these two replay manners and find that they can be
complementary. Inspired by this finding, we propose a novel replay-based method
called proxy-based contrastive replay (PCR). The key operation is to replace
the contrastive samples of anchors with corresponding proxies in the
contrastive-based way. This alleviates catastrophic forgetting by
effectively addressing the imbalance issue, and it also gives faster
convergence of the model. We conduct extensive experiments on three real-world
benchmark datasets, and empirical results consistently demonstrate the
superiority of PCR over various state-of-the-art methods.
Comment: To appear in CVPR 2023. 10 pages, 8 figures and 3 tables
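The key operation described in the abstract above (contrasting each anchor against class proxies instead of against other samples) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `proxy_contrastive_loss`, the `temperature` value, and the choice to restrict the denominator to classes present in the batch are all assumptions made for the example.

```python
import numpy as np

def proxy_contrastive_loss(features, labels, proxies, temperature=0.1):
    """Hypothetical proxy-based contrastive loss (illustrative sketch only).

    features: (batch, dim) anchor embeddings
    labels:   (batch,) integer class ids
    proxies:  (num_classes, dim) one learnable proxy per class
    """
    # L2-normalise so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    logits = z @ w.T / temperature  # (batch, num_classes)

    # Assumption: only proxies of classes that occur in this batch compete,
    # so classes absent from the batch do not contribute to the denominator.
    present = np.zeros(proxies.shape[0], dtype=bool)
    present[np.unique(labels)] = True
    logits = np.where(present[None, :], logits, -np.inf)

    # Numerically stable log-softmax over the remaining proxies.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for each anchor is the proxy of its own class.
    return -log_prob[np.arange(len(labels)), labels].mean()
```

In a replay setting, `features` and `labels` would hold a mix of new-stream samples and samples drawn from the memory buffer. How the batch is assembled is outside the scope of this sketch.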
MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning
Few-Shot Learning (FSL) is a challenging task, i.e., how to recognize
novel classes with few examples? Pre-training based methods effectively tackle
the problem by pre-training a feature extractor and then predicting novel
classes via a cosine nearest neighbor classifier with mean-based prototypes.
Nevertheless, due to the data scarcity, the mean-based prototypes are usually
biased. In this paper, we attempt to diminish the prototype bias by regarding
it as a prototype optimization problem. To this end, we propose a novel
meta-learning based prototype optimization framework to rectify prototypes,
i.e., introducing a meta-optimizer to optimize prototypes. Although the
existing meta-optimizers can also be adapted to our framework, they all
overlook a crucial gradient bias issue, i.e., the mean-based gradient
estimation is also biased on sparse data. To address the issue, we regard the
gradient and its flow as meta-knowledge and then propose a novel Neural
Ordinary Differential Equation (ODE)-based meta-optimizer to polish prototypes,
called MetaNODE. In this meta-optimizer, we first view the mean-based
prototypes as initial prototypes, and then model the process of prototype
optimization as continuous-time dynamics specified by a Neural ODE. A gradient
flow inference network is carefully designed to learn to estimate the
continuous gradient flow for prototype dynamics. Finally, the optimal
prototypes can be obtained by solving the Neural ODE. Extensive experiments on
miniImagenet, tieredImagenet, and CUB-200-2011 show the effectiveness of our
method.
Comment: Accepted by AAAI 202
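The pipeline outlined in the abstract above (mean-based prototypes, a cosine nearest-neighbor classifier, and prototype refinement as continuous-time dynamics) can be illustrated with a small NumPy sketch. It is not the MetaNODE implementation: the function names are hypothetical, a fixed-step Euler solver stands in for whatever ODE solver the authors use, and `flow_fn` is a placeholder for the learned gradient flow inference network.

```python
import numpy as np

def mean_prototypes(support_x, support_y, num_classes):
    """Initial prototypes: the per-class mean of the support features."""
    return np.stack([support_x[support_y == c].mean(axis=0)
                     for c in range(num_classes)])

def cosine_predict(query_x, prototypes):
    """Assign each query to the prototype with the highest cosine similarity."""
    q = query_x / np.linalg.norm(query_x, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return (q @ p.T).argmax(axis=1)

def euler_refine(prototypes, flow_fn, t_end=1.0, steps=20):
    """Integrate dp/dt = flow_fn(p, t) from t=0 to t_end with explicit Euler.

    flow_fn is a stand-in (assumption) for a learned gradient-flow network.
    """
    p = prototypes.copy()
    h = t_end / steps
    for k in range(steps):
        p = p + h * flow_fn(p, k * h)
    return p
```

For example, passing `flow_fn=lambda p, t: target - p` moves the prototypes along a straight-line flow toward a hypothetical `target` array. A learned flow would replace that lambda.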
Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code
This paper introduces Tiramisu, a polyhedral framework designed to generate
high performance code for multiple platforms including multicores, GPUs, and
distributed machines. Tiramisu introduces a scheduling language with novel
extensions to explicitly manage the complexities that arise when targeting
these systems. The framework is designed for the areas of image processing,
stencils, linear algebra and deep learning. Tiramisu has two main features: it
relies on a flexible representation based on the polyhedral model and it has a
rich scheduling language allowing fine-grained control of optimizations.
Tiramisu uses a four-level intermediate representation that allows full
separation between the algorithms, loop transformations, data layouts, and
communication. This separation simplifies targeting multiple hardware
architectures with the same algorithm. We evaluate Tiramisu by writing a set of
image processing, deep learning, and linear algebra benchmarks and compare them
with state-of-the-art compilers and hand-tuned libraries. We show that Tiramisu
matches or outperforms existing compilers and libraries on different hardware
architectures, including multicore CPUs, GPUs, and distributed machines.
Comment: arXiv admin note: substantial text overlap with arXiv:1803.0041
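The separation between algorithm and loop transformations mentioned in the abstract above can be illustrated without any compiler. The sketch below is plain Python and does not use the Tiramisu API; `iteration_order` and `matmul` are hypothetical names. It shows a single computation statement executed under two different iteration orders (untiled and tiled) with identical results.

```python
import numpy as np
from itertools import product

def iteration_order(n, m, tile=None):
    """The 'schedule': yields (i, j) pairs, optionally tile by tile."""
    if tile is None:
        yield from product(range(n), range(m))
        return
    for i0, j0 in product(range(0, n, tile), range(0, m, tile)):
        for i in range(i0, min(i0 + tile, n)):
            for j in range(j0, min(j0 + tile, m)):
                yield i, j

def matmul(a, b, tile=None):
    """The 'algorithm': c[i, j] = sum_k a[i, k] * b[k, j], schedule-agnostic."""
    n, m = a.shape[0], b.shape[1]
    c = np.zeros((n, m))
    for i, j in iteration_order(n, m, tile):
        c[i, j] = a[i, :] @ b[:, j]  # same statement under every schedule
    return c
```

Changing `tile` alters only the order in which output elements are visited, which is the kind of decision a scheduling language exposes. Data layout and communication, the other levels named in the abstract, are not modelled here.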
Magnesia-stabilised zirconia solid electrolyte assisted electrochemical investigation of iron ions in the SiO2-CaO-MgO-Al2O3 molten slag at 1723 K
Production of metallic iron through molten oxide electrolysis using inert electrodes is an alternative route for fast ironmaking without CO2 emissions. The fact that many inorganic oxides melt at ultrahigh temperatures (>1500 K) challenges conventional electro-analytical techniques used in aqueous, organic and molten salt electrolytes. However, to design a feasible and effective electrolytic process, it is necessary to understand the electrochemical properties of iron ions in molten oxide electrolytes. In this work, a magnesia-stabilised zirconia (MSZ) tube with a closed end was used to construct an integrated three-electrode cell with the “MSZ | Pt | O2 (air)” assembly functioning as the solid electrolyte, the reference electrode and also the counter electrode. Electrochemical reduction of iron ions was systematically investigated on an iridium (Ir) wire working electrode in the SiO2-CaO-MgO-Al2O3 molten slag at 1723 K by cyclic voltammetry (CV), square wave voltammetry (SWV), chronopotentiometry (CP) and potentiostatic electrolysis (PE). The results show that the electro-reduction of the Fe2+ ion to Fe on the Ir electrode in the molten slag follows a single two-electron transfer step, and the rate of the process is diffusion controlled. The peak current on the obtained CVs is proportional to the concentration of the Fe2+ ion in the molten slag and to the square root of the scan rate. The diffusion coefficient of Fe2+ ions in the molten slag containing 5 wt% FeO at 1723 K was derived to be (3.43 ± 0.06) × 10^-6 cm^2 s^-1 from CP analysis. However, two subsequent processes, i.e. alloy formation on the Ir electrode surface and interdiffusion, were found to affect the kinetics of iron deposition. An ECC mechanism is proposed to account for the CV observations. The findings from this work confirm that zirconia-based solid electrolytes can play an important role in fundamental electrochemical research in high temperature molten slag electrolytes.
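A diffusion coefficient is commonly extracted from chronopotentiometry via the Sand equation, i·τ^(1/2) = n·F·A·C·(π·D)^(1/2) / 2. The abstract above does not state which relation was used, so the sketch below only assumes that form. The function names are hypothetical, and the electrode area, current and concentration in the usage note are made-up illustration values, not data from the paper.

```python
import math

FARADAY = 96485.33212  # Faraday constant, C/mol

def sand_transition_time(current, n, area, conc, diff):
    """Transition time tau (s) predicted by the Sand equation.

    current in A, area in cm^2, conc in mol/cm^3, diff in cm^2/s.
    """
    return (n * FARADAY * area * conc * math.sqrt(math.pi * diff)
            / (2.0 * current)) ** 2

def sand_diffusion_coefficient(current, tau, n, area, conc):
    """Invert the Sand equation for the diffusion coefficient D (cm^2/s)."""
    return (2.0 * current * math.sqrt(tau)
            / (n * FARADAY * area * conc * math.sqrt(math.pi))) ** 2
```

As a consistency check with hypothetical inputs (`current=0.01`, `n=2`, `area=0.1`, `conc=1e-3`), computing `tau` from a given `diff` and feeding it back into `sand_diffusion_coefficient` recovers the same `diff`.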