DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Learning sophisticated feature interactions behind user behaviors is critical
to maximizing CTR for recommender systems. Despite great progress, existing
methods seem to have a strong bias towards low- or high-order interactions, or
require expert feature engineering. In this paper, we show that it is
possible to derive an end-to-end learning model that emphasizes both low- and
high-order feature interactions. The proposed model, DeepFM, combines the power
of factorization machines for recommendation and deep learning for feature
learning in a new neural network architecture. Compared to the latest Wide &
Deep model from Google, DeepFM has a shared input to its "wide" and "deep"
parts, with no need for feature engineering beyond raw features. Comprehensive
experiments are conducted to demonstrate the effectiveness and efficiency of
DeepFM over the existing models for CTR prediction, on both benchmark data and
commercial data.
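The combination described above can be illustrated with a minimal NumPy sketch of a forward pass. It assumes each sample has one active feature index per field, and that the factorization-machine part and the deep part read the same embedding table `V` (our reading of the "shared input"). The names `init_params`, `fm_second_order`, and `deepfm_forward`, the layer sizes, and the single hidden layer are illustrative choices, not taken from the paper.

```python
import numpy as np

def init_params(num_features, num_fields, k=4, hidden=8, seed=0):
    """Random parameters for the sketch (sizes are arbitrary assumptions)."""
    rng = np.random.default_rng(seed)
    return {
        "w": rng.normal(0, 0.1, num_features),        # first-order weights
        "b": 0.0,                                     # global bias
        "V": rng.normal(0, 0.1, (num_features, k)),   # embeddings shared by FM and MLP
        "W1": rng.normal(0, 0.1, (num_fields * k, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.1, hidden),
        "b2": 0.0,
    }

def fm_second_order(emb):
    """Pairwise interactions sum_{i<j} <v_i, v_j> for emb of shape (batch, fields, k).

    Uses the identity 0.5 * ((sum_i v_i)^2 - sum_i v_i^2), summed over the k dims.
    """
    sum_then_square = emb.sum(axis=1) ** 2
    square_then_sum = (emb ** 2).sum(axis=1)
    return 0.5 * (sum_then_square - square_then_sum).sum(axis=1)

def deepfm_forward(field_ids, p):
    """field_ids: int array (batch, num_fields) of active feature indices."""
    emb = p["V"][field_ids]                                # (batch, fields, k)
    first_order = p["w"][field_ids].sum(axis=1) + p["b"]   # low-order, linear part
    second_order = fm_second_order(emb)                    # low-order, pairwise part
    flat = emb.reshape(len(field_ids), -1)                 # same embeddings feed the MLP
    hidden = np.maximum(0.0, flat @ p["W1"] + p["b1"])     # ReLU layer, high-order part
    deep = hidden @ p["W2"] + p["b2"]
    logit = first_order + second_order + deep
    return 1.0 / (1.0 + np.exp(-logit))                    # predicted click probability

params = init_params(num_features=10, num_fields=3)
batch = np.array([[0, 5, 9], [2, 4, 8]])
print(deepfm_forward(batch, params))
```

Because both components index into the same `V`, no hand-crafted cross features are supplied to either part; only raw feature indices enter the model.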
MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning
Few-Shot Learning (FSL) is a challenging task, i.e., how to recognize
novel classes with few examples? Pre-training based methods effectively tackle
the problem by pre-training a feature extractor and then predicting novel
classes via a cosine nearest neighbor classifier with mean-based prototypes.
Nevertheless, due to data scarcity, the mean-based prototypes are usually
biased. In this paper, we attempt to diminish the prototype bias by regarding
it as a prototype optimization problem. To this end, we propose a novel
meta-learning based prototype optimization framework to rectify prototypes,
i.e., introducing a meta-optimizer to optimize prototypes. Although the
existing meta-optimizers can also be adapted to our framework, they all
overlook a crucial gradient bias issue, i.e., the mean-based gradient
estimation is also biased on sparse data. To address the issue, we regard the
gradient and its flow as meta-knowledge and then propose a novel Neural
Ordinary Differential Equation (ODE)-based meta-optimizer to polish prototypes,
called MetaNODE. In this meta-optimizer, we first view the mean-based
prototypes as initial prototypes, and then model the process of prototype
optimization as continuous-time dynamics specified by a Neural ODE. A gradient
flow inference network is carefully designed to learn to estimate the
continuous gradient flow for prototype dynamics. Finally, the optimal
prototypes can be obtained by solving the Neural ODE. Extensive experiments on
miniImagenet, tieredImagenet, and CUB-200-2011 show the effectiveness of our
method.
Comment: Accepted by AAAI 202
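The pipeline in this abstract (mean-based initial prototypes, continuous-time refinement, cosine nearest-neighbor prediction) can be sketched as follows. This is only an illustration under strong assumptions: the learned gradient flow inference network is replaced by an arbitrary callable `flow(p, t)`, and the Neural ODE solver is replaced by a fixed-step explicit Euler loop. The function names and the toy flow in the usage lines are hypothetical.

```python
import numpy as np

def mean_prototypes(features, labels, num_classes):
    """Initial prototypes: per-class mean of the support features."""
    return np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])

def cosine_predict(queries, prototypes):
    """Cosine nearest-neighbor classifier over prototypes."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return (q @ p.T).argmax(axis=1)

def euler_solve(p0, flow, t_end=1.0, steps=100):
    """Integrate dp/dt = flow(p, t) from t=0 to t_end with explicit Euler steps.

    `flow` stands in for a learned gradient-flow estimator (an assumption here).
    """
    p = np.array(p0, dtype=float)
    dt = t_end / steps
    for i in range(steps):
        p = p + dt * flow(p, i * dt)
    return p

# Toy usage: a hand-written flow that pulls prototypes toward a fixed target.
support = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]])
labels = np.array([0, 0, 1, 1])
p0 = mean_prototypes(support, labels, num_classes=2)
target = np.array([[1.0, 0.0], [0.0, 1.0]])  # hypothetical "unbiased" prototypes
refined = euler_solve(p0, lambda p, t: target - p, t_end=2.0, steps=200)
print(cosine_predict(np.array([[2.0, 0.1], [0.1, 3.0]]), refined))
```

In practice an adaptive ODE solver and a trained flow network would take the place of `euler_solve` and the lambda; the sketch only shows where each piece plugs in.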
PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
Online class-incremental continual learning is a specific task of continual
learning. It aims to continuously learn new classes from a data stream whose
samples are seen only once, and it suffers from catastrophic forgetting,
i.e., forgetting historical knowledge of old classes.
Existing replay-based methods effectively alleviate this issue by saving and
replaying part of old data in a proxy-based or contrastive-based replay manner.
Although these two replay manners are effective, the former is biased toward
new classes due to class imbalance, and the latter is unstable and hard
to converge because of the limited number of samples. In this paper, we conduct
a comprehensive analysis of these two replay manners and find that they can be
complementary. Inspired by this finding, we propose a novel replay-based method
called proxy-based contrastive replay (PCR). The key operation is to replace
the contrastive samples of anchors with corresponding proxies in the
contrastive-based way. It alleviates catastrophic forgetting
by effectively addressing the imbalance issue, while also allowing the model
to converge faster. We conduct extensive experiments on three real-world
benchmark datasets, and empirical results consistently demonstrate the
superiority of PCR over various state-of-the-art methods.
Comment: To appear in CVPR 2023. 10 pages, 8 figures and 3 tables
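One way to read the key operation above is as a contrastive-style softmax in which each anchor is compared against class proxies rather than other samples. The sketch below makes two assumptions that the abstract does not state: anchors and proxies are L2-normalized, and only the proxies of classes present in the current batch act as contrastive candidates. The name `pcr_style_loss` and the temperature value are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def pcr_style_loss(features, labels, proxies, temperature=0.1):
    """Contrastive-style loss with proxies standing in for contrastive samples.

    features: (batch, d) anchor embeddings
    labels:   (batch,) integer class ids
    proxies:  (num_classes, d) one learnable proxy per class
    Assumption: only proxies of classes appearing in `labels` are candidates.
    """
    z = l2_normalize(features)
    w = l2_normalize(proxies)
    classes = np.unique(labels)                  # sorted class ids in this batch
    logits = z @ w[classes].T / temperature      # (batch, classes_in_batch)
    positive = np.searchsorted(classes, labels)  # column of each anchor's own proxy
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), positive].mean()

proxies = np.eye(3)
anchors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(pcr_style_loss(anchors, np.array([0, 1]), proxies))
```

Restricting the denominator to classes in the batch means proxies of absent classes receive no gradient from that batch, which is one plausible way to limit the pull toward new classes.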
UER: A Heuristic Bias Addressing Approach for Online Continual Learning
Online continual learning aims to continuously train neural networks from a
continuous data stream with a single pass through the data. As the most effective
approach, the rehearsal-based methods replay part of previous data. Commonly
used predictors in existing methods tend to generate biased dot-product logits
that favor the classes of current data, which is known as a bias issue and
a phenomenon of forgetting. Many approaches have been proposed to overcome the
forgetting problem by correcting the bias; however, they still need to be
improved in the online setting. In this paper, we try to address the bias issue with
a more straightforward and more efficient method. By decomposing the
dot-product logits into an angle factor and a norm factor, we empirically find
that the bias problem mainly occurs in the angle factor, which can be used to
learn novel knowledge as cosine logits. In contrast, the norm factor
abandoned by existing methods helps remember historical knowledge. Based on
this observation, we intuitively propose to leverage the norm factor to balance
the new and old knowledge for addressing the bias. To this end, we develop a
heuristic approach called unbias experience replay (UER). UER learns current
samples only by the angle factor and further replays previous samples by both
the norm and angle factors. Extensive experiments on three datasets show that
UER achieves superior performance over various state-of-the-art methods. The
code is available at https://github.com/FelixHuiweiLin/UER.
Comment: 9 pages, 12 figures, ACM MM202
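The decomposition described above follows from the identity w·f = ‖w‖‖f‖cos θ. The sketch below splits dot-product logits into a norm factor and an angle factor, then combines them as the abstract outlines: current samples use only the angle factor, replayed samples use both. Treating the norm factor as ‖f‖·‖w‖, the `scale` applied to cosine logits, and the function names are assumptions for illustration, not the released implementation.

```python
import numpy as np

def decompose_logits(features, weights, eps=1e-12):
    """Split dot-product logits into norm and angle factors.

    features: (batch, d), weights: (num_classes, d)
    Returns norm_factor = ||f|| * ||w|| and angle_factor = cos(theta),
    so that norm_factor * angle_factor equals features @ weights.T.
    """
    f_norm = np.linalg.norm(features, axis=1, keepdims=True)   # (batch, 1)
    w_norm = np.linalg.norm(weights, axis=1, keepdims=True)    # (classes, 1)
    norm_factor = f_norm * w_norm.T                            # (batch, classes)
    angle_factor = (features @ weights.T) / np.maximum(norm_factor, eps)
    return norm_factor, angle_factor

def cross_entropy(logits, labels):
    logits = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

def uer_style_loss(cur_f, cur_y, mem_f, mem_y, weights, scale=10.0):
    """Current samples: angle factor only. Replayed samples: norm and angle."""
    _, cur_angle = decompose_logits(cur_f, weights)
    mem_norm, mem_angle = decompose_logits(mem_f, weights)
    current_loss = cross_entropy(scale * cur_angle, cur_y)     # cosine logits (scale assumed)
    replay_loss = cross_entropy(mem_norm * mem_angle, mem_y)   # full dot-product logits
    return current_loss + replay_loss

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 5))
cur, mem = rng.normal(size=(3, 5)), rng.normal(size=(2, 5))
print(uer_style_loss(cur, np.array([3, 3, 2]), mem, np.array([0, 1]), W))
```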