4 research outputs found
A Generic Network Compression Framework for Sequential Recommender Systems
Sequential recommender systems (SRS) have become the key technology in
capturing users' dynamic interests and generating high-quality recommendations.
Current state-of-the-art sequential recommender models are typically based on a
sandwich-structured deep neural network, where one or more middle (hidden)
layers are placed between the input embedding layer and output softmax layer.
In general, these models require a large number of parameters (such as using a
large embedding dimension or a deep network architecture) to obtain their
optimal performance. Despite their effectiveness, further increasing the model
size makes deployment on resource-constrained devices harder, resulting in
longer response times and a larger memory footprint. To resolve these issues,
we propose a compressed sequential recommendation framework, termed CpRec, in
which two generic model-shrinking techniques are
employed. Specifically, we first propose a block-wise adaptive decomposition to
approximate the input and softmax matrices by exploiting the fact that items in
SRS obey a long-tailed distribution. To reduce the parameters of the middle
layers, we introduce three layer-wise parameter sharing schemes. We instantiate
CpRec using a deep convolutional neural network with dilated kernels, taking
both recommendation accuracy and efficiency into consideration. Through
extensive ablation studies, we demonstrate that the proposed CpRec can achieve up to
4~8 times compression rates on real-world SRS datasets. Meanwhile, CpRec is
faster during training/inference and, in most cases, outperforms its
uncompressed counterpart.
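As a rough illustration of the block-wise adaptive decomposition idea, the
PyTorch sketch below gives frequent ("head") items a wide embedding while the
long-tailed blocks use narrower embeddings projected back to the model width.
The block boundaries, dimensions, and names (BlockwiseAdaptiveEmbedding,
cutoffs, dims) are assumptions chosen for the example, not the paper's actual
settings or code.

import torch
import torch.nn as nn

class BlockwiseAdaptiveEmbedding(nn.Module):
    def __init__(self, cutoffs=(20000, 100000, 500000), dims=(256, 64, 16)):
        # cutoffs: item-id boundaries from the most to the least popular block
        # dims:    per-block embedding width (tail blocks are narrower)
        super().__init__()
        self.cutoffs = (0,) + tuple(cutoffs)
        self.embeds = nn.ModuleList()
        self.projs = nn.ModuleList()
        for i, d in enumerate(dims):
            block_size = self.cutoffs[i + 1] - self.cutoffs[i]
            self.embeds.append(nn.Embedding(block_size, d))
            # project narrow tail embeddings back to the model width dims[0]
            self.projs.append(nn.Linear(d, dims[0], bias=False))

    def forward(self, item_ids):
        out = torch.zeros(*item_ids.shape, self.projs[0].out_features)
        for i in range(len(self.embeds)):
            lo, hi = self.cutoffs[i], self.cutoffs[i + 1]
            mask = (item_ids >= lo) & (item_ids < hi)
            if mask.any():
                local = self.embeds[i](item_ids[mask] - lo)
                out[mask] = self.projs[i](local)
        return out

emb = BlockwiseAdaptiveEmbedding()
print(emb(torch.randint(0, 500000, (8, 50))).shape)  # (8, 50, 256)

A symmetric factorization can be applied to the softmax (output) matrix, which
is where the long-tailed item distribution yields most of the parameter
savings.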
Device-Cloud Collaborative Learning for Recommendation
With the rapid development of storage and computing power on mobile devices,
it has become critical and popular to deploy models on devices to avoid
onerous communication latency and to capture real-time features. While many
works have explored facilitating on-device learning and inference, most of them
focus on response delay or privacy protection. Little has been done to model
the collaboration between device and cloud modeling so that both sides benefit
jointly. To bridge this gap, we make one of the first attempts to study the
Device-Cloud Collaborative Learning (DCCL) framework.
Specifically, we propose a novel MetaPatch learning approach on the device side
to efficiently achieve "thousands of people with thousands of models" given a
centralized cloud model. Then, with billions of updated personalized device
models, we propose a "model-over-models" distillation algorithm, namely
MoMoDistill, to update the centralized cloud model. Our extensive experiments
over a range of datasets with different settings demonstrate the effectiveness
of such collaboration on both the cloud and devices, especially its
superiority in modeling long-tailed users.
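As a rough, device-side illustration (not the authors' actual MetaPatch or
MoMoDistill algorithms), the PyTorch sketch below freezes a cloud-trained layer
and trains only a small low-rank "patch" whose weights are generated from a few
meta parameters; the layer type, sizes, and names (PatchedLayer, meta_dim) are
assumptions made for the example.

import torch
import torch.nn as nn

class PatchedLayer(nn.Module):
    def __init__(self, hidden=256, meta_dim=8):
        super().__init__()
        self.backbone = nn.Linear(hidden, hidden)    # shipped from the cloud
        self.backbone.weight.requires_grad_(False)   # frozen on the device
        self.backbone.bias.requires_grad_(False)
        # meta parameters: a low-rank factorization that generates the patch
        self.u = nn.Parameter(torch.zeros(hidden, meta_dim))
        self.v = nn.Parameter(torch.randn(meta_dim, hidden) * 0.01)

    def forward(self, x):
        patch_weight = self.u @ self.v               # hidden x hidden patch
        return self.backbone(x) + x @ patch_weight

layer = PatchedLayer()
out = layer(torch.randn(4, 256))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))
# only the 2 * 256 * 8 meta parameters are updated on the device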
One Person, One Model, One World: Learning Continual User Representation without Forgetting
Learning generic user representations which can then be applied to other
user-related tasks (e.g., profile prediction and recommendation) has recently
attracted much attention. Existing approaches often derive an individual set
of model parameters for each task by training on its own data. However, the
representations of a user across tasks usually share commonalities. As such,
these separately trained representations can be suboptimal in performance as
well as inefficient in terms of parameter sharing. In this paper, we delve into
continually learning user representations task by task, whereby new tasks are
learned while reusing parameters from old ones. A new problem arises: when new
tasks are trained, previously learned parameters are very likely to be
modified, and thus an artificial neural network (ANN)-based model may
permanently lose its capacity to serve well-trained previous tasks, a
phenomenon termed catastrophic forgetting. To address this issue, we present
Conure, the first continual, or lifelong, user representation learner -- i.e.,
one that learns new tasks over time without forgetting old ones. Specifically,
we propose iteratively pruning unimportant weights of a well-optimized backbone
representation model, motivated by the fact that neural networks are highly
over-parameterized. We can then learn an incoming task by sharing the previous
parameters and training new ones only in the space freed by pruning. We conduct
extensive experiments on two real-world datasets across nine tasks and
demonstrate that Conure performs substantially better than common models that
do not purposely preserve such old "knowledge", and is competitive with, or
sometimes better than, models trained either individually for each task or
simultaneously on all task data together.
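A minimal sketch of the prune-then-refill mechanism described above, assuming
simple magnitude pruning on one shared weight tensor and a 30% keep ratio (both
illustrative choices, not Conure's actual criterion or schedule): positions
kept for an earlier task are frozen, and the next task trains only in the freed
positions.

import torch

weight = torch.randn(512, 512, requires_grad=True)   # shared backbone weight

def prune_mask(w, keep_ratio=0.3):
    # keep the largest-magnitude entries; the rest become free capacity
    k = int(w.numel() * keep_ratio)
    threshold = w.abs().flatten().kthvalue(w.numel() - k).values
    return (w.abs() > threshold).float()

task1_mask = prune_mask(weight.detach())              # weights kept for task 1
free_mask = 1.0 - task1_mask                          # capacity left for task 2

loss = weight.pow(2).sum()                            # stand-in task-2 loss
loss.backward()
with torch.no_grad():
    weight.grad *= free_mask                          # task-1 weights untouched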
Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation
Inductive transfer learning has had a big impact on the computer vision and
NLP domains but has not been used in the area of recommender systems. Even
though there has been a large body of research on generating recommendations by
modeling user-item interaction sequences, few of these works attempt to
represent and transfer such models for serving downstream tasks where only
limited data exists.
In this paper, we delve into the task of effectively learning a single user
representation that can be applied to a diversity of tasks, from cross-domain
recommendations to user profile predictions. Fine-tuning a large pre-trained
network and adapting it to downstream tasks is an effective way to solve such
tasks. However, fine-tuning is parameter-inefficient, considering that an entire
model needs to be re-trained for every new task. To overcome this issue, we
develop a parameter-efficient transfer learning architecture, termed PeterRec,
which can be configured on the fly for various downstream tasks.
Specifically, PeterRec allows the pre-trained parameters to remain unaltered
during fine-tuning by injecting a series of re-learned neural networks, which
are small but as expressive as learning the entire network. We perform
extensive experimental ablation to show the effectiveness of the learned user
representation in five downstream tasks. Moreover, we show that PeterRec
performs efficient transfer learning in multiple domains, where it achieves
comparable or sometimes better performance relative to fine-tuning all of the
model parameters. Code and datasets are available at
https://github.com/fajieyuan/sigir2020_peterrec
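The sketch below (PyTorch) illustrates the general mechanism the abstract
describes: the pre-trained layer stays frozen while a small injected network, a
bottleneck residual block here, carries the task-specific adaptation. The sizes
and the names ModelPatch and PatchedBlock are chosen for illustration; see the
linked repository for the authors' actual implementation.

import torch
import torch.nn as nn

class ModelPatch(nn.Module):
    def __init__(self, hidden=256, bottleneck=16):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # small residual patch

class PatchedBlock(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.pretrained = nn.Linear(hidden, hidden)   # stands in for a layer
        for p in self.pretrained.parameters():
            p.requires_grad_(False)                   # unaltered in fine-tuning
        self.patch = ModelPatch(hidden)

    def forward(self, x):
        return self.patch(self.pretrained(x))

block = PatchedBlock()
block(torch.randn(4, 256))
print(sum(p.numel() for p in block.parameters() if p.requires_grad))
# only the injected patch parameters are trained per downstream task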