4 research outputs found

    A Generic Network Compression Framework for Sequential Recommender Systems

    Sequential recommender systems (SRS) have become the key technology for capturing users' dynamic interests and generating high-quality recommendations. Current state-of-the-art sequential recommender models are typically based on a sandwich-structured deep neural network, where one or more middle (hidden) layers are placed between the input embedding layer and the output softmax layer. In general, these models require a large number of parameters (for example, a large embedding dimension or a deep network architecture) to reach their best performance. Despite this effectiveness, at some point further increasing model size makes deployment on resource-constrained devices harder, resulting in longer response times and a larger memory footprint. To resolve these issues, we propose a compressed sequential recommendation framework, termed CpRec, which employs two generic model shrinking techniques. Specifically, we first propose a block-wise adaptive decomposition to approximate the input and softmax matrices by exploiting the fact that items in SRS obey a long-tailed distribution. To reduce the parameters of the middle layers, we introduce three layer-wise parameter sharing schemes. We instantiate CpRec using a deep convolutional neural network with dilated kernels, giving consideration to both recommendation accuracy and efficiency. Through extensive ablation studies, we demonstrate that the proposed CpRec can achieve compression rates of up to 4 to 8 times on real-world SRS datasets. Meanwhile, CpRec is faster during training and inference, and in most cases outperforms its uncompressed counterpart. Comment: Accepted by SIGIR202
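The block-wise idea in the abstract above can be illustrated with a rough parameter count. This is only a sketch under stated assumptions: items are taken to be sorted by frequency and split into blocks, the most frequent block keeps the full embedding dimension, and each rarer block uses a smaller dimension plus a projection back to the full one. The function names, block sizes, and dimensions are hypothetical and are not taken from the paper.

```python
def full_embedding_params(num_items, dim):
    """Parameter count of one dense num_items x dim embedding matrix."""
    return num_items * dim


def blockwise_embedding_params(block_sizes, block_dims, dim):
    """Parameter count when each frequency block has its own (smaller) dimension.

    Assumption: a block whose dimension differs from `dim` also needs a
    block_dim x dim projection so every item ends up in the same space.
    """
    total = 0
    for size, block_dim in zip(block_sizes, block_dims):
        total += size * block_dim          # the block's low-dimensional table
        if block_dim != dim:
            total += block_dim * dim       # projection back to the model dimension
    return total


# Made-up long-tailed catalogue: few popular items, many rare ones.
block_sizes = [10_000, 40_000, 950_000]
block_dims = [512, 128, 32]
model_dim = 512

full = full_embedding_params(sum(block_sizes), model_dim)
compressed = blockwise_embedding_params(block_sizes, block_dims, model_dim)
print(full, compressed, round(full / compressed, 2))
```

The ratio printed here concerns only this toy embedding matrix and says nothing about the whole-model compression rates reported in the abstract.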

    Device-Cloud Collaborative Learning for Recommendation

    With the rapid development of storage and computing power on mobile devices, it has become critical and popular to deploy models on devices to avoid onerous communication latencies and to capture real-time features. While many works have explored on-device learning and inference, most of them focus on response delay or privacy protection. Little has been done to model the collaboration between device and cloud modeling so that both sides benefit jointly. To bridge this gap, we make one of the first attempts to study the Device-Cloud Collaborative Learning (DCCL) framework. Specifically, we propose a novel MetaPatch learning approach on the device side to efficiently achieve "thousands of people with thousands of models" given a centralized cloud model. Then, with billions of updated personalized device models, we propose a "model-over-models" distillation algorithm, namely MoMoDistill, to update the centralized cloud model. Our extensive experiments over a range of datasets with different settings demonstrate the effectiveness of such collaboration on both cloud and devices, especially its superiority in modeling long-tailed users. Comment: A new version will be updated soon

    One Person, One Model, One World: Learning Continual User Representation without Forgetting

    Learning generic user representations which can then be applied to other user-related tasks (e.g., profile prediction and recommendation) has recently attracted much attention. Existing approaches often derive an individual set of model parameters for each task by training on that task's own data. However, the representations of a user usually share some commonalities. As such, these separately trained representations can be suboptimal in performance as well as inefficient in terms of parameter sharing. In this paper, we study how to continually learn user representations task by task, whereby new tasks are learned while using parameters from old ones. A new problem arises: when new tasks are trained, previously learned parameters are very likely to be modified, and thus an artificial neural network (ANN)-based model may permanently lose its capacity to serve well-trained previous tasks, a phenomenon termed catastrophic forgetting. To address this issue, we present Conure, which is the first continual, or lifelong, user representation learner, i.e., one that learns new tasks over time without forgetting old ones. Specifically, we propose iteratively removing unimportant weights by pruning a well-optimized backbone representation model, motivated by the fact that neural network models are highly over-parameterized. Then, we are able to learn a coming task by sharing previous parameters and training new ones only on the empty space left after pruning. We conduct extensive experiments on two real-world datasets across nine tasks and demonstrate that Conure performs substantially better than common models that do not purposely preserve such old "knowledge", and is competitive with, or sometimes better than, models which are trained either individually for each task or simultaneously by preparing all task data together.
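The prune-then-reuse step described in the abstract above can be sketched on a flat list of weights. This is a minimal illustration, not the authors' implementation: magnitude is assumed as the importance criterion (the abstract does not name one), and `prune_mask`, `apply_mask`, and `train_new_task_step` are hypothetical helper names.

```python
def prune_mask(weights, prune_fraction):
    """Return one boolean per weight: True = kept for the old task, False = freed.

    Assumption: the smallest-magnitude weights are the "unimportant" ones.
    """
    n_prune = int(len(weights) * prune_fraction)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    freed = set(by_magnitude[:n_prune])
    return [i not in freed for i in range(len(weights))]


def apply_mask(weights, kept_mask):
    """Zero out the freed weights so they become empty space for a later task."""
    return [w if kept else 0.0 for w, kept in zip(weights, kept_mask)]


def train_new_task_step(weights, kept_mask, grads, lr):
    """One gradient step that updates only freed positions; kept weights stay frozen."""
    return [w if kept else w - lr * g
            for w, kept, g in zip(weights, kept_mask, grads)]


old_task_weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.1]
mask = prune_mask(old_task_weights, prune_fraction=0.5)
pruned = apply_mask(old_task_weights, mask)

# Made-up gradients from a new task; only the freed slots may move.
new_weights = train_new_task_step(pruned, mask, grads=[1.0] * 6, lr=0.1)
print(mask)
print(new_weights)
```

Because the kept positions never change, the old task can still be served by reading the weights through its own mask, which is the property the abstract describes as learning without forgetting.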

    Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation

    Inductive transfer learning has had a big impact on the computer vision and NLP domains but has not been used in the area of recommender systems. Even though there has been a large body of research on generating recommendations by modeling user-item interaction sequences, few works attempt to represent and transfer these models to serve downstream tasks where only limited data exists. In this paper, we study the task of effectively learning a single user representation that can be applied to a diversity of tasks, from cross-domain recommendations to user profile predictions. Fine-tuning a large pre-trained network and adapting it to downstream tasks is an effective way to solve such tasks. However, fine-tuning is parameter inefficient, considering that an entire model needs to be re-trained for every new task. To overcome this issue, we develop a parameter-efficient transfer learning architecture, termed PeterRec, which can be configured on-the-fly for various downstream tasks. Specifically, PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks, which are small but as expressive as learning the entire network. We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks. Moreover, we show that PeterRec performs efficient transfer learning in multiple domains, where it achieves comparable or sometimes better performance relative to fine-tuning the entire model parameters. Code and datasets are available at https://github.com/fajieyuan/sigir2020_peterrec
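The idea of keeping pre-trained parameters untouched while injecting small trainable networks, as described in the abstract above, can be sketched as follows. This is a hedged illustration only: a bottleneck patch with a residual connection is assumed as the form of the injected network, and all names, shapes, and values are hypothetical rather than taken from the PeterRec code.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


def patch(hidden, down, up):
    """Small injected network: hidden + up(relu(down(hidden)))."""
    bottleneck = relu(matvec(down, hidden))
    delta = matvec(up, bottleneck)
    return [h + d for h, d in zip(hidden, delta)]


def forward(x, frozen_layer, down, up):
    """Frozen pre-trained layer followed by the trainable patch."""
    hidden = relu(matvec(frozen_layer, x))
    return patch(hidden, down, up)


# Pre-trained 4x4 layer: shared across tasks and never updated.
frozen_layer = [[0.5, -0.2, 0.1, 0.0],
                [0.3, 0.8, -0.5, 0.2],
                [-0.1, 0.4, 0.9, -0.3],
                [0.0, 0.1, 0.2, 0.7]]
# Per-task patch: 4 -> 1 -> 4 bottleneck, with `up` zero-initialised so the
# patched model starts out identical to the pre-trained one.
down = [[0.1, 0.2, -0.1, 0.3]]
up = [[0.0], [0.0], [0.0], [0.0]]

x = [1.0, 2.0, 3.0, 4.0]
backbone_out = relu(matvec(frozen_layer, x))
patched_out = forward(x, frozen_layer, down, up)

frozen_params = sum(len(row) for row in frozen_layer)
trainable_params = sum(len(row) for row in down) + sum(len(row) for row in up)
print(patched_out == backbone_out, frozen_params, trainable_params)
```

In this toy setting each new task would store only the `down` and `up` matrices, while the frozen layer is reused as-is; the sizes are far too small to say anything about the savings reported for the real model.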