Collaborative Distillation for Top-N Recommendation
Knowledge distillation (KD) is a well-known method for reducing inference
latency by compressing a cumbersome teacher model into a small student model.
Despite the success of KD in the classification task, applying KD to
recommender models is challenging due to the sparsity of positive feedback, the
ambiguity of missing feedback, and the ranking problem associated with
top-N recommendation. To address these issues, we propose a new KD model for the
collaborative filtering approach, namely collaborative distillation (CD).
Specifically, (1) we reformulate a loss function to deal with the ambiguity of
missing feedback. (2) We exploit probabilistic rank-aware sampling for
top-N recommendation. (3) To train the proposed model effectively, we develop
two training strategies for the student model, called the teacher- and
student-guided training methods, which select the most useful feedback from the
teacher model. Via experimental results, we demonstrate that the proposed model
outperforms the state-of-the-art method by 2.7-33.2% and 2.7-29.1% in hit rate
(HR) and normalized discounted cumulative gain (NDCG), respectively. Moreover,
the proposed model achieves performance comparable to the teacher model.
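For intuition, below is a minimal PyTorch sketch of the kind of objective the abstract describes: a supervised term on observed positives plus a distillation term on unobserved items sampled with rank-aware probability from the teacher. The function names (`rank_aware_sample`, `cd_style_loss`) and the softmax-over-ranks weighting are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def rank_aware_sample(teacher_scores, num_samples, temperature=10.0):
    # Rank items by teacher score; higher-ranked items get higher sampling
    # probability, so distillation focuses on the teacher's top-N region.
    ranks = teacher_scores.argsort(descending=True).argsort().float()
    probs = torch.softmax(-ranks / temperature, dim=-1)
    return torch.multinomial(probs, num_samples, replacement=False)

def cd_style_loss(student_scores, teacher_scores, positive_idx, num_distill=10):
    # Supervised term on observed (positive) feedback.
    sup = F.binary_cross_entropy_with_logits(
        student_scores[positive_idx],
        torch.ones_like(student_scores[positive_idx]))
    # Distillation term on rank-sampled unobserved items, treating the
    # teacher's sigmoid outputs as soft targets for ambiguous feedback.
    idx = rank_aware_sample(teacher_scores.detach(), num_distill)
    soft_targets = torch.sigmoid(teacher_scores[idx]).detach()
    distill = F.binary_cross_entropy_with_logits(student_scores[idx], soft_targets)
    return sup + distill

# Toy usage: scores over 100 items for one user, with 3 observed positives.
student_scores = torch.randn(100, requires_grad=True)
teacher_scores = torch.randn(100)
loss = cd_style_loss(student_scores, teacher_scores,
                     positive_idx=torch.tensor([3, 17, 42]))
loss.backward()
```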
On Estimating the Training Cost of Conversational Recommendation Systems
Conversational recommendation systems have recently gained considerable attention,
as users can continuously interact with the system over multiple conversational
turns. However, conversational recommendation systems are based on complex
neural architectures, and thus the training cost of such models is high. To shed
light on the high computational training time of state-of-the-art
conversational models, we examine five representative strategies and
demonstrate this issue. Furthermore, we discuss possible ways to cope with the
high training cost by following knowledge distillation strategies, and we detail
the key challenges of reducing the online inference time incurred by the large
number of model parameters in conversational recommendation systems.
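As a rough illustration of how per-epoch training cost can be estimated when comparing strategies, the sketch below times one training epoch by wall clock. It assumes a standard PyTorch training loop; `epoch_training_cost` and its arguments are illustrative names, not an interface from the paper.

```python
import time
import torch

def epoch_training_cost(model, loader, optimizer, loss_fn, device="cpu"):
    # Wall-clock time of one training epoch: a simple proxy for the
    # computational training cost compared across model strategies.
    model.train()
    start = time.perf_counter()
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs.to(device)), targets.to(device))
        loss.backward()
        optimizer.step()
    return time.perf_counter() - start
```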