Towards Neural Mixture Recommender for Long Range Dependent User Sequences
Understanding temporal dynamics has proved to be highly valuable for accurate
recommendation. Sequential recommenders have been successful in modeling the
dynamics of users and items over time. However, while different model
architectures excel at capturing different temporal ranges and dynamics,
distinct application contexts require adapting to diverse behaviors. In this paper we
examine how to build a model that can make use of different temporal ranges and
dynamics depending on the request context. We begin with the analysis of an
anonymized YouTube dataset comprising millions of user sequences. We quantify
the degree of long-range dependence in these sequences and demonstrate that
both short-term and long-term dependent behavioral patterns co-exist. We then
propose a neural Multi-temporal-range Mixture Model (M3) as a tailored solution
to deal with both short-term and long-term dependencies. Our approach employs a
mixture of models, each with a different temporal range. These models are
combined by a learned gating mechanism capable of exerting different model
combinations given different contextual information. In empirical evaluations
on a public dataset and our own anonymized YouTube dataset, M3 consistently
outperforms state-of-the-art sequential recommendation methods.

Comment: Accepted at WWW 201
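As a rough illustration of the gated-mixture idea described above, the following is a minimal numpy sketch. The three fixed-form encoders and the single linear gate are illustrative assumptions for exposition only, not the paper's actual architecture (which uses trained neural encoders of different temporal ranges):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8   # embedding dimension
T = 12  # length of the user's behavior sequence

def encode_short(seq):
    # Short-range view: only the most recent item.
    return seq[-1]

def encode_mid(seq, window=4):
    # Mid-range view: average over a recent window.
    return seq[-window:].mean(axis=0)

def encode_long(seq):
    # Long-range view: average over the entire sequence.
    return seq.mean(axis=0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical learned gate: a linear map from context features to one
# logit per expert (these weights would be trained in the real model).
W_gate = rng.normal(size=(d, 3))

def m3_encode(seq, context):
    experts = np.stack([encode_short(seq),
                        encode_mid(seq),
                        encode_long(seq)])     # (3, d)
    gate = softmax(context @ W_gate)           # (3,) context-dependent weights
    return gate @ experts                      # mixture of temporal ranges

seq = rng.normal(size=(T, d))      # stand-in for embedded watch history
context = rng.normal(size=d)       # stand-in for request-context features
user_repr = m3_encode(seq, context)
print(user_repr.shape)  # (8,)
```

The key point the sketch captures is that the gate, not a fixed rule, decides how much weight each temporal range receives for a given request context.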
Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation
Sequential recommender systems aim to model users' evolving interests from
their historical behaviors, and hence make customized time-relevant
recommendations. Compared with traditional models, deep learning approaches
such as CNN and RNN have achieved remarkable advancements in recommendation
tasks. Recently, the BERT framework has also emerged as a promising method,
benefiting from its self-attention mechanism for processing sequential data.
However, one limitation of the original BERT framework is that it considers
only a single input source, the natural language tokens. How to leverage
various types of information under the BERT framework remains an open question.
Nonetheless, it is intuitively appealing to utilize other side information,
such as item category or tag, for more comprehensive depictions and better
recommendations. In our pilot experiments, we found that naive approaches,
which directly fuse side information into the item embeddings, usually bring
very little or even negative gains. Therefore, in this paper, we propose the
NOninVasive self-attention mechanism (NOVA) to leverage side information
effectively under the BERT framework. NOVA uses side information to
generate better attention distributions, rather than directly altering the item
embeddings, which may overwhelm the item information. We validate the NOVA-BERT
model on both public and commercial datasets, and our method stably
outperforms state-of-the-art models with negligible computational overhead.

Comment: Accepted at AAAI 202
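The contrast between invasive and non-invasive fusion can be sketched in a few lines of numpy. This is a simplified single-head illustration under assumed additive fusion, not the NOVA-BERT implementation itself:

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 6, 8

item = rng.normal(size=(T, d))  # pure item embeddings
side = rng.normal(size=(T, d))  # e.g., embedded category/tag features

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Standard scaled dot-product attention.
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ V

# Invasive (naive) fusion: side info is added into the embeddings,
# so it leaks into the values and alters the item representations.
fused = item + side
invasive_out = attention(fused, fused, fused)

# Non-invasive (NOVA-style) fusion: side info shapes only the attention
# distribution through queries/keys; values stay pure item embeddings.
nova_out = attention(fused, fused, item)

print(nova_out.shape)  # (6, 8)
```

The design choice the sketch highlights: in the non-invasive variant, side information can only reweight which items attend to which, never overwrite what an item's representation contains.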