Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
User response prediction, which models user preference w.r.t. the
presented items, plays a key role in online services. After two decades of
rapid development, the accumulated user behavior sequences on mature
Internet service platforms have become extremely long, often stretching
back to a user's first registration. Each user not only has intrinsic
tastes but also keeps changing her personal interests over her lifetime.
Hence, it is challenging to handle such lifelong sequential modeling for
each individual user. Existing methodologies for sequential modeling can
only deal with relatively recent user behaviors, which leaves huge space
for modeling long-term, and especially lifelong, sequential patterns to
facilitate user modeling. Moreover, one user's behavior may be accounted
for by various previous behaviors within her whole online activity
history, i.e., a long-term dependency with multi-scale sequential
patterns. To tackle these challenges, in this paper we propose a
Hierarchical Periodic Memory Network for lifelong sequential modeling with
personalized memorization of sequential patterns for each user. The model
also adopts a hierarchical and periodic updating mechanism to capture
multi-scale sequential patterns of user interests while supporting
ever-growing user behavior logs. Experimental results on three large-scale
real-world datasets demonstrate the advantages of the proposed model, with
significant improvements in user response prediction performance against
the state of the art.
Comment: SIGIR 2019. Reproducible codes and datasets:
https://github.com/alimamarankgroup/HPM
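The hierarchical and periodic updating mechanism can be illustrated with a
minimal sketch; the slot layout, the periods, and the `write` function below
are hypothetical placeholders, not the paper's actual parameterization. The
idea is that each memory layer refreshes on its own schedule, so lower layers
track short-term interests while higher layers retain long-term patterns.

```python
def periodic_update(memories, periods, step, behavior, write):
    # Hedged sketch of a hierarchical periodic memory update:
    # layer i is rewritten only every periods[i] steps, so layers
    # with longer periods preserve longer-term sequential patterns.
    return [write(m, behavior) if step % p == 0 else m
            for m, p in zip(memories, periods)]

# Toy usage with scalar "memories" and additive writes (both hypothetical):
mems = periodic_update([0.0, 0.0], periods=[1, 2], step=1,
                       behavior=5.0, write=lambda m, b: m + b)
# layer 0 (period 1) absorbs the behavior; layer 1 (period 2) is untouched
```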
Multi-Factors Aware Dual-Attentional Knowledge Tracing
With the increasing demand for personalized learning, knowledge tracing,
which traces students' knowledge states based on their historical
practice, has become important. Factor analysis methods mainly use two
kinds of factors, related to students and to questions respectively, to
model students' knowledge states. These methods use a student's total
number of attempts to model learning progress and hardly highlight the
impact of the most recent relevant practice. Besides, current factor
analysis methods ignore the rich information contained in questions. In
this paper, we propose the Multi-Factors Aware Dual-Attentional model
(MF-DAKT), which enriches question representations and utilizes multiple
factors to model students' learning progress based on a dual-attentional
mechanism. More specifically, we propose a novel student-related factor
that records a student's most recent attempts on relevant concepts,
highlighting the impact of recent exercises. To enrich question
representations, we use a pre-training method to incorporate two kinds of
question information: question relations and difficulty levels. We also
add a regularization term on question difficulty to constrain the
pre-trained question representations during fine-tuning for student
performance prediction. Moreover, we apply a dual-attentional mechanism to
differentiate the contributions of factors and factor interactions to the
final prediction across different practice records. Finally, we conduct
experiments on several real-world datasets; the results show that MF-DAKT
outperforms existing knowledge tracing methods. We also conduct several
studies to validate the effect of each component of MF-DAKT.
Comment: Accepted by CIKM 2021, 10 pages, 10 figures, 6 tables
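The recent-attempts factor can be sketched as a simple count over a practice
history. The window size and the (concept, correctness) record format below
are illustrative assumptions, not MF-DAKT's exact formulation.

```python
def recent_attempt_factor(history, target_concepts, window=5):
    # Hypothetical sketch: count a student's attempts on concepts
    # relevant to the next question, restricted to the last `window`
    # practice records, so stale attempts stop dominating the factor.
    recent = history[-window:]
    return sum(1 for concept, _correct in recent if concept in target_concepts)

# The student practiced concept "a" twice, but only once recently:
history = [("a", 1), ("b", 0), ("a", 1), ("c", 1)]
recent_attempt_factor(history, {"a"}, window=3)  # counts the single recent "a"
```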
Deep Interest Evolution Network for Click-Through Rate Prediction
Click-through rate~(CTR) prediction, whose goal is to estimate the
probability that a user clicks, has become one of the core tasks in
advertising systems. For a CTR prediction model, it is necessary to
capture the latent user interest behind the user behavior data. Moreover,
as the external environment and the user's internal cognition change, user
interest evolves dynamically over time. Several CTR prediction methods
address interest modeling, but most of them regard the behavior
representation directly as the interest and lack explicit modeling of the
latent interest behind concrete behaviors. Moreover, few works consider
the changing trend of interest. In this paper, we propose a novel model,
named Deep Interest Evolution Network~(DIEN), for CTR prediction.
Specifically, we design an interest extractor layer to capture temporal
interests from the history behavior sequence. At this layer, we introduce
an auxiliary loss to supervise interest extraction at each step. As user
interests are diverse, especially in e-commerce systems, we propose an
interest evolving layer to capture the interest evolution process that is
relevant to the target item. At the interest evolving layer, the attention
mechanism is embedded into the sequential structure, strengthening the
effect of relevant interests during interest evolution. In experiments on
both public and industrial datasets, DIEN significantly outperforms
state-of-the-art solutions. Notably, DIEN has been deployed in the display
advertising system of Taobao and obtained a 20.7\% improvement in CTR.
Comment: 9 pages. Accepted by AAAI 201
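One common reading of embedding attention into the recurrent update is an
attentional update gate (often written as AUGRU): the attention score of a
behavior w.r.t. the target item rescales the GRU update gate. The
element-wise sketch below simplifies away the gate computations themselves,
which are assumed to be produced by a standard GRU cell.

```python
def augru_step(h_prev, h_tilde, update_gate, att_score):
    # Attentional update gate (sketch): the attention score of the
    # current behavior w.r.t. the target item rescales the GRU update
    # gate, so target-irrelevant behaviors barely move the hidden state.
    u = [att_score * g for g in update_gate]
    return [(1 - ug) * hp + ug * ht
            for ug, hp, ht in zip(u, h_prev, h_tilde)]

# att_score = 0: the hidden state is untouched by an irrelevant behavior.
augru_step([1.0], h_tilde=[5.0], update_gate=[0.5], att_score=0.0)  # [1.0]
```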
AutoAttention: Automatic Field Pair Selection for Attention in User Behavior Modeling
In click-through rate (CTR) prediction models, a user's interest is
usually represented as a fixed-length vector based on her history
behaviors. Recently, several methods have been proposed to learn an
attentive weight for each user behavior and conduct weighted sum pooling.
However, these methods only manually select several fields from the target
item side as the query to interact with the behaviors, neglecting the
other target item fields as well as user and context fields. Directly
including all of these fields in the attention may introduce noise and
deteriorate performance. In this paper, we propose a novel model named
AutoAttention, which includes all item/user/context side fields as the
query and assigns a learnable weight to each field pair between behavior
fields and query fields. Pruning these field pairs via the learnable
weights leads to automatic field pair selection, so as to identify and
remove noisy field pairs. Although more fields are included, the
computational cost of AutoAttention remains low thanks to a simple
attention function and field pair selection. Extensive experiments on a
public dataset and Tencent's production dataset demonstrate the
effectiveness of the proposed approach.
Comment: Accepted by ICDM 202
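The field-pair weighting can be sketched as follows; the dot-product pair
scorer and the hard pruning threshold are illustrative assumptions (the
paper's actual attention function and pruning criterion may differ).

```python
def pair_attention_score(behavior_fields, query_fields, pair_weights,
                         threshold=0.01):
    # Sketch: each (behavior-field, query-field) pair carries a learnable
    # weight; pairs whose weight magnitude falls below the threshold are
    # pruned, performing automatic field pair selection.
    score = 0.0
    for p, b_emb in enumerate(behavior_fields):
        for q, q_emb in enumerate(query_fields):
            w = pair_weights[p][q]
            if abs(w) < threshold:  # pruned pair contributes nothing
                continue
            score += w * sum(a * b for a, b in zip(b_emb, q_emb))
    return score

# One behavior field against two query fields; the second pair's tiny
# weight (0.001) is pruned, so only the first pair contributes.
pair_attention_score([[1.0, 0.0]], [[1.0, 1.0], [2.0, 2.0]],
                     [[0.5, 0.001]])  # 0.5
```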
Adaptive Factorization Network: Learning Adaptive-Order Feature Interactions
Various factorization-based methods have been proposed to leverage
second-order or higher-order cross features for boosting the performance
of predictive models. They generally enumerate all cross features under a
predefined maximum order and then identify useful feature interactions
through model training, which suffers from two drawbacks. First, they have
to trade off the expressiveness of higher-order cross features against the
computational cost, resulting in suboptimal predictions. Second,
enumerating all cross features, including irrelevant ones, may introduce
noisy feature combinations that degrade model performance. In this work,
we propose the Adaptive Factorization Network (AFN), a new model that
learns arbitrary-order cross features adaptively from data. The core of
AFN is a logarithmic transformation layer that converts the power of each
feature in a feature combination into a coefficient to be learned.
Experimental results on four real datasets demonstrate the superior
predictive performance of AFN against the state of the art.
Comment: Accepted by AAAI'2
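The logarithmic transformation rests on the identity
prod_j x_j^{w_j} = exp(sum_j w_j ln x_j) for positive inputs, so each
learned weight w_j becomes the power of feature j in a cross term. A
minimal scalar sketch follows; the real AFN operates on embedding vectors
with many logarithmic neurons.

```python
import math

def log_neuron(features, powers):
    # One logarithmic neuron: learnable powers turn feature exponents
    # into coefficients, letting the order of a cross feature be
    # learned (and even fractional) rather than enumerated.
    assert all(x > 0 for x in features)  # the log requires positive inputs
    return math.exp(sum(w * math.log(x) for x, w in zip(features, powers)))

# Powers [1, 1, 0] recover the plain 2nd-order cross x1 * x2:
log_neuron([2.0, 3.0, 5.0], [1.0, 1.0, 0.0])  # ≈ 6.0
```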
A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction
Click-through rate (CTR) prediction is widely used in academia and
industry. Most CTR tasks fall into a feature embedding \& feature
interaction paradigm, where the accuracy of CTR prediction is mainly
improved by designing practical feature interaction structures. However,
recent studies have argued that the fixed feature embedding learned only
through the embedding layer limits the performance of existing CTR models.
Some works apply extra modules on top of the embedding layer to
dynamically refine feature representations in different instances, which
is effective and easy to integrate with existing CTR methods. Despite the
promising results, there is a lack of a systematic review and
summarization of this promising new direction for the CTR task. To fill
this gap, we comprehensively summarize and define a new class of modules,
namely the \textbf{feature refinement} (FR) module, that can be applied
between the feature embedding and interaction layers. We extract 14 FR
modules from previous works, including instances where an FR module was
proposed but not clearly defined or explained. We fully assess the
effectiveness and compatibility of existing FR modules through
comprehensive and extensive experiments with over 200 augmented models and
over 4,000 runs, totaling more than 15,000 GPU hours. The results offer
insightful guidelines for researchers, and all benchmarking code and
experimental results are open-sourced. In addition, we present a new
architecture that assigns independent FR modules to separate sub-networks
of parallel CTR models, as opposed to the conventional method of inserting
a shared FR module on top of the embedding layer. Our approach is also
supported by comprehensive experiments demonstrating its effectiveness.
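The proposed placement can be sketched abstractly; the FR-module and
sub-network callables below are placeholders, not any specific FR design.

```python
def parallel_ctr_forward(embedding, fr_modules, subnets):
    # Independent FR modules: each sub-network of a parallel CTR model
    # refines the shared embedding with its own FR module, instead of
    # every sub-network consuming one shared refined embedding.
    return [net(fr(embedding)) for fr, net in zip(fr_modules, subnets)]

# Toy usage: one sub-network sees a scaled view, the other the raw one.
parallel_ctr_forward([1.0, 2.0],
                     fr_modules=[lambda e: [2 * x for x in e], lambda e: e],
                     subnets=[sum, sum])  # [6.0, 3.0]
```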
Enhancing CTR Prediction with Context-Aware Feature Representation Learning
CTR prediction has been widely used in the real world. Many methods model
feature interactions to improve their performance. However, most methods
learn only a fixed representation for each feature, without considering
the varying importance of each feature under different contexts, resulting
in inferior performance. Recently, several methods have tried to learn
vector-level weights for feature representations to address the
fixed-representation issue. However, they only produce linear
transformations to refine the fixed feature representations, which are
still not flexible enough to capture the varying importance of each
feature under different contexts. In this paper, we propose a novel module
named the Feature Refinement Network (FRNet), which learns context-aware
feature representations at the bit level for each feature in different
contexts. FRNet consists of two key components: 1) an Information
Extraction Unit (IEU), which captures contextual information and
cross-feature relationships to guide context-aware feature refinement; and
2) a Complementary Selection Gate (CSGate), which adaptively integrates
the original and complementary feature representations learned in the IEU
with bit-level weights. Notably, FRNet is orthogonal to existing CTR
methods and can thus be applied in many existing methods to boost their
performance. Comprehensive experiments are conducted to verify the
effectiveness, efficiency, and compatibility of FRNet.
Comment: SIGIR 202
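The bit-level integration can be sketched as a per-dimension gate; the
gate values here are supplied directly, whereas in FRNet they are learned,
and the exact combination rule (g·original + (1−g)·complementary) is an
assumption for illustration.

```python
def cs_gate(original, complementary, gate):
    # Bit-level complementary selection (sketch): each dimension mixes
    # the original embedding with the complementary representation
    # produced by the IEU, weighted by a per-bit gate value in [0, 1].
    return [g * o + (1 - g) * c
            for g, o, c in zip(gate, original, complementary)]

# gate = 1 keeps the original bit; gate = 0 takes the complementary bit.
cs_gate([1.0, 2.0], complementary=[3.0, 4.0], gate=[1.0, 0.0])  # [1.0, 4.0]
```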