640 research outputs found
Network On Network for Tabular Data Classification in Real-world Applications
Tabular data is the most common data format adopted by our customers ranging
from retail, finance to E-commerce, and tabular data classification plays an
essential role to their businesses. In this paper, we present Network On
Network (NON), a practical tabular data classification model based on deep
neural network to provide accurate predictions. Various deep methods have been
proposed and promising progress has been made. However, most of them use
operations like neural network and factorization machines to fuse the
embeddings of different features directly, and linearly combine the outputs of
those operations to get the final prediction. As a result, the intra-field
information and the non-linear interactions between those operations (e.g.
neural network and factorization machines) are ignored. Intra-field information
is the information that features inside each field belong to the same field.
NON is proposed to take full advantage of intra-field information and
non-linear interactions. It consists of three components: field-wise network at
the bottom to capture the intra-field information, across field network in the
middle to choose suitable operations data-drivenly, and operation fusion
network on the top to fuse outputs of the chosen operations deeply. Extensive
experiments on six real-world datasets demonstrate NON can outperform the
state-of-the-art models significantly. Furthermore, both qualitative and
quantitative study of the features in the embedding space show NON can capture
intra-field information effectively
Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
User response prediction, which models the user preference w.r.t. the
presented items, plays a key role in online services. With two-decade rapid
development, nowadays the cumulated user behavior sequences on mature Internet
service platforms have become extremely long since the user's first
registration. Each user not only has intrinsic tastes, but also keeps changing
her personal interests during lifetime. Hence, it is challenging to handle such
lifelong sequential modeling for each individual user. Existing methodologies
for sequential modeling are only capable of dealing with relatively recent user
behaviors, which leaves huge space for modeling long-term especially lifelong
sequential patterns to facilitate user modeling. Moreover, one user's behavior
may be accounted for various previous behaviors within her whole online
activity history, i.e., long-term dependency with multi-scale sequential
patterns. In order to tackle these challenges, in this paper, we propose a
Hierarchical Periodic Memory Network for lifelong sequential modeling with
personalized memorization of sequential patterns for each user. The model also
adopts a hierarchical and periodical updating mechanism to capture multi-scale
sequential patterns of user interests while supporting the evolving user
behavior logs. The experimental results over three large-scale real-world
datasets have demonstrated the advantages of our proposed model with
significant improvement in user response prediction performance against the
state-of-the-arts.Comment: SIGIR 2019. Reproducible codes and datasets:
https://github.com/alimamarankgroup/HPM
STEC: See-Through Transformer-based Encoder for CTR Prediction
Click-Through Rate (CTR) prediction holds a pivotal place in online
advertising and recommender systems since CTR prediction performance directly
influences the overall satisfaction of the users and the revenue generated by
companies. Even so, CTR prediction is still an active area of research since it
involves accurately modelling the preferences of users based on sparse and
high-dimensional features where the higher-order interactions of multiple
features can lead to different outcomes. Most CTR prediction models have relied
on a single fusion and interaction learning strategy. The few CTR prediction
models that have utilized multiple interaction modelling strategies have
treated each interaction to be self-contained. In this paper, we propose a
novel model named STEC that reaps the benefits of multiple interaction learning
approaches in a single unified architecture. Additionally, our model introduces
residual connections from different orders of interactions which boosts the
performance by allowing lower level interactions to directly affect the
predictions. Through extensive experiments on four real-world datasets, we
demonstrate that STEC outperforms existing state-of-the-art approaches for CTR
prediction thanks to its greater expressive capabilities
- …