Product-based Neural Networks for User Response Prediction
Predicting user responses, such as clicks and conversions, is of great
importance and has found its usage in many Web applications including
recommender systems, web search and online advertising. The data in those
applications is mostly categorical and contains multiple fields; a typical
approach is to transform it into a high-dimensional sparse binary feature
representation via one-hot encoding. Faced with this extreme sparsity,
traditional models are often limited to mining shallow patterns from the
data, i.e. low-order feature combinations. Deep models like deep neural
networks, on the other hand, cannot be directly applied to the
high-dimensional input because of the huge feature space. In this paper, we
propose Product-based Neural Networks (PNNs) with an embedding layer to learn
a distributed representation of the categorical data, a product layer to
capture interactive patterns between inter-field categories, and further fully
connected layers to explore high-order feature interactions. Our experimental
results on two large-scale real-world ad click datasets demonstrate that PNNs
consistently outperform the state-of-the-art models on various metrics.
Comment: 6 pages, 5 figures, ICDM201
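
The embedding and product layers described in this abstract can be illustrated with a minimal sketch. It assumes the product operation is a pairwise inner product between field embeddings, which the abstract itself does not specify; the function names, the toy embedding tables, and the chosen category indices are all hypothetical.

```python
from itertools import combinations


def embed_fields(active_indices, embedding_tables):
    """Return one embedding vector per field.

    Multiplying a one-hot field by its embedding table selects a single
    row, so the lookup is done by index instead of a sparse product.
    """
    return [table[idx] for idx, table in zip(active_indices, embedding_tables)]


def inner_product_layer(field_vectors):
    """Pairwise inner products between field embeddings (assumed product op)."""
    return [
        sum(a * b for a, b in zip(u, v))
        for u, v in combinations(field_vectors, 2)
    ]


# Hypothetical toy data: three categorical fields, embedding dimension 2.
tables = [
    [[1.0, 0.0], [0.0, 1.0]],               # field 0: 2 categories
    [[0.5, 0.5], [1.0, -1.0], [2.0, 0.0]],  # field 1: 3 categories
    [[0.0, 2.0], [1.0, 1.0]],               # field 2: 2 categories
]

# One active category per field, i.e. the position of the 1 in each one-hot block.
vectors = embed_fields([1, 2, 0], tables)
products = inner_product_layer(vectors)
print(vectors)   # [[0.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
print(products)  # [0.0, 2.0, 0.0]
```

In a full model, the embeddings and the pairwise products would then be fed into the fully connected layers that the abstract credits with exploring high-order interactions.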
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Learning sophisticated feature interactions behind user behaviors is critical
to maximizing CTR for recommender systems. Despite great progress, existing
methods seem to have a strong bias towards low- or high-order interactions, or
require expert feature engineering. In this paper, we show that it is
possible to derive an end-to-end learning model that emphasizes both low- and
high-order feature interactions. The proposed model, DeepFM, combines the power
of factorization machines for recommendation and deep learning for feature
learning in a new neural network architecture. Compared to the latest Wide &
Deep model from Google, DeepFM has a shared input to its "wide" and "deep"
parts, with no need for feature engineering beyond raw features. Comprehensive
experiments are conducted to demonstrate the effectiveness and efficiency of
DeepFM over the existing models for CTR prediction, on both benchmark data and
commercial data.
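
The idea of a factorization-machine part and a deep part sharing one input can be sketched as follows. This is a simplified illustration, not the architecture from the paper: it omits first-order (linear) terms and biases, uses a single hidden layer with ReLU, and all names and weights are hypothetical.

```python
import math


def fm_second_order(vectors):
    """Sum of pairwise inner products of the field embeddings.

    Uses the standard identity
    sum_{i<j} <v_i, v_j> = 0.5 * sum_f ((sum_i v_if)^2 - sum_i v_if^2),
    which avoids enumerating all pairs.
    """
    dim = len(vectors[0])
    total = 0.0
    for f in range(dim):
        s = sum(v[f] for v in vectors)
        sq = sum(v[f] * v[f] for v in vectors)
        total += 0.5 * (s * s - sq)
    return total


def deep_part(vectors, hidden_weights, output_weights):
    """One hidden ReLU layer over the concatenated embeddings (assumed depth)."""
    flat = [x for v in vectors for x in v]
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, flat)))
        for row in hidden_weights
    ]
    return sum(w * h for w, h in zip(output_weights, hidden))


def predict_ctr(vectors, hidden_weights, output_weights):
    """Both parts read the same embeddings; their outputs are summed."""
    logit = fm_second_order(vectors) + deep_part(
        vectors, hidden_weights, output_weights
    )
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy input: three field embeddings of dimension 2.
vectors = [[0.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
hidden_weights = [[0.1] * 6, [-0.1] * 6]
output_weights = [1.0, 1.0]
probability = predict_ctr(vectors, hidden_weights, output_weights)
print(round(probability, 3))
```

Because both functions take the same `vectors`, no separate hand-crafted features are needed for either part, which is the property the abstract contrasts with Wide & Deep.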
Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction
Click-Through Rate prediction is an important task in recommender systems,
which aims to estimate the probability that a user clicks on a given item.
Recently, many deep models have been proposed to learn low-order and high-order
feature interactions from original features. However, since useful interactions
are always sparse, it is difficult for a DNN to learn them effectively under a
large number of parameters. In real scenarios, artificial features are able to
improve the performance of deep models (such as Wide & Deep Learning), but
feature engineering is expensive and requires domain knowledge, making it
impractical in different scenarios. Therefore, it is necessary to augment
the feature space automatically. In this paper, we propose a novel Feature
Generation by Convolutional Neural Network (FGCNN) model with two components:
Feature Generation and Deep Classifier. Feature Generation leverages the
strength of CNN to generate local patterns and recombine them to generate new
features. Deep Classifier adopts the structure of IPNN to learn interactions
from the augmented feature space. Experimental results on three large-scale
datasets show that FGCNN significantly outperforms nine state-of-the-art
models. Moreover, when applying some state-of-the-art models as Deep
Classifier, better performance is always achieved, showing the great
compatibility of our FGCNN model. This work explores a novel direction for CTR
prediction: it is quite useful to reduce the learning difficulties of DNNs by
automatically identifying important features.
Learning from Multi-View Multi-Way Data via Structural Factorization Machines
Real-world relations among entities can often be observed and determined from
different perspectives/views. For example, the decision made by a user on
whether to adopt an item relies on multiple aspects such as the contextual
information of the decision, the item's attributes, the user's profile and the
reviews given by other users. Different views may exhibit multi-way
interactions among entities and provide complementary information. In this
paper, we introduce a multi-tensor-based approach that can preserve the
underlying structure of multi-view data in a generic predictive model.
Specifically, we propose structural factorization machines (SFMs) that learn
the common latent spaces shared by multi-view tensors and automatically adjust
the importance of each view in the predictive model. Furthermore, the
complexity of SFMs is linear in the number of parameters, which makes SFMs
suitable for large-scale problems. Extensive experiments on real-world datasets
demonstrate that the proposed SFMs outperform several state-of-the-art methods
in terms of prediction accuracy and computational cost.
Comment: 10 page