18 research outputs found

    Product-based Neural Networks for User Response Prediction

    Predicting user responses, such as clicks and conversions, is of great importance and is used in many Web applications, including recommender systems, web search, and online advertising. The data in those applications is mostly categorical and contains multiple fields; a typical representation transforms it into a high-dimensional sparse binary feature vector via one-hot encoding. Faced with this extreme sparsity, traditional models may be limited to mining shallow patterns from the data, i.e. low-order feature combinations. Deep models such as deep neural networks, on the other hand, cannot be applied directly to the high-dimensional input because of the huge feature space. In this paper, we propose Product-based Neural Networks (PNN) with an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions. Our experimental results on two large-scale real-world ad click datasets demonstrate that PNNs consistently outperform the state-of-the-art models on various metrics. Comment: 6 pages, 5 figures, ICDM201
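
The three stages named in the abstract above (embedding layer, product layer, fully connected layers) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the product layer is simplified to plain pairwise inner products concatenated with the raw embeddings, the weights are random rather than trained, and every name (`embed_fields`, `inner_product_layer`, `pnn_forward`, the toy field sizes) is hypothetical.

```python
import random
from math import exp


def embed_fields(field_values, tables):
    """Pick one embedding row per field (same result as one-hot vector times table)."""
    return [tables[field][value] for field, value in enumerate(field_values)]


def inner_product_layer(embeddings):
    """Inner product of every distinct pair of field embeddings."""
    pairs = []
    for a in range(len(embeddings)):
        for b in range(a + 1, len(embeddings)):
            pairs.append(sum(x * y for x, y in zip(embeddings[a], embeddings[b])))
    return pairs


def dense(inputs, weights, biases):
    """Fully connected layer with ReLU activation."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def pnn_forward(field_values, tables, hidden_w, hidden_b, out_w, out_b):
    """Embedding lookup, pairwise products, one hidden layer, sigmoid output."""
    embeddings = embed_fields(field_values, tables)
    linear_part = [x for vec in embeddings for x in vec]   # raw embedding values
    product_part = inner_product_layer(embeddings)         # pairwise interactions
    hidden = dense(linear_part + product_part, hidden_w, hidden_b)
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return 1.0 / (1.0 + exp(-logit))


# Toy setup (assumed): 3 fields with 4, 3, and 5 categories, embedding size 2.
rng = random.Random(0)
field_sizes, dim, hidden_units = [4, 3, 5], 2, 4
tables = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
          for n in field_sizes]
n_inputs = len(field_sizes) * dim + 3  # 6 embedding values + 3 pairwise products
hidden_w = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(hidden_units)]
hidden_b = [0.0] * hidden_units
out_w = [rng.uniform(-1, 1) for _ in range(hidden_units)]

# One sample described by the active category index in each field.
click_probability = pnn_forward([2, 0, 4], tables, hidden_w, hidden_b, out_w, 0.0)
print(click_probability)
```

Indexing a table row per field avoids ever materialising the sparse one-hot vector, which is why an embedding layer makes the high-dimensional input tractable for the dense layers that follow.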

    DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

    Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or require expert feature engineering. In this paper, we show that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide & Deep model from Google, DeepFM has a shared input to its "wide" and "deep" parts, with no need for feature engineering besides raw features. Comprehensive experiments are conducted to demonstrate the effectiveness and efficiency of DeepFM over the existing models for CTR prediction, on both benchmark data and commercial data.
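
The "shared input" idea in the abstract above can be illustrated with a small sketch in which one set of field embeddings feeds both a factorization-machine term and a small dense network, and the two outputs are summed before a sigmoid. This is a hedged illustration, not the authors' code: the function names, the single hidden layer, and all numeric values are assumptions chosen for the example. The second-order term uses the standard factorization-machine identity that the sum of pairwise inner products equals half of (square of sum minus sum of squares), computed per embedding dimension.

```python
from math import exp


def fm_second_order(embeddings):
    """Sum of inner products over all distinct pairs, in time linear in the number of fields."""
    total = 0.0
    for f in range(len(embeddings[0])):
        col_sum = sum(vec[f] for vec in embeddings)
        col_sq_sum = sum(vec[f] * vec[f] for vec in embeddings)
        total += 0.5 * (col_sum * col_sum - col_sq_sum)
    return total


def deep_component(embeddings, hidden_w, hidden_b, out_w, out_b):
    """One ReLU hidden layer over the concatenated embeddings, then a linear output."""
    flat = [x for vec in embeddings for x in vec]
    hidden = [max(0.0, sum(w * x for w, x in zip(row, flat)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    return sum(w * h for w, h in zip(out_w, hidden)) + out_b


def deepfm_predict(first_order, embeddings, hidden_w, hidden_b, out_w, out_b):
    """Both components read the same embeddings; their outputs are added before the sigmoid."""
    y_fm = sum(first_order) + fm_second_order(embeddings)
    y_dnn = deep_component(embeddings, hidden_w, hidden_b, out_w, out_b)
    return 1.0 / (1.0 + exp(-(y_fm + y_dnn)))


# Toy values (assumed): 3 active features, one per field, embedding size 2.
embeddings = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]
first_order = [0.05, -0.02, 0.01]          # one weight per active feature
hidden_w = [[0.1] * 6, [-0.1] * 6]
hidden_b = [0.0, 0.0]
out_w = [0.5, -0.5]

probability = deepfm_predict(first_order, embeddings, hidden_w, hidden_b, out_w, 0.0)
print(probability)
```

Because the embeddings are shared, the low-order (FM) and high-order (dense) parts would be trained on the same raw features, which matches the abstract's claim that no feature engineering beyond raw features is needed.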

    Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction

    Click-Through Rate prediction is an important task in recommender systems, which aims to estimate the probability that a user clicks on a given item. Recently, many deep models have been proposed to learn low-order and high-order feature interactions from original features. However, since useful interactions are always sparse, it is difficult for a DNN to learn them effectively under a large number of parameters. In real scenarios, artificial features are able to improve the performance of deep models (such as Wide & Deep Learning), but feature engineering is expensive and requires domain knowledge, making it impractical in different scenarios. Therefore, it is necessary to augment the feature space automatically. In this paper, we propose a novel Feature Generation by Convolutional Neural Network (FGCNN) model with two components: Feature Generation and Deep Classifier. Feature Generation leverages the strength of CNN to generate local patterns and recombine them to generate new features. Deep Classifier adopts the structure of IPNN to learn interactions from the augmented feature space. Experimental results on three large-scale datasets show that FGCNN significantly outperforms nine state-of-the-art models. Moreover, when applying some state-of-the-art models as Deep Classifier, better performance is always achieved, showing the great compatibility of our FGCNN model. This work explores a novel direction for CTR predictions: it is quite useful to reduce the learning difficulties of DNN by automatically identifying important features.

    Ngram-LSTM Open Rate Prediction Model (NLORP) and Error_accuracy@C metric: Simple effective, and easy to implement approach to predict open rates for marketing email

    Our generation has seen an exponential increase in the adoption of digital tools. One area where digital tools have made an exponential foray is digital marketing, where goods and services have been extensively promoted through digital advertisements. Following this growth, many companies have leveraged multiple apps and channels to display their brand identities to a significantly larger user base. This has resulted in products worth billions of dollars being sold online. Emails and push notifications have become critical channels for publishing advertisement content and proactively engaging with contacts. Several marketing tools provide a user interface for marketers to design Email and Push messages for digital marketing campaigns. Marketers are also given a predicted open rate for the entered subject line. To enable marketers to generate targeted subject lines, multiple machine learning techniques have been used in the recent past. In particular, deep learning techniques have established good effectiveness and efficiency. However, these techniques require a sizable amount of labelled training data in order to get good results. The creation of such datasets, particularly those with subject lines that have a specific theme, is a challenging and time-consuming task. In this paper, we propose a novel Ngram and LSTM-based modeling approach (NLORPM) to predict open rates of entered subject lines that is easier to implement, has low prediction latency, and performs extremely well for sparse data. To assess the performance of this model, we also devise a new metric called 'Error_accuracy@C' which is simple to grasp and fully comprehensible to marketers.

    Product-Based Neural Networks for User Response Prediction

    Predicting user responses, such as clicks and conversions, is of great importance and is used in many Web applications, including recommender systems, web search, and online advertising. The data in those applications is mostly categorical and contains multiple fields; a typical representation transforms it into a high-dimensional sparse binary feature vector via one-hot encoding. Faced with this extreme sparsity, traditional models may be limited to mining shallow patterns from the data, i.e. low-order feature combinations. Deep models such as deep neural networks, on the other hand, cannot be applied directly to the high-dimensional input because of the huge feature space. In this paper, we propose Product-based Neural Networks (PNN) with an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions. Our experimental results on two large-scale real-world ad click datasets demonstrate that PNNs consistently outperform the state-of-the-art models on various metrics.

    STEC: See-Through Transformer-based Encoder for CTR Prediction

    Click-Through Rate (CTR) prediction holds a pivotal place in online advertising and recommender systems since CTR prediction performance directly influences the overall satisfaction of the users and the revenue generated by companies. Even so, CTR prediction is still an active area of research since it involves accurately modelling the preferences of users based on sparse and high-dimensional features where the higher-order interactions of multiple features can lead to different outcomes. Most CTR prediction models have relied on a single fusion and interaction learning strategy. The few CTR prediction models that have utilized multiple interaction modelling strategies have treated each interaction as self-contained. In this paper, we propose a novel model named STEC that reaps the benefits of multiple interaction learning approaches in a single unified architecture. Additionally, our model introduces residual connections from different orders of interactions, which boost performance by allowing lower-level interactions to directly affect the predictions. Through extensive experiments on four real-world datasets, we demonstrate that STEC outperforms existing state-of-the-art approaches for CTR prediction thanks to its greater expressive capabilities.