Sequential Selection of Correlated Ads by POMDPs
Online advertising has become a key source of revenue for both web search
engines and online publishers. For them, the ability to allocate the right ads to
the right webpages is critical because any mismatched ads would not only harm web
users' satisfaction but also lower the ad income. In this paper, we study how
online publishers could optimally select ads to maximize their ad incomes over
time. The conventional offline, content-based matching between webpages and ads
is a fine start but cannot solve the problem completely because good matching
does not necessarily lead to good payoff. Moreover, with the limited display
impressions, we need to balance the need of selecting ads to learn true ad
payoffs (exploration) with that of allocating ads to generate high immediate
payoffs based on the current belief (exploitation). In this paper, we address
the problem by employing partially observable Markov decision processes
(POMDPs) and discuss how to utilize the correlation of ads to improve the
efficiency of the exploration and increase ad incomes in the long run. Our
mathematical derivation shows that the belief states of correlated ads can be
naturally updated using a formula similar to collaborative filtering. To test
our model, a real world ad dataset from a major search engine is collected and
categorized. Experimenting over the data, we provide an analysis of the effect
of the underlying parameters, and demonstrate that our algorithms significantly
outperform other strong baselines.
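The exploration/exploitation trade-off described in this abstract can be illustrated with a much simpler technique than the paper's POMDP formulation. The sketch below uses Beta-Bernoulli Thompson sampling over independent ads; it does not model ad correlation, and the class and method names (`BetaBelief`, `select`, `update`, `mean`) are hypothetical names chosen for illustration only.

```python
import random


class BetaBelief:
    """Per-ad Beta(alpha, beta) belief over an unknown click probability.

    Illustrative only: ads are treated as independent, unlike the
    correlated-ad POMDP model described in the abstract.
    """

    def __init__(self, n_ads: int) -> None:
        # Beta(1, 1) is a uniform prior over each ad's click probability.
        self.alpha = [1.0] * n_ads
        self.beta = [1.0] * n_ads

    def select(self, rng: random.Random) -> int:
        # Thompson sampling: draw one plausible payoff per ad from its
        # belief and show the ad with the highest draw. Uncertain ads
        # sometimes win the draw (exploration); ads with a high believed
        # payoff win most of the time (exploitation).
        samples = [rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, ad: int, clicked: bool) -> None:
        # Bayesian update of the belief after observing one impression.
        if clicked:
            self.alpha[ad] += 1.0
        else:
            self.beta[ad] += 1.0

    def mean(self, ad: int) -> float:
        # Posterior mean click probability under the current belief.
        return self.alpha[ad] / (self.alpha[ad] + self.beta[ad])
```

A caller would alternate `select` and `update` once per impression, so that the belief sharpens as display impressions are spent.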
Intent-Aware Contextual Recommendation System
Recommender systems take inputs from user history, use an internal ranking
algorithm to generate results and possibly optimize this ranking based on
feedback. However, often the recommender system is unaware of the actual intent
of the user and simply provides recommendations dynamically without properly
understanding the thought process of the user. An intelligent recommender
system is not only useful for the user but also for businesses which want to
learn the tendencies of their users. Finding out tendencies or intents of a
user is a difficult problem to solve.
Keeping this in mind, we set out to create an intelligent system which
will keep track of the user's activity on a web-application as well as
determine the intent of the user in each session. We devised a way to encode
the user's activity through the sessions. Then, we represent the
information seen by the user in a high dimensional format which is reduced to
lower dimensions using tensor factorization techniques. The aspect of intent
awareness (or scoring) is dealt with at this stage. Finally, combining the user
activity data with the contextual information gives the recommendation score.
The final recommendations are then ranked using filtering and collaborative
recommendation techniques to show the top-k recommendations to the user. A
provision for feedback is also envisioned in the current system which informs
the model to update the various weights in the recommender system. Our overall
model aims to combine both frequency-based and context-based recommendation
systems and quantify the intent of a user to provide better recommendations.
We ran experiments on real-world timestamped user activity data, in the
setting of recommending reports to the users of a business analytics tool and
the results are better than the baselines. We also tuned certain aspects of our
model to arrive at optimized results.
Comment: Presented at the 5th International Workshop on Data Science and Big
Data Analytics (DSBDA), 17th IEEE International Conference on Data Mining
(ICDM) 2017; 8 pages; 4 figures; Due to the limitation "The abstract field
cannot be longer than 1,920 characters," the abstract appearing here is
slightly shorter than the one in the PDF file
Real-time Bidding for Online Advertising: Measurement and Analysis
Real-time bidding (RTB), also known as programmatic buying, has recently become the
fastest growing area in online advertising. Instead of bulk buying and
inventory-centric buying, RTB mimics stock exchanges and utilises computer
algorithms to automatically buy and sell ads in real time. It uses
per-impression context and targets the ads to specific people based on data about
them, and hence dramatically increases the effectiveness of display
advertising. In this paper, we provide an empirical analysis and measurement of
a production ad exchange. Using the data sampled from both demand and supply
side, we aim to provide first-hand insights into the emerging new impression
selling infrastructure and its bidding behaviours, and help identify
research and design issues in such systems. From our study, we observed that
periodic patterns occur in various statistics including impressions, clicks,
bids, and conversion rates (both post-view and post-click), which suggest
time-dependent models would be appropriate for capturing the repeated patterns
in RTB. We also found that, despite the claimed second-price auction,
first-price payments in fact account for 55.4% of total cost due to the
arrangement of the soft floor price. As such, we argue that the setting of the soft
floor price in current RTB systems puts advertisers in a less favourable
position. Furthermore, our analysis of the conversion rates shows that the
current bidding strategy is far from optimal, indicating a significant need
for optimisation algorithms that incorporate factors such as temporal
behaviours and the frequency and recency of ad displays, which have not been
well considered in the past.
Comment: Accepted by ADKDD '13 workshop
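To make the soft floor effect concrete, the following sketch shows one common arrangement, assumed here rather than taken from the paper: a winning bid at or above the soft floor pays the larger of the runner-up bid and the soft floor, while a winning bid between the hard floor and the soft floor pays its own bid. The function name `clear_auction` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def clear_auction(
    bids: List[float], hard_floor: float, soft_floor: float
) -> Optional[Tuple[int, float, str]]:
    """Return (winner index, price paid, pricing rule), or None if unsold.

    Assumed rules (illustrative, not from the paper):
      * bids below the hard floor are rejected;
      * a top bid below the soft floor pays its own bid (first price);
      * otherwise the winner pays max(runner-up bid, soft floor).
    """
    eligible = [(bid, i) for i, bid in enumerate(bids) if bid >= hard_floor]
    if not eligible:
        return None
    # Highest bid first; ties are broken by the lower bidder index.
    eligible.sort(key=lambda pair: (-pair[0], pair[1]))
    top_bid, winner = eligible[0]
    if top_bid < soft_floor:
        return winner, top_bid, "first-price"
    runner_up = eligible[1][0] if len(eligible) > 1 else hard_floor
    return winner, max(runner_up, soft_floor), "second-price"
```

Under these assumed rules, the higher the soft floor sits relative to typical bids, the more auctions clear at the winner's own bid, which is the kind of effect the abstract attributes to the soft floor.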
A Deep Recurrent Survival Model for Unbiased Ranking
Position bias is a critical problem in information retrieval when dealing with implicit yet biased user feedback data. Unbiased ranking methods typically rely on causality models and debias the user feedback through inverse propensity weighting. While practical, these methods still suffer from two major problems. First, when inferring a user click, the impact of the contextual information, such as documents that have been examined, is often ignored. Second, only the position bias is considered, while other issues resulting from user browsing behaviors are overlooked. In this paper, we propose an end-to-end Deep Recurrent Survival Ranking (DRSR), a unified framework to jointly model a user's various behaviors, to (i) consider the rich contextual information in the ranking list; and (ii) address the hidden issues underlying user behaviors, i.e., to mine observation patterns in queries without any click (non-click queries), and to model tracking logs which cannot truly reflect the user's browsing intents (untrusted observation). Specifically, we adopt a recurrent neural network to model the contextual information and estimate the conditional likelihood of user feedback at each position. We then incorporate survival analysis techniques with the probability chain rule to mathematically recover the unbiased joint probability of one user's various behaviors. DRSR can be easily incorporated with both point-wise and pair-wise learning objectives. Extensive experiments over two large-scale industrial datasets demonstrate the significant performance gains of our model compared with the state of the art.
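The inverse propensity weighting baseline mentioned in this abstract can be sketched as follows. This illustrates the generic IPW idea only, not DRSR itself; the function name `ipw_click_estimate` is hypothetical, and the examination propensities per position are assumed to be known in advance.

```python
from typing import Sequence


def ipw_click_estimate(
    clicks: Sequence[int],
    positions: Sequence[int],
    propensity: Sequence[float],
) -> float:
    """Debiased average click signal for one document.

    clicks[k] is 1 if the document was clicked in impression k, else 0.
    positions[k] is the zero-based rank at which it was shown.
    propensity[p] is the assumed probability that rank p is examined.
    """
    if not clicks:
        raise ValueError("at least one impression is required")
    total = 0.0
    for click, pos in zip(clicks, positions):
        # A click at a rarely examined rank counts for more, because
        # many comparable impressions there went unseen.
        total += click / propensity[pos]
    return total / len(clicks)
```

For example, a single click at a rank examined 25% of the time contributes as much as four clicks at a rank that is always examined.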
Strongly Constrained Discrete Hashing
Learning to hash is a fundamental technique widely used in large-scale image retrieval. Most existing methods for learning to hash address the involved discrete optimization problem by continuous relaxation of the binary constraint, which usually leads to large quantization errors and consequently suboptimal binary codes. A few discrete hashing methods have emerged recently. However, they either completely ignore some useful constraints (specifically the balance and decorrelation of hash bits) or just turn those constraints into regularizers, which makes the optimization easier but less accurate. In this paper, we propose a novel supervised hashing method named Strongly Constrained Discrete Hashing (SCDH) which overcomes such limitations. It can learn the binary codes for all examples in the training set, and meanwhile obtain a hash function for unseen samples with the above-mentioned constraints preserved. Although the model of SCDH is fairly sophisticated, we are able to find closed-form solutions to all of its optimization subproblems and thus design an efficient algorithm that converges quickly. In addition, we extend SCDH to a kernelized version, SCDH_K. Our experiments on three large benchmark datasets have demonstrated that SCDH and SCDH_K not only achieve substantially higher MAP scores than state-of-the-art baselines, but also train much faster than the baselines that are likewise supervised.
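The balance and decorrelation constraints named in this abstract are usually stated for codes in {-1, +1}: each bit should be +1 for half of the examples, and distinct bits should be uncorrelated across examples. The sketch below only checks these two conditions on a given code matrix under that standard reading; it is not the SCDH optimization, and the function name `check_hash_constraints` is hypothetical.

```python
from typing import List, Tuple


def check_hash_constraints(codes: List[List[int]]) -> Tuple[bool, bool]:
    """Return (balanced, decorrelated) for an n x r matrix of +1/-1 codes.

    balanced:     every bit (column) sums to zero over the n examples.
    decorrelated: every pair of distinct bits has zero inner product
                  over the n examples.
    """
    n_bits = len(codes[0])
    balanced = all(
        sum(row[k] for row in codes) == 0 for k in range(n_bits)
    )
    decorrelated = all(
        sum(row[j] * row[k] for row in codes) == 0
        for j in range(n_bits)
        for k in range(j + 1, n_bits)
    )
    return balanced, decorrelated
```

With four examples and two bits, the codes (1, 1), (1, -1), (-1, 1), (-1, -1) satisfy both conditions, whereas repeating the same bit twice is balanced but fully correlated.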
Deep Character-Level Click-Through Rate Prediction for Sponsored Search
Predicting the click-through rate of an advertisement is a critical component
of online advertising platforms. In sponsored search, the click-through rate
estimates the probability that a displayed advertisement is clicked by a user
after she submits a query to the search engine. Commercial search engines
typically rely on machine learning models trained with a large number of
features to make such predictions. This inevitably requires a lot of
engineering effort to define, compute, and select the appropriate features. In
this paper, we propose two novel approaches (one working at character level and
the other working at word level) that use deep convolutional neural networks to
predict the click-through rate of a query-advertisement pair. Specifically, the
proposed architectures only consider the textual content appearing in a
query-advertisement pair as input, and produce as output a click-through rate
prediction. By comparing the character-level model with the word-level model,
we show that language representation can be learnt from scratch at character
level when trained on enough data. Through extensive experiments using billions
of query-advertisement pairs of a popular commercial search engine, we
demonstrate that both approaches significantly outperform a baseline model
built on well-selected text features and a state-of-the-art word2vec-based
approach. Finally, by combining the predictions of the deep models introduced
in this study with the prediction of the model in production of the same
commercial search engine, we significantly improve the accuracy and the
calibration of the click-through rate prediction of the production system.
Comment: SIGIR2017, 10 pages