Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising
Sponsored search represents a major source of revenue for web search engines.
This popular advertising model brings a unique possibility for advertisers to
target users' immediate intent communicated through a search query, usually by
displaying their ads alongside organic search results for queries deemed
relevant to their products or services. However, due to a large number of
unique queries it is challenging for advertisers to identify all such relevant
queries. For this reason search engines often provide a service of advanced
matching, which automatically finds additional relevant queries for advertisers
to bid on. We present a novel advanced matching approach based on the idea of
semantic embeddings of queries and ads. The embeddings were learned using a
large data set of user search sessions, consisting of search queries, clicked
ads and search links, while utilizing contextual information such as dwell time
and skipped ads. To address the large-scale nature of our problem, both in
terms of data and vocabulary size, we propose a novel distributed algorithm for
training of the embeddings. Finally, we present an approach for overcoming a
cold-start problem associated with new ads and queries. We report results of
editorial evaluation and online tests on actual search traffic. The results
show that our approach significantly outperforms baselines in terms of
relevance, coverage, and incremental revenue. Lastly, we open-source learned
query embeddings to be used by researchers in computational advertising and
related fields. Comment: 10 pages, 4 figures, 39th International ACM SIGIR Conference on
Research and Development in Information Retrieval, SIGIR 2016, Pisa, Italy
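As a rough illustration of the matching step described in the abstract above, the sketch below ranks ads for a query by similarity in a shared embedding space. The abstract does not say which similarity measure is used; cosine similarity, the toy three-dimensional vectors, and the names `cosine`, `match_ads`, and the ad identifiers are all assumptions made for this example, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def match_ads(query_vec, ad_vecs, top_k=2):
    """Return the top_k (ad_id, score) pairs, highest similarity first."""
    scored = [(ad_id, cosine(query_vec, vec)) for ad_id, vec in ad_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical embeddings; real ones would be learned from search sessions.
ad_vecs = {
    "ad_shoes": [0.9, 0.1, 0.0],
    "ad_flights": [0.0, 0.2, 0.9],
    "ad_sneakers": [0.8, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]
top = match_ads(query_vec, ad_vecs)
```

With these toy vectors the two footwear ads rank above the flight ad, which is the kind of additional relevant match that advanced matching is meant to surface.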
You Must Have Clicked on this Ad by Mistake! Data-Driven Identification of Accidental Clicks on Mobile Ads with Applications to Advertiser Cost Discounting and Click-Through Rate Prediction
In the cost per click (CPC) pricing model, an advertiser pays an ad network
only when a user clicks on an ad; in turn, the ad network gives a share of that
revenue to the publisher where the ad was impressed. Still, advertisers may be
unsatisfied with ad networks charging them for "valueless" clicks, or so-called
accidental clicks. [...] Charging advertisers for such clicks is detrimental in
the long term as the advertiser may decide to run their campaigns on other ad
networks. In addition, machine-learned click models trained to predict which ad
will bring the highest revenue may overestimate an ad's click-through rate and,
as a consequence, negatively impact revenue for both the ad network and the
publisher. In this work, we propose a data-driven method to detect accidental
clicks from the perspective of the ad network. We collect observations of time
spent by users on a large set of ad landing pages - i.e., dwell time. We notice
that the majority of per-ad distributions of dwell time fit a mixture of
distributions, where each component may correspond to a particular type of
click, the first one being accidental. We then estimate dwell time thresholds
of accidental clicks from that component. Using our method to identify
accidental clicks, we then propose a technique that smoothly discounts the
advertiser's cost of accidental clicks at billing time. Experiments conducted
on a large dataset of ads served on Yahoo mobile apps confirm that our
thresholds are stable over time, and revenue loss in the short term is
marginal. We also compare the performance of an existing machine-learned click
model trained on all ad clicks with that of the same model trained only on
non-accidental clicks. There, we observe an increase in both ad click-through
rate (+3.9%) and revenue (+0.2%) on ads served by the Yahoo Gemini network when
using the latter. [...
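The dwell-time mixture idea in the abstract above can be sketched as follows. This is a minimal example under stated assumptions, not the authors' method: it assumes exactly two Gaussian components on log dwell time, fits them with plain EM, and discounts the click cost by the posterior probability of the first (short-dwell) component. The abstract does not give the component family, the number of components, or the discount formula, and the function names and toy data are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def fit_two_gaussians(xs, iters=100):
    """Fit a two-component 1-D Gaussian mixture with EM (assumed model)."""
    xs = sorted(xs)
    half = len(xs) // 2
    # Initialise component 0 on the lower half and component 1 on the upper half.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            weight[k] = nk / len(xs)
    return weight, mu, sigma

def accidental_probability(dwell_seconds, weight, mu, sigma):
    """Posterior probability that a click belongs to the short-dwell component."""
    x = math.log(dwell_seconds)
    log_p = [math.log(weight[k]) - math.log(sigma[k])
             - 0.5 * ((x - mu[k]) / sigma[k]) ** 2 for k in range(2)]
    diff = max(min(log_p[1] - log_p[0], 700.0), -700.0)  # clamp to keep exp finite
    return 1.0 / (1.0 + math.exp(diff))

def discounted_cost(cpc, dwell_seconds, model):
    """Assumed smooth discount: charge in proportion to the non-accidental probability."""
    return cpc * (1.0 - accidental_probability(dwell_seconds, *model))

# Toy dwell times in seconds for one hypothetical ad: a short cluster and a long one.
dwell = [1, 1.5, 2, 2.5, 1.2, 1.8, 30, 45, 60, 40, 55, 35, 50]
model = fit_two_gaussians([math.log(d) for d in dwell])
```

A hard threshold could then be read off as the dwell time at which `accidental_probability` crosses 0.5, although the paper may define its thresholds differently.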
Review of Intent Diversity in Information Retrieval: Approaches, Models and Trends
The rapidly growing volume of information in databases makes it difficult for users to find the information they need, and it is important for researchers to find the best method for addressing this problem. User intention detection can be used to increase the relevance of the information delivered by an information retrieval system. This research used a systematic mapping process to identify which areas, approaches, and models were most used to detect user intention in information retrieval over the last four years. The results show that the item-based approach is still the approach most studied by researchers for identifying intent diversity in information retrieval, and its use continued to increase from 2015 to 2017. Topic models were used in 34% of the papers, which indicates that topic models remain an important class of models explored by researchers in this area.
Keyword Targeting Optimization in Sponsored Search Advertising: Combining Selection and Matching
In sponsored search advertising (SSA), advertisers need to select keywords
and determine matching types for selected keywords simultaneously, i.e.,
keyword targeting. An optimal keyword targeting strategy guarantees reaching
the right population effectively. This paper aims to address the keyword
targeting problem, which is a challenging task because of the incomplete
information of historical advertising performance indices and the high
uncertainty in SSA environments. First, we construct a data distribution
estimation model and apply a Markov Chain Monte Carlo method to make inference
about unobserved indices (i.e., impression and click-through rate) over three
keyword matching types (i.e., broad, phrase and exact). Second, we formulate a
stochastic keyword targeting model (BB-KSM) combining operations of keyword
selection and keyword matching to maximize the expected profit under the chance
constraint of the budget, and develop a branch-and-bound algorithm
incorporating a stochastic simulation process for our keyword targeting model.
Finally, based on a real-world dataset collected from field reports and logs of
past SSA campaigns, computational experiments are conducted to evaluate the
performance of our keyword targeting strategy. Experimental results show that
(a) BB-KSM outperforms seven baselines in terms of profit; (b) BB-KSM shows its
superiority as the budget increases, especially in situations with more
keywords and keyword combinations; (c) the proposed data distribution
estimation approach can effectively address the problem of incomplete
performance indices over the three matching types and in turn significantly
promote the performance of keyword targeting decisions. This research makes
important contributions to the SSA literature and the results offer critical
insights into keyword management for SSA advertisers. Comment: 38 pages, 4 figures, 5 tables
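To make the budget chance constraint in the abstract above concrete, the sketch below estimates, by stochastic simulation, the expected profit of one candidate keyword plan and the probability that its cost stays within budget. It covers only that evaluation step, not the MCMC inference or the branch-and-bound search. The Gaussian click model, the field names (`mean_clicks`, `sd_clicks`, `cpc`, `value_per_click`), and the 0.9 confidence level are assumptions for illustration and do not come from the paper.

```python
import random

def simulate_plan(keywords, budget, n_draws=2000, seed=0):
    """Monte Carlo estimate of (expected profit, P(cost <= budget)) for a keyword plan."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    total_profit = 0.0
    within_budget = 0
    for _ in range(n_draws):
        cost = 0.0
        profit = 0.0
        for kw in keywords:
            # Assumed demand model: clicks are Gaussian, truncated at zero.
            clicks = max(0.0, rng.gauss(kw["mean_clicks"], kw["sd_clicks"]))
            cost += clicks * kw["cpc"]
            profit += clicks * (kw["value_per_click"] - kw["cpc"])
        total_profit += profit
        within_budget += cost <= budget
    return total_profit / n_draws, within_budget / n_draws

def is_feasible(keywords, budget, confidence=0.9):
    """A plan satisfies the chance constraint if its cost fits the budget often enough."""
    _, prob = simulate_plan(keywords, budget)
    return prob >= confidence

# Hypothetical plan with one deterministic keyword (sd_clicks = 0) for easy checking.
plan = [{"mean_clicks": 100.0, "sd_clicks": 0.0, "cpc": 0.5, "value_per_click": 2.0}]
expected_profit, prob_ok = simulate_plan(plan, budget=60.0)
```

A search procedure such as branch-and-bound would call an evaluation like this on each candidate plan and discard those for which `is_feasible` is false.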
Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising
Sponsored search represents a major source of revenue for web search engines. The advertising model brings a unique possibility for advertisers to target direct user intent communicated through a search query, usually by displaying their ads alongside organic search results for queries deemed relevant to their products or services. However, due to a large number of unique queries, it is particularly challenging for advertisers to identify all relevant queries. For this reason search engines often provide a service of advanced matching, which automatically finds additional relevant queries for advertisers to bid on. We present a novel advanced matching approach based on the idea of semantic embeddings of queries and ads. The embeddings were learned using a large data set of user search sessions, consisting of search queries, clicked ads and search links, while utilizing contextual information such as dwell time and skipped ads. To address the large-scale nature of our problem, both in terms of data and vocabulary size, we propose a novel distributed algorithm for training of the embeddings. Finally, we present an approach for overcoming a cold-start problem associated with new ads and queries. We report results of editorial evaluation and online tests on actual search traffic. The results show that our approach significantly outperforms baselines in terms of relevance, coverage and incremental revenue. Lastly, as part of this study, we open-sourced query embeddings that can be used to advance the field.