A Bandit Approach to Maximum Inner Product Search
There has been substantial research on sub-linear time approximate algorithms
for Maximum Inner Product Search (MIPS). To achieve fast query time,
state-of-the-art techniques require significant preprocessing, which can be a
burden when the number of subsequent queries is not sufficiently large to
amortize the cost. Furthermore, existing methods do not have the ability to
directly control the suboptimality of their approximate results with
theoretical guarantees. In this paper, we propose the first approximate
algorithm for MIPS that does not require any preprocessing, and allows users to
control and bound the suboptimality of the results. We cast MIPS as a Best Arm
Identification problem, and introduce a new bandit setting that can fully
exploit the special structure of MIPS. Our approach outperforms
state-of-the-art methods on both synthetic and real-world datasets.
Comment: AAAI 201
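To make the reduction to Best Arm Identification concrete, here is a minimal sketch, not the algorithm from the paper. It assumes each data vector is an arm and that one "pull" reveals the product of the query and the vector at one randomly sampled coordinate. The elimination rule (successive halving) and the names `bandit_mips` and `init_samples` are illustrative choices, and the sketch carries none of the paper's suboptimality guarantees.

```python
import random


def bandit_mips(query, vectors, init_samples=8, seed=0):
    """Return the index of a vector with an (approximately) maximal inner product.

    Hypothetical sketch: successive halving over coordinate samples,
    with no preprocessing of `vectors`.
    """
    rng = random.Random(seed)
    dim = len(query)
    candidates = list(range(len(vectors)))
    n_samples = init_samples
    while len(candidates) > 1:
        if n_samples >= dim:
            # Sampling would cost as much as an exact product, so score
            # the surviving candidates exactly and stop.
            return max(
                candidates,
                key=lambda i: sum(q * v for q, v in zip(query, vectors[i])),
            )
        # Sample coordinates once per round and reuse them for every arm.
        coords = [rng.randrange(dim) for _ in range(n_samples)]
        estimates = {
            i: sum(query[j] * vectors[i][j] for j in coords) / n_samples
            for i in candidates
        }
        # Keep the better half of the arms, then double the sampling budget.
        candidates.sort(key=estimates.get, reverse=True)
        candidates = candidates[: max(1, len(candidates) // 2)]
        n_samples *= 2
    return candidates[0]
```

Because weak candidates are discarded after only a few coordinate samples, most vectors are never multiplied out in full, which is where the savings over an exhaustive scan would come from.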
Adapting to the Shifting Intent of Search Queries
Search engines today present results that are often oblivious to abrupt
shifts in intent. For example, the query `independence day' usually refers to a
US holiday, but the intent of this query abruptly changed during the release of
a major film by that name. While no studies exactly quantify the magnitude of
intent-shifting traffic, studies suggest that news events, seasonal topics, pop
culture, etc. account for 50% of all search queries. This paper shows that the
signals a search engine receives can be used both to determine that a shift in
intent has happened and to find a result that is now more relevant. We
present a meta-algorithm that marries a classifier with a bandit algorithm to
achieve regret that depends logarithmically on the number of query impressions,
under certain assumptions. We provide strong evidence that this regret is close
to the best achievable. Finally, via a series of experiments, we demonstrate
that our algorithm outperforms prior approaches, particularly as the amount of
intent-shifting traffic increases.
Comment: This is the full version of the paper in NIPS'0
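The idea of pairing a shift detector with a bandit can be sketched as follows. This is an illustration under stated assumptions, not the paper's meta-algorithm: the bandit is standard UCB1, the classifier is reduced to a boolean `shift_detected` flag supplied by the caller, and the class name `ResettingUCB1` is hypothetical.

```python
import math


class ResettingUCB1:
    """UCB1 over candidate results that forgets its statistics on a shift."""

    def __init__(self, n_arms):
        self.n_arms = n_arms
        self.reset()

    def reset(self):
        # Discard everything learned about the previous intent.
        self.counts = [0] * self.n_arms
        self.means = [0.0] * self.n_arms

    def select(self):
        # Show every result once before trusting the confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)

        def index(arm):
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[arm])
            return self.means[arm] + bonus

        return max(range(self.n_arms), key=index)

    def update(self, arm, reward, shift_detected=False):
        # `shift_detected` stands in for the classifier's output (assumption).
        if shift_detected:
            self.reset()
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

In use, `select()` would choose the result to display for an impression and `update()` would record the click feedback (for example 1.0 for a click, 0.0 otherwise), so that exploration restarts only when the detector fires.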
Dynamic Assortment Optimization with Changing Contextual Information
In this paper, we study the dynamic assortment optimization problem under a
finite selling season of length $T$. At each time period, the seller offers an
arriving customer an assortment of substitutable products under a cardinality
constraint, and the customer makes the purchase among offered products
according to a discrete choice model. Most existing work associates each
product with a real-valued fixed mean utility and assumes a multinomial logit
(MNL) choice model. In many practical applications, feature/contextual
information of products is readily available. In this paper, we incorporate the
feature information by assuming a linear relationship between the mean utility
and the feature. In addition, we allow the feature information of products to
change over time so that the underlying choice model can also be
non-stationary. To solve the dynamic assortment optimization under this
changing contextual MNL model, we need to simultaneously learn the underlying
unknown coefficient and make the decision on the assortment. To this end, we
develop an upper confidence bound (UCB) based policy and establish a regret
bound on the order of $\widetilde{O}(d\sqrt{T})$, where $d$ is the dimension of
the feature and $\widetilde{O}$ suppresses logarithmic dependence. We further
establish a lower bound of $\Omega(d\sqrt{T}/K)$, where $K$ is the cardinality
constraint of an offered assortment, which is usually small. When $K$ is a
constant, our policy is optimal up to logarithmic factors. In the exploitation
phase of the UCB algorithm, we need to solve a combinatorial optimization for
assortment optimization based on the learned information. We further develop an
approximation algorithm and an efficient greedy heuristic. The effectiveness of
the proposed policy is further demonstrated by our numerical studies.
Comment: 4 pages, 4 figures. Minor revision and polishing of presentatio
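A contextual MNL model with linear utilities, together with a greedy assortment search, can be sketched as below. This is a minimal illustration, not the paper's policy or its greedy heuristic. It assumes the standard MNL form with a no-purchase option of utility zero, a known coefficient vector `theta` (the paper learns it with UCB), and a per-product revenue vector `revenues` as the objective; all function names are hypothetical.

```python
import math


def mnl_probs(features, theta, assortment):
    """Purchase probability of each offered product under an MNL model."""
    weights = {
        i: math.exp(sum(x * t for x, t in zip(features[i], theta)))
        for i in assortment
    }
    denom = 1.0 + sum(weights.values())  # the 1.0 is the no-purchase option
    return {i: w / denom for i, w in weights.items()}


def expected_revenue(features, theta, revenues, assortment):
    probs = mnl_probs(features, theta, assortment)
    return sum(revenues[i] * p for i, p in probs.items())


def greedy_assortment(features, theta, revenues, max_size):
    """Add the product with the best marginal gain until no addition helps."""
    chosen = []
    remaining = list(range(len(features)))
    current = 0.0  # expected revenue of the empty assortment
    while len(chosen) < max_size and remaining:
        best = max(
            remaining,
            key=lambda i: expected_revenue(features, theta, revenues, chosen + [i]),
        )
        value = expected_revenue(features, theta, revenues, chosen + [best])
        if value <= current:
            break  # every remaining product would lower expected revenue
        chosen.append(best)
        remaining.remove(best)
        current = value
    return chosen
```

Here `max_size` plays the role of the cardinality constraint on the offered assortment. The greedy rule is only a heuristic and is not guaranteed to return the optimal assortment.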
Discovering Valuable Items from Massive Data
Suppose there is a large collection of items, each with an associated cost
and an inherent utility that is revealed only once we commit to selecting it.
Given a budget on the cumulative cost of the selected items, how can we pick a
subset of maximal value? This task generalizes several important problems such
as multi-armed bandits, active search and the knapsack problem. We present an
algorithm, GP-Select, which utilizes prior knowledge about similarity between
items, expressed as a kernel function. GP-Select uses Gaussian process
prediction to balance exploration (estimating the unknown value of items) and
exploitation (selecting items of high value). We extend GP-Select to be able to
discover sets that simultaneously have high utility and are diverse. Our
preference for diversity can be specified as an arbitrary monotone submodular
function that quantifies the diminishing returns obtained when selecting
similar items. Furthermore, we exploit the structure of the model updates to
achieve an order of magnitude (up to 40X) speedup in our experiments without
resorting to approximations. We provide strong guarantees on the performance of
GP-Select and apply it to three real-world case studies of industrial
relevance: (1) Refreshing a repository of prices in a Global Distribution
System for the travel industry, (2) Identifying diverse, binding-affine
peptides in a vaccine design task and (3) Maximizing clicks in a web-scale
recommender system by recommending items to users