79 research outputs found
Discovering Valuable Items from Massive Data
Suppose there is a large collection of items, each with an associated cost
and an inherent utility that is revealed only once we commit to selecting it.
Given a budget on the cumulative cost of the selected items, how can we pick a
subset of maximal value? This task generalizes several important problems such
as multi-arm bandits, active search and the knapsack problem. We present an
algorithm, GP-Select, which utilizes prior knowledge about similarity between
items, expressed as a kernel function. GP-Select uses Gaussian process
prediction to balance exploration (estimating the unknown value of items) and
exploitation (selecting items of high value). We extend GP-Select to be able to
discover sets that simultaneously have high utility and are diverse. Our
preference for diversity can be specified as an arbitrary monotone submodular
function that quantifies the diminishing returns obtained when selecting
similar items. Furthermore, we exploit the structure of the model updates to
achieve an order of magnitude (up to 40X) speedup in our experiments without
resorting to approximations. We provide strong guarantees on the performance of
GP-Select and apply it to three real-world case studies of industrial
relevance: (1) Refreshing a repository of prices in a Global Distribution
System for the travel industry, (2) Identifying diverse, binding-affine
peptides in a vaccine design task and (3) Maximizing clicks in a web-scale
recommender system by recommending items to users.
The Price of Information in Combinatorial Optimization
Consider a network design application where we wish to lay down a
minimum-cost spanning tree in a given graph; however, we only have stochastic
information about the edge costs. To learn the precise cost of any edge, we
have to conduct a study that incurs a price. Our goal is to find a spanning
tree while minimizing the disutility, which is the sum of the tree cost and the
total price that we spend on the studies. In a different application, each edge
gives a stochastic reward value. Our goal is to find a spanning tree while
maximizing the utility, which is the tree reward minus the prices that we pay.
Situations such as the above two often arise in practice where we wish to
find a good solution to an optimization problem, but we start with only some
partial knowledge about the parameters of the problem. The missing information
can be found only after paying a probing price, which we call the price of
information. What strategy should we adopt to optimize our expected
utility/disutility?
A classical example of the above setting is Weitzman's "Pandora's box"
problem where we are given probability distributions on values of
independent random variables. The goal is to choose a single variable with a
large value, but we can find the actual outcomes only after paying a price. Our
work is a generalization of this model to other combinatorial optimization
problems such as matching, set cover, facility location, and prize-collecting
Steiner tree. We give a technique that reduces such problems to their non-price
counterparts, and use it to design exact/approximation algorithms to optimize
our utility/disutility. Our techniques extend to situations where there are
additional constraints on what parameters can be probed or when we can
simultaneously probe a subset of the parameters.
Comment: SODA 201
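To make the disutility objective in the spanning-tree example concrete, here is a minimal sketch of the naive baseline that probes every edge and then builds a minimum spanning tree. This is not the algorithm from the paper; it only illustrates "tree cost plus total probing price". The edge format, the function name `probe_all_disutility`, and the sampler callables are assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical edge format: (u, v, probing_price, sampler returning the true cost).
Edge = Tuple[int, int, float, Callable[[random.Random], float]]


def probe_all_disutility(n: int, edges: List[Edge], rng: random.Random) -> float:
    """Probe every edge, then run Kruskal's algorithm on the revealed costs.

    Returns the disutility: spanning-tree cost plus the sum of all probing prices.
    """
    total_price = sum(price for _, _, price, _ in edges)
    # Reveal each edge cost once and sort edges by revealed cost.
    realized = sorted((draw(rng), u, v) for u, v, _, draw in edges)

    parent = list(range(n))  # union-find forest over the vertices

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree_cost, used = 0.0, 0
    for cost, u, v in realized:
        ru, rv = find(u), find(v)
        if ru != rv:  # edge joins two components, so keep it
            parent[ru] = rv
            tree_cost += cost
            used += 1
    if used != n - 1:
        raise ValueError("graph is not connected")
    return tree_cost + total_price


# Toy triangle with deterministic costs so the result is easy to check by hand.
toy_edges: List[Edge] = [
    (0, 1, 0.5, lambda r: 1.0),
    (1, 2, 0.5, lambda r: 2.0),
    (0, 2, 0.5, lambda r: 5.0),
]
toy_value = probe_all_disutility(3, toy_edges, random.Random(0))
```

In the toy triangle the tree uses the two cheapest edges (cost 3.0) and pays for three probes (1.5). A strategy that skips the probe of an edge unlikely to be useful could do better, which is the kind of saving the abstract is concerned with.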
Correlated Stochastic Knapsack with a Submodular Objective
We study the correlated stochastic knapsack problem with a submodular target function, with optional additional constraints. We utilize the multilinear extension of the submodular function, and bundle it with an adaptation of the relaxed linear constraints from Ma [Mathematics of Operations Research, Volume 43(3), 2018] on the correlated stochastic knapsack problem. The relaxation is then solved by the stochastic continuous greedy algorithm, and rounded by a novel method to fit the contention resolution scheme (Feldman et al. [FOCS 2011]). We obtain a pseudo-polynomial time $(1 - 1/\sqrt{e})/2 \approx 0.1967$ approximation algorithm with or without those additional constraints, eliminating the need for a key assumption and improving on the $(1 - 1/\sqrt[4]{e})/2 \approx 0.1106$ approximation by Fukunaga et al. [AAAI 2019].
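The two approximation constants quoted in this abstract can be checked numerically. The short sketch below assumes the ratios are read as (1 - e^(-1/2))/2 and (1 - e^(-1/4))/2, which is the reading that matches the stated decimals; the variable names are illustrative only.

```python
import math

# New ratio: (1 - 1/sqrt(e)) / 2, i.e. (1 - e^(-1/2)) / 2
ratio_new = (1 - math.exp(-1 / 2)) / 2

# Earlier ratio: (1 - 1/e^(1/4)) / 2, i.e. (1 - e^(-1/4)) / 2
ratio_old = (1 - math.exp(-1 / 4)) / 2

print(round(ratio_new, 4))  # 0.1967
print(round(ratio_old, 4))  # 0.1106
```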
Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback
This paper investigates the problem of combinatorial multiarmed bandits with
stochastic submodular (in expectation) rewards and full-bandit delayed
feedback, where the delayed feedback is assumed to be composite and anonymous.
In other words, the delayed feedback is composed of components of rewards from
past actions, with unknown division among the sub-components. Three models of
delayed feedback (bounded adversarial, stochastic independent, and stochastic
conditionally independent) are studied, and regret bounds are derived for each
of the delay models. Ignoring the problem-dependent parameters, we show that
the regret bound for all the delay models takes the same form in the time
horizon and a delay parameter that is defined differently in the three cases,
thus demonstrating an additive term in regret with delay in all three delay
models. The considered algorithm is demonstrated to outperform other
full-bandit approaches with delayed composite anonymous feedback.
Pandora's Box Problem with Order Constraints
The Pandora's Box Problem, originally formalized by Weitzman in 1979, models
selection from a set of random, alternative options, when evaluation is costly.
This includes, for example, the problem of hiring a skilled worker, where only
one hire can be made, but the evaluation of each candidate is an expensive
procedure. Weitzman showed that the Pandora's Box Problem admits an elegant,
simple solution, where the options are considered in decreasing order of
reservation value, i.e., the value that reduces to zero the expected marginal
gain for opening the box. We study for the first time this problem when order
(or precedence) constraints are imposed between the boxes. We show that,
despite the difficulty of defining reservation values for the boxes which take
into account both in-depth and in-breadth exploration of the various options,
greedy optimal strategies exist and can be efficiently computed for tree-like
order constraints. We also prove that finding approximately optimal adaptive
search strategies is NP-hard when certain matroid constraints are used to
further restrict the set of boxes which may be opened, or when the order
constraints are given as reachability constraints on a DAG. We complement the
above result by giving approximate adaptive search strategies based on a
connection between optimal adaptive strategies and non-adaptive strategies with
bounded adaptivity gap for a carefully relaxed version of the problem.
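As an illustration of the reservation-value rule described at the start of this abstract, the following sketch covers the classical, unconstrained Pandora's Box setting only, not the order-constrained variant studied in the paper. It assumes each box has a positive opening cost and a discrete value distribution, and that the outside option is worth 0; the box format and function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (opening_cost, [(value, probability), ...]).
Box = Tuple[float, Sequence[Tuple[float, float]]]


def expected_gain(outcomes: Sequence[Tuple[float, float]], threshold: float) -> float:
    """E[max(X - threshold, 0)] for a discrete distribution."""
    return sum(p * max(v - threshold, 0.0) for v, p in outcomes)


def reservation_value(cost: float, outcomes: Sequence[Tuple[float, float]],
                      tol: float = 1e-9) -> float:
    """Find sigma with E[max(X - sigma, 0)] = cost by bisection."""
    hi = max(v for v, _ in outcomes)  # expected gain is 0 at the largest value
    step = 1.0
    lo = hi - step
    # Expected gain decreases in the threshold; widen until it reaches the cost.
    while expected_gain(outcomes, lo) < cost:
        step *= 2
        lo = hi - step
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_gain(outcomes, mid) > cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def weitzman_policy(boxes: List[Box], realized: List[float]) -> Tuple[float, float]:
    """Open boxes in decreasing reservation value; stop once the best value seen
    is at least the next box's reservation value. Returns (best value, cost paid).
    Assumes an outside option of 0 (an illustrative choice)."""
    sigmas = [reservation_value(c, o) for c, o in boxes]
    order = sorted(range(len(boxes)), key=lambda i: sigmas[i], reverse=True)
    best, paid = 0.0, 0.0
    for i in order:
        if best >= sigmas[i]:
            break
        paid += boxes[i][0]
        best = max(best, realized[i])
    return best, paid


# Box 0: cost 1, value 0 or 10 with equal probability. Box 1: cost 2, value 6 for sure.
demo_boxes: List[Box] = [(1.0, [(0.0, 0.5), (10.0, 0.5)]), (2.0, [(6.0, 1.0)])]
```

For the demo boxes, the first box has reservation value 8 (since 0.5 × (10 − 8) = 1) and the second has 4, so the risky box is opened first; the second box is opened only if the first turns out to be worth 0.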