Optimistic planning for the stochastic knapsack problem
The stochastic knapsack problem is a stochastic resource allocation problem that arises frequently and yet is exceptionally hard to solve. We derive and study an optimistic planning algorithm specifically designed for the stochastic knapsack problem. Unlike other optimistic planning algorithms for MDPs, our algorithm, OpStoK, avoids the use of discounting and is adaptive to the amount of resources available. We achieve this behavior by means of a concentration inequality that simultaneously applies to capacity and reward estimates. Crucially, we are able to guarantee that the aforementioned confidence regions hold collectively over all time steps by an application of Doob's inequality. We demonstrate that the method returns an ε-optimal solution to the stochastic knapsack problem with high probability. To the best of our knowledge, our algorithm is the first which provides such guarantees for the stochastic knapsack problem. Furthermore, our algorithm is an anytime algorithm and will return a good solution even if stopped prematurely. This is particularly important given the difficulty of the problem. We also provide theoretical conditions to guarantee OpStoK does not expand all policies and demonstrate favorable performance in a simple experimental setting.
Stochastic Combinatorial Optimization via Poisson Approximation
We study several stochastic combinatorial problems, including the expected
utility maximization problem, the stochastic knapsack problem and the
stochastic bin packing problem. A common technical challenge in these problems
is to optimize some function of the sum of a set of random variables. The
difficulty is mainly due to the fact that the probability distribution of the
sum is the convolution of a set of distributions, which is not an easy
objective function to work with. To tackle this difficulty, we introduce the
Poisson approximation technique. The technique is based on the Poisson
approximation theorem discovered by Le Cam, which enables us to approximate the
distribution of the sum of a set of random variables using a compound Poisson
distribution.
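For orientation, the classical Bernoulli form of Le Cam's theorem can be written as below. This is only the textbook special case, given here as an assumption about what "Poisson approximation theorem" refers to; the paper itself relies on a more general compound Poisson version that is not reproduced here. The symbols $X_i$, $p_i$, $S_n$ and $\lambda$ are notation introduced for this illustration.

```latex
% Classical Le Cam inequality (Bernoulli case).
% X_1, ..., X_n independent, with P(X_i = 1) = p_i and P(X_i = 0) = 1 - p_i.
\[
  S_n = \sum_{i=1}^{n} X_i, \qquad \lambda = \sum_{i=1}^{n} p_i ,
\]
\[
  \sum_{k=0}^{\infty}
  \left| \Pr[S_n = k] - \frac{\lambda^{k} e^{-\lambda}}{k!} \right|
  \;<\; 2 \sum_{i=1}^{n} p_i^{2} .
\]
```

The bound is small when every $p_i$ is small, which is the regime in which the convolution of many distributions can be replaced by a single Poisson-type distribution.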
We first study the expected utility maximization problem introduced recently
[Li and Deshpande, FOCS11]. For monotone and Lipschitz utility functions, we
obtain an additive PTAS if there is a multidimensional PTAS for the
multi-objective version of the problem, strictly generalizing the previous
result.
For the stochastic bin packing problem (introduced in [Kleinberg, Rabani and
Tardos, STOC97]), we show there is a polynomial time algorithm which uses at
most the optimal number of bins, if we relax the size of each bin and the
overflow probability by eps.
For stochastic knapsack, we show a 1+eps-approximation using eps extra
capacity, even when the size and reward of each item may be correlated and
cancelations of items are allowed. This generalizes the previous work [Bhalgat,
Goel and Khanna, SODA11] for the case without correlation and cancelation. Our
algorithm is also simpler. We also present a factor 2+eps approximation
algorithm for stochastic knapsack with cancelations, improving the current known
approximation factor of 8 [Gupta, Krishnaswamy, Molinaro and Ravi, FOCS11].
Comment: 42 pages, 1 figure. Preliminary version appears in the Proceedings of
the 45th ACM Symposium on the Theory of Computing (STOC13).
Approximation Algorithms for Correlated Knapsacks and Non-Martingale Bandits
In the stochastic knapsack problem, we are given a knapsack of size B, and a
set of jobs whose sizes and rewards are drawn from a known probability
distribution. However, we know the actual size and reward only when the job
completes. How should we schedule jobs to maximize the expected total reward?
We know O(1)-approximations when we assume that (i) rewards and sizes are
independent random variables, and (ii) we cannot prematurely cancel jobs. What
can we say when either or both of these assumptions are changed?
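As a minimal sketch of the problem setup just described, the Python below computes the exact expected reward of running jobs in a fixed order. It is an illustration only, not the algorithm from the paper. The job representation (a list of `(probability, size, reward)` outcomes, which allows size and reward to be correlated), the function name `expected_reward`, and the rule that a job overflowing the budget earns nothing and ends the schedule are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: one job is a list of
# (probability, size, reward) outcomes, so size and reward may be correlated.
Job = List[Tuple[float, int, float]]


def expected_reward(order: List[Job], budget: int) -> float:
    """Exact expected reward of running jobs in the fixed `order`.

    Assumed model: a job whose realized size exceeds the remaining
    budget earns nothing and ends the schedule (no cancellation).
    """
    if not order:
        return 0.0
    first, rest = order[0], order[1:]
    total = 0.0
    for prob, size, reward in first:
        if size <= budget:
            # Job completes: collect its reward and continue with the rest.
            total += prob * (reward + expected_reward(rest, budget - size))
        # Otherwise the job overflows: this branch contributes 0.
    return total


# Two toy jobs and a knapsack of size B = 3.
job_a: Job = [(0.5, 1, 1.0), (0.5, 3, 1.0)]  # random size, fixed reward
job_b: Job = [(1.0, 2, 2.0)]                 # deterministic job

value_ab = expected_reward([job_a, job_b], 3)
value_ba = expected_reward([job_b, job_a], 3)
```

In this toy instance `value_ab` is 2.0 and `value_ba` is 2.5, so the order in which jobs are scheduled changes the expected reward even though the set of jobs is the same.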
The stochastic knapsack problem is of interest in its own right, but
techniques developed for it are applicable to other stochastic packing
problems. Indeed, ideas for this problem have been useful for budgeted learning
problems, where one is given several arms which evolve in a specified
stochastic fashion with each pull, and the goal is to pull the arms a total of
B times to maximize the reward obtained. Much recent work on this problem focuses
on the case when the evolution of the arms follows a martingale, i.e., when the
expected reward from the future is the same as the reward at the current state.
What can we say when the rewards do not form a martingale?
In this paper, we give constant-factor approximation algorithms for the
stochastic knapsack problem with correlations and/or cancellations, and also
for budgeted learning problems where the martingale condition is not satisfied.
Indeed, we can show that previously proposed LP relaxations have large
integrality gaps. We propose new time-indexed LP relaxations, and convert the
fractional solutions into distributions over strategies, and then use the LP
values and the time ordering information from these strategies to devise a
randomized adaptive scheduling algorithm. We hope our LP formulation and
decomposition methods may provide a new way to address other correlated bandit
problems with more general contexts.
Structural properties of optimal coordinate-convex policies for CAC with nonlinearly-constrained feasibility regions
Necessary optimality conditions for Call Admission Control (CAC) problems with nonlinearly-constrained feasibility regions and two classes of users are derived. The policies are restricted to the class of coordinate-convex policies. Two kinds of structural properties of the optimal policies and their robustness with respect to changes of the feasibility region are investigated: 1) general properties not depending on the revenue ratio associated with the two classes of users and 2) more specific properties depending on such a ratio. The results allow one to narrow the search for the optimal policies to a suitable subset of the set of coordinate-convex policies.