
    Approximation Algorithms for Correlated Knapsacks and Non-Martingale Bandits

    In the stochastic knapsack problem, we are given a knapsack of size B, and a set of jobs whose sizes and rewards are drawn from a known probability distribution. However, we know the actual size and reward only when the job completes. How should we schedule jobs to maximize the expected total reward? We know O(1)-approximations when we assume that (i) rewards and sizes are independent random variables, and (ii) we cannot prematurely cancel jobs. What can we say when either or both of these assumptions are changed? The stochastic knapsack problem is of interest in its own right, but techniques developed for it are applicable to other stochastic packing problems. Indeed, ideas for this problem have been useful for budgeted learning problems, where one is given several arms which evolve in a specified stochastic fashion with each pull, and the goal is to pull the arms a total of B times to maximize the reward obtained. Much recent work on this problem focuses on the case when the evolution of the arms follows a martingale, i.e., when the expected reward from the future is the same as the reward at the current state. What can we say when the rewards do not form a martingale? In this paper, we give constant-factor approximation algorithms for the stochastic knapsack problem with correlations and/or cancellations, and also for budgeted learning problems where the martingale condition is not satisfied. Indeed, we can show that previously proposed LP relaxations have large integrality gaps. We propose new time-indexed LP relaxations, and convert the fractional solutions into distributions over strategies, and then use the LP values and the time ordering information from these strategies to devise a randomized adaptive scheduling algorithm. We hope our LP formulation and decomposition methods may provide a new way to address other correlated bandit problems with more general contexts.
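
    The martingale condition mentioned in this abstract (expected future reward equals the reward at the current state) can be checked mechanically on a small example. The sketch below is only an illustration under assumed conventions: the dictionary encoding of an arm, the state names, and the function `is_martingale` are hypothetical and do not come from the paper.

```python
from math import isclose

def is_martingale(rewards, transitions, tol=1e-9):
    """Return True if, in every listed state, the expected reward after
    one pull equals the reward at the current state.

    rewards:     dict mapping state -> reward at that state (assumed encoding).
    transitions: dict mapping state -> list of (probability, next_state)
                 pairs for one pull; absorbing states may be omitted.
    """
    for state, outcomes in transitions.items():
        expected_next = sum(p * rewards[nxt] for p, nxt in outcomes)
        if not isclose(expected_next, rewards[state], abs_tol=tol):
            return False
    return True

# Hypothetical arm whose reward moves up or down symmetrically around 0.5.
fair_arm_rewards = {"start": 0.5, "up": 1.0, "down": 0.0}
fair_arm_moves = {"start": [(0.5, "up"), (0.5, "down")]}

# Hypothetical arm whose expected reward rises after a pull (0.2 -> 0.5).
improving_arm_rewards = {"start": 0.2, "up": 1.0, "down": 0.0}
improving_arm_moves = {"start": [(0.5, "up"), (0.5, "down")]}

print(is_martingale(fair_arm_rewards, fair_arm_moves))
print(is_martingale(improving_arm_rewards, improving_arm_moves))
```

    The second arm is the kind of non-martingale instance the abstract asks about: pulling it changes the expected reward rather than merely revealing information about it.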

    Algorithms and Adaptivity Gaps for Stochastic k-TSP

    Given a metric $(V,d)$ and a $\textsf{root} \in V$, the classic \textsf{k-TSP} problem is to find a tour originating at the $\textsf{root}$ of minimum length that visits at least $k$ nodes in $V$. In this work, motivated by applications where the input to an optimization problem is uncertain, we study two stochastic versions of \textsf{k-TSP}. In Stoch-Reward $k$-TSP, originally defined by Ene-Nagarajan-Saket [ENS17], each vertex $v$ in the given metric $(V,d)$ contains a stochastic reward $R_v$. The goal is to adaptively find a tour of minimum expected length that collects at least reward $k$; here "adaptively" means our next decision may depend on previous outcomes. Ene et al. give an $O(\log k)$-approximation adaptive algorithm for this problem, and left open if there is an $O(1)$-approximation algorithm. We totally resolve their open question and even give an $O(1)$-approximation \emph{non-adaptive} algorithm for this problem. We also introduce and obtain similar results for the Stoch-Cost $k$-TSP problem. In this problem each vertex $v$ has a stochastic cost $C_v$, and the goal is to visit and select at least $k$ vertices to minimize the expected \emph{sum} of tour length and cost of selected vertices. This problem generalizes the Price of Information framework [Singla18] from deterministic probing costs to metric probing costs. Our techniques are based on two crucial ideas: "repetitions" and "critical scaling". We show using Freedman's and Jogdeo-Samuels' inequalities that for our problems, if we truncate the random variables at an ideal threshold and repeat, then their expected values form a good surrogate. Unfortunately, this ideal threshold is adaptive as it depends on how far we are from achieving our target $k$, so we truncate at various different scales and identify a "critical" scale. Comment: ITCS 202
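
    To make the truncation idea in this abstract concrete, the sketch below computes the expected value of a reward capped at a threshold. It assumes that "truncate" means replacing a reward $R_v$ by $\min(R_v, \tau)$; that reading, the function name `truncated_mean`, and the example distribution are assumptions made for illustration. The repetition step and the search for a "critical" scale are not reproduced here.

```python
def truncated_mean(distribution, threshold):
    """Expected value of min(R, threshold) for a discrete random variable R.

    distribution: list of (value, probability) pairs; the probabilities
                  are assumed to sum to 1.
    """
    return sum(prob * min(value, threshold) for value, prob in distribution)

# Hypothetical vertex reward: usually nothing, occasionally a large payoff.
reward = [(0, 0.9), (100, 0.1)]

# Capping at a small threshold discounts the rare large outcome, so the
# truncated mean is much smaller than the plain mean.
capped = truncated_mean(reward, 10)
uncapped = truncated_mean(reward, 100)
print(capped, uncapped)
```

    If the remaining target is small, only the capped value is relevant, which gives some intuition for why the useful threshold depends on how far the tour is from collecting reward $k$.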

    Surrogate Assisted Optimisation for Travelling Thief Problems

    The travelling thief problem (TTP) is a multi-component optimisation problem involving two interdependent NP-hard components: the travelling salesman problem (TSP) and the knapsack problem (KP). Recent state-of-the-art TTP solvers modify the underlying TSP and KP solutions in an iterative and interleaved fashion. The TSP solution (cyclic tour) is typically changed in a deterministic way, while changes to the KP solution typically involve a random search, effectively resulting in a quasi-meandering exploration of the TTP solution space. Once a plateau is reached, the iterative search of the TTP solution space is restarted by using a new initial TSP tour. We propose to make the search more efficient through an adaptive surrogate model (based on a customised form of Support Vector Regression) that learns the characteristics of initial TSP tours that lead to good TTP solutions. The model is used to filter out non-promising initial TSP tours, in effect reducing the amount of time spent to find a good TTP solution. Experiments on a broad range of benchmark TTP instances indicate that the proposed approach filters out a considerable number of non-promising initial tours, at the cost of omitting only a small number of the best TTP solutions.

    Caching with Partial Adaptive Matching

    We study the caching problem when we are allowed to match each user to one of a subset of caches after its request is revealed. We focus on non-uniformly popular content, specifically when the file popularities obey a Zipf distribution. We study two extremal schemes, one focusing on coded server transmissions while ignoring matching capabilities, and the other focusing on adaptive matching while ignoring potential coding opportunities. We derive the rates achieved by these schemes and characterize the regimes in which one outperforms the other. We also compare them to information-theoretic outer bounds, and finally propose a hybrid scheme that generalizes ideas from the two schemes and performs at least as well as either of them in most memory regimes. Comment: 35 pages, 7 figures. Shorter versions have appeared in IEEE ISIT 2017 and IEEE ITW 201
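
    As a small illustration of the Zipf popularity model named in this abstract, the sketch below builds normalised file popularities and measures how much request probability the most popular files carry. The function names, the number of files, and the exponent are assumptions chosen for the example; the coded, matching, and hybrid schemes of the paper are not modelled.

```python
def zipf_popularities(num_files, exponent):
    """Probability of requesting each file, most popular first.

    The file of rank r (starting at 1) gets weight r ** -exponent, and
    the weights are normalised so that they sum to 1.
    """
    weights = [rank ** -exponent for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def top_m_mass(popularities, m):
    """Total request probability of the m most popular files."""
    return sum(popularities[:m])

# Hypothetical catalogue of 100 files with Zipf exponent 1.
popularities = zipf_popularities(100, 1.0)

# Share of requests that target the 10 most popular files.
print(top_m_mass(popularities, 10))
```

    With exponent 0 the same function returns the uniform distribution, which is a convenient baseline when comparing against non-uniform popularity.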

    Natural Selection of Paths in Networks

    We present a novel algorithm that exhibits natural selection of paths in a network. If each node and weighted directed edge has a unique identifier, a path in the network is defined as an ordered list of these unique identifiers. We take a population perspective and view each path as a genotype. If each node has a node phenotype then a path phenotype is defined as the list of node phenotypes in order of traversal. We show that given appropriate path traversal, weight change and structural plasticity rules, a path is a unit of evolution because it can exhibit multiplicative growth (i.e. change its probability of being traversed), and have variation and heredity. Thus, a unit of evolution need not be a spatially distinct physical individual. The total set of paths in a network consists of all possible paths from the start node to a finish node. Each path phenotype is associated with a reward that determines whether the edges of that path will be multiplicatively strengthened (or weakened). A pair-wise tournament selection algorithm is implemented which compares the reward obtained by two paths. The directed edges of the winning path are strengthened, whilst the directed edges of the losing path are weakened. Edges shared by both paths are not changed (or weakened if diversity is desired). Each time a node is activated there is a probability that the path will mutate, i.e. find an alternative route that bypasses that node. This generates the potential for a novel but correlated path with a novel but correlated phenotype. By this process the more frequently traversed paths are responsible for most of the exploration. Nodes that are inactive for some period of time are lost (which is equivalent to connections to and from them being broken). This network-based natural selection compares favourably with a standard pair-wise tournament-selection based genetic algorithm on a range of combinatorial optimization problems and continuous parametric optimization problems. The network also exhibits memory of past selective environments and can store previously discovered characters for reuse in later optimization tasks. The pathway evolution algorithm has several possible implementations and permits natural selection with unlimited heredity without template replication.
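
    The pair-wise tournament described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the network is assumed to be acyclic, edges are chosen with probability proportional to their weight, the multiplicative factor of 1.1 is arbitrary, and the mutation and node-loss rules are left out. All function names and the example network are hypothetical.

```python
import random

def sample_path(weights, start, finish, rng):
    """Walk from start to finish, picking each outgoing edge with
    probability proportional to its weight (assumes an acyclic network
    in which every node can reach finish)."""
    path = [start]
    node = start
    while node != finish:
        successors = list(weights[node])
        probs = [weights[node][s] for s in successors]
        node = rng.choices(successors, weights=probs, k=1)[0]
        path.append(node)
    return path

def path_edges(path):
    """Set of directed edges (u, v) traversed by a path."""
    return set(zip(path, path[1:]))

def tournament_step(weights, start, finish, reward, rng, factor=1.1):
    """Sample two paths, strengthen the winner's edges and weaken the
    loser's edges; edges shared by both paths are left unchanged."""
    first = sample_path(weights, start, finish, rng)
    second = sample_path(weights, start, finish, rng)
    r_first, r_second = reward(first), reward(second)
    if r_first == r_second:
        return  # no winner, so no weight change
    winner, loser = (first, second) if r_first > r_second else (second, first)
    win_edges, lose_edges = path_edges(winner), path_edges(loser)
    for u, v in win_edges - lose_edges:
        weights[u][v] *= factor
    for u, v in lose_edges - win_edges:
        weights[u][v] /= factor

# Hypothetical network with two routes from "s" to "t"; the route through
# "a" earns the higher reward.
weights = {"s": {"a": 1.0, "b": 1.0}, "a": {"t": 1.0}, "b": {"t": 1.0}}
rng = random.Random(0)
for _ in range(200):
    tournament_step(weights, "s", "t", lambda p: 1.0 if "a" in p else 0.0, rng)
print(weights["s"])
```

    Because the update is multiplicative, the better route's traversal probability grows relative to the other route, which is the sense in which a path shows multiplicative growth in the abstract.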