    Making Simulated Annealing Sample Efficient for Discrete Stochastic Optimization

    We study the regret of simulated annealing (SA) based approaches to solving discrete stochastic optimization problems. The main theoretical conclusion is that the regret of the simulated annealing algorithm, with either noisy or noiseless observations, depends primarily upon the rate of convergence of the associated Gibbs measure to the optimal states. In contrast to previous works, we show that SA does not need an increased estimation effort (number of \textit{pulls/samples} of the selected \textit{arm/solution} per round, for a finite horizon $n$) with noisy observations to converge in probability. With simple modifications, the total number of samples required for convergence (in probability) can be made to scale as $\mathcal{O}(n)$. Additionally, we show that a simulated annealing inspired heuristic can solve the problem of stochastic multi-armed bandits (MAB), in the sense that it suffers only $\mathcal{O}(\log n)$ regret. Thus, our contention is that SA should be considered a viable candidate for inclusion in the family of efficient exploration heuristics for bandit and discrete stochastic optimization problems.
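    The abstract does not give the algorithm itself, but the idea it describes can be illustrated with a minimal sketch: a Metropolis-style acceptance rule over arms, driven by empirical mean estimates and a logarithmic cooling schedule. Everything below (the arm means, the Gaussian noise, the `1/log(t+1)` temperature, and the acceptance rule) is an illustrative assumption, not the paper's exact method.

    ```python
    import math
    import random

    def sa_bandit(means, horizon, seed=0):
        """Simulated-annealing-inspired bandit heuristic (illustrative sketch).

        means   : true mean rewards per arm (used only to generate noisy rewards)
        horizon : number of rounds n
        """
        rng = random.Random(seed)
        k = len(means)
        counts = [0] * k          # pulls per arm
        est = [0.0] * k           # empirical mean reward per arm
        current = rng.randrange(k)
        regret = 0.0
        best = max(means)
        for t in range(1, horizon + 1):
            temp = 1.0 / math.log(t + 1)      # logarithmic cooling (assumed schedule)
            candidate = rng.randrange(k)      # propose a uniformly random arm
            # Metropolis acceptance on *estimated* means (maximization form):
            # always move to a seemingly better arm; move to a worse one with
            # probability exp(delta / temp), which shrinks as temp decreases.
            delta = est[candidate] - est[current]
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current = candidate
            reward = means[current] + rng.gauss(0.0, 0.1)  # single noisy pull
            counts[current] += 1
            est[current] += (reward - est[current]) / counts[current]
            regret += best - means[current]
        return regret, counts

    regret, counts = sa_bandit([0.2, 0.5, 0.9], horizon=5000)
    ```

    Because the temperature decays, acceptance of apparently suboptimal arms becomes rare over time, so pulls concentrate on the best arm; the abstract's claim is that a heuristic of this flavor achieves $\mathcal{O}(\log n)$ regret.
    
    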