15 research outputs found

    Slowest and fastest coupon collectors

    In the coupon collector's problem, every cereal box contains one coupon from a collection of n distinct coupons, each equally likely to appear. The goal is to find the expected number of boxes a player needs to purchase to complete the whole collection. In this work, we extend the classical problem to k players and find the expected number of boxes required for the slowest and fastest players to complete the whole collection. The probability that a particular player is the slowest or fastest to finish is also touched upon.
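
    For reference, the classical single-player answer is n*H_n = n(1 + 1/2 + ... + 1/n) expected boxes; the slowest and fastest players are then the maximum and minimum of k independent completion times. The Monte Carlo sketch below (an illustration of the quantities studied, not the paper's closed-form results) estimates both:

        import random

        def boxes_to_complete(n):
            """Draw uniform coupons until all n types are seen; return the box count."""
            seen, count = set(), 0
            while len(seen) < n:
                seen.add(random.randrange(n))
                count += 1
            return count

        def estimate(n, k, trials=10_000):
            """Estimate E[slowest] and E[fastest] over k independent collectors."""
            slow = fast = 0.0
            for _ in range(trials):
                times = [boxes_to_complete(n) for _ in range(k)]
                slow += max(times)
                fast += min(times)
            return slow / trials, fast / trials

        if __name__ == "__main__":
            e_slow, e_fast = estimate(n=10, k=3)
            # For n = 10, a single collector needs n*H_n ~ 29.29 boxes on average.
            print(f"slowest ~ {e_slow:.1f}, fastest ~ {e_fast:.1f}")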

    Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization

    We focus on kernel methods for set-valued inputs and their application to Bayesian set optimization, notably combinatorial optimization. We investigate two classes of set kernels that both rely on Reproducing Kernel Hilbert Space embeddings, namely the "Double Sum" (DS) kernels recently considered in Bayesian set optimization, and a class introduced here called "Deep Embedding" (DE) kernels, which essentially consist of applying a radial kernel on Hilbert space on top of the canonical distance induced by another kernel, such as a DS kernel. We establish in particular that while DS kernels typically suffer from a lack of strict positive definiteness, vast subclasses of DE kernels built upon DS kernels do possess this property, in turn enabling combinatorial optimization without requiring the introduction of a jitter parameter. Proofs of theoretical results about the considered kernels are complemented by a few practicalities regarding hyperparameter fitting. We furthermore demonstrate the applicability of our approach in prediction and optimization tasks, relying both on toy examples and on two test cases from mechanical engineering and hydrogeology, respectively. Experimental results highlight the applicability and comparative merits of the considered approaches while opening new perspectives in prediction and sequential design with set inputs.
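
    A minimal sketch of the two constructions as described in the abstract, assuming a Gaussian base kernel and mean (arithmetic-average) RKHS embeddings; the function names and the Gaussian radial form on Hilbert space are illustrative choices, not the paper's exact parameterization:

        import numpy as np

        def rbf(x, y, ls=1.0):
            """Base kernel k(x, y) on individual points (Gaussian/RBF), assumed here."""
            return np.exp(-np.sum((x - y) ** 2) / (2 * ls ** 2))

        def ds_kernel(A, B, ls=1.0):
            """'Double Sum' kernel: average of k(x, y) over all pairs x in A, y in B,
            i.e. the inner product of the mean RKHS embeddings of the two sets."""
            return np.mean([rbf(x, y, ls) for x in A for y in B])

        def de_kernel(A, B, ls=1.0, theta=1.0):
            """'Deep Embedding' kernel: a radial (here Gaussian) kernel on Hilbert space,
            applied to the canonical distance induced by the DS kernel:
            d(A, B)^2 = k(A, A) - 2 k(A, B) + k(B, B)."""
            d2 = ds_kernel(A, A, ls) - 2 * ds_kernel(A, B, ls) + ds_kernel(B, B, ls)
            return np.exp(-max(d2, 0.0) / (2 * theta ** 2))

        A = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
        B = [np.array([0.5, 0.5])]
        print(de_kernel(A, B))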

    No-feedback Card Guessing Game: Moments and distributions under the optimal strategy

    Relying on the optimal guessing strategy recently found for a no-feedback card guessing game with k-time riffle shuffles, we derive an exact, closed-form formula for the expected number of correct guesses and higher moments for the 1-time shuffle case. Our approach makes use of a fast generating function based on a recurrence relation, the method of overlapping stages, and interpolation. For k > 1-time shuffles, we establish the expected number of correct guesses through a self-contained combinatorial proof. The proof turns out to be the answer to an open problem listed in Krityakierne and Thanatipanonda (2022), asking for a combinatorial interpretation of a generating function object introduced therein.
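
    The optimal strategy itself is not reproduced here; the sketch below merely sets up the game under the standard Gilbert-Shannon-Reeds (GSR) riffle-shuffle model and scores a naive no-feedback guess sequence (the pre-shuffle order) as a baseline into which other strategies could be swapped:

        import random

        def gsr_riffle(deck):
            """One Gilbert-Shannon-Reeds riffle shuffle: cut at a Binomial(n, 1/2)
            position, then interleave the two packets uniformly at random."""
            n = len(deck)
            cut = sum(random.random() < 0.5 for _ in range(n))
            left, right = deck[:cut], deck[cut:]
            out, i, j = [], 0, 0
            while i < len(left) or j < len(right):
                a, b = len(left) - i, len(right) - j
                if random.random() < a / (a + b):   # drop from left w.p. a/(a+b)
                    out.append(left[i]); i += 1
                else:
                    out.append(right[j]); j += 1
            return out

        def correct_guesses(n=52, k=1, trials=5_000):
            """Average number of correct no-feedback guesses when the guess
            sequence is simply the pre-shuffle order (a naive baseline)."""
            total = 0
            for _ in range(trials):
                deck = list(range(n))
                for _ in range(k):
                    deck = gsr_riffle(deck)
                total += sum(deck[pos] == pos for pos in range(n))
            return total / trials

        print(correct_guesses())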

    Global Optimization Of Computationally Expensive Blackbox Problems Using Radial Basis Functions

    Three derivative-free global optimization methods are developed based on radial basis functions (RBFs) for computationally expensive blackbox simulation models. First, we develop a multistart global optimization method, called SOMS (SurrOgate MultiStart). SOMS uses an RBF surrogate model to approximate the objective function in order to reduce the number of function evaluations necessary to identify the most promising points from which each nonlinear programming local search is started. We show that SOMS detects any local minimum within a finite number of iterations almost surely. The numerical results show that SOMS performs favorably in comparison to alternative methods and that the surrogate approach saves a significant number of computationally expensive function evaluations.

    In the second part of this work, we introduce PADS (PArallel Dynamic coordinate search with Surrogates), a surrogate-based global optimization framework for high-dimensional expensive blackbox functions. In each parallel iteration of PADS, multiple points are selected from a large set of candidate points generated by perturbing only a subset of the coordinates of the current best solution. The selected points are then evaluated in parallel with up to 16 parallel processors. We show that PADS converges to the global optimum with probability 1. We develop two versions, PADS1 and PADS2, which use different underlying distributions to generate candidate points. We show that PADS1 and PADS2 find better solutions more efficiently than alternative methods, with PADS1 performing even better than PADS2 on problems of up to 200 dimensions.

    In the final part of this dissertation, we develop an effective new parallel surrogate global optimization method called SOP (Surrogate Optimization with Pareto center selection). The search mechanism of SOP incorporates bi-objective optimization, tabu search, and surrogate-assisted local search, which exploit the information from the already evaluated points to select a large number of new evaluation points. The newly selected points are evaluated in parallel, and hence a significant reduction in wall-clock time can be achieved. We give sufficient conditions for almost sure convergence of SOP. The results of our numerical experiments show that SOP performs very well compared to alternative parallel surrogate model algorithms with 8 and 32 processors, obtaining superlinear speedup on some test problems.
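
    A minimal sketch of the shared surrogate mechanism (an illustration under assumed details, not the dissertation's code): fit an RBF surrogate to the evaluated points, generate candidates by perturbing a random subset of coordinates of the current best point, in the spirit of PADS-style dynamic coordinate search, and spend the expensive evaluation on the candidate the surrogate ranks best:

        import numpy as np
        from scipy.interpolate import RBFInterpolator

        def pads_like_step(f, X, y, rng, n_cand=200, perturb_frac=0.25, sigma=0.1):
            """One surrogate-guided iteration: perturb a random subset of coordinates
            of the current best point, rank candidates with an RBF surrogate, and
            spend the single expensive evaluation on the most promising candidate."""
            surrogate = RBFInterpolator(X, y)          # cheap model of expensive f
            x_best = X[np.argmin(y)]
            dim = len(x_best)
            cands = np.tile(x_best, (n_cand, 1))
            mask = rng.random((n_cand, dim)) < perturb_frac   # coords to perturb
            cands += mask * rng.normal(0.0, sigma, (n_cand, dim))
            x_new = cands[np.argmin(surrogate(cands))]
            y_new = f(x_new)                            # the only expensive call
            return np.vstack([X, x_new]), np.append(y, y_new)

        # toy usage on a cheap stand-in objective
        rng = np.random.default_rng(0)
        f = lambda x: np.sum((x - 0.3) ** 2)
        X = rng.random((10, 5)); y = np.array([f(x) for x in X])
        for _ in range(30):
            X, y = pads_like_step(f, X, y, rng)
        print(y.min())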

    SOMS: SurrOgate MultiStart algorithm for use with nonlinear programming for global optimization

    SOMS is a general surrogate-based multistart algorithm, used in combination with any local optimizer to find global optima for computationally expensive functions with multiple local minima. SOMS differs from previous multistart methods in that the multistart algorithm uses a surrogate approximation to help reduce the number of function evaluations necessary to identify the most promising points from which to start each nonlinear programming local search. SOMS’s numerical results are compared with four well-known methods, namely Multi-Level Single Linkage (MLSL), MATLAB’s MultiStart, MATLAB’s GlobalSearch, and GLOBAL. In addition, we propose a class of wavy test functions that mimic the wavy nature of objective functions arising in many black-box simulations. Extensive comparisons of the algorithms on the wavy test functions and on earlier standard global-optimization test functions are carried out on a total of 19 different test problems. The numerical results indicate that SOMS performs favorably in comparison to alternative methods and does especially well on wavy functions when the number of allowed function evaluations is limited.
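
    A minimal sketch of the surrogate-filtered multistart idea (the filtering rule and function names are assumptions, not SOMS's actual selection logic): score a pool of random start points with an RBF surrogate fit to already-evaluated points, then launch a local nonlinear programming search from only the most promising few:

        import numpy as np
        from scipy.interpolate import RBFInterpolator
        from scipy.optimize import minimize

        def surrogate_multistart(f, X, y, bounds, n_random=500, n_starts=3):
            """Use an RBF surrogate fit to already-evaluated points to pick a few
            promising starts, then run a local NLP search from each of them."""
            surrogate = RBFInterpolator(X, y)
            dim = len(bounds)
            lo, hi = np.array(bounds).T
            pool = lo + np.random.rand(n_random, dim) * (hi - lo)
            starts = pool[np.argsort(surrogate(pool))[:n_starts]]  # best by surrogate
            results = [minimize(f, x0, method="L-BFGS-B", bounds=bounds)
                       for x0 in starts]
            return min(results, key=lambda r: r.fun)

        # toy usage on a cheap multimodal stand-in objective
        f = lambda x: np.sum(x ** 2) + 2 * np.sin(5 * x).sum()
        bounds = [(-2.0, 2.0)] * 2
        X = np.random.rand(12, 2) * 4 - 2; y = np.array([f(x) for x in X])
        best = surrogate_multistart(f, X, y, bounds)
        print(best.x, best.fun)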

    Contaminant source localization via Bayesian global optimization

    Contaminant source localization problems require efficient and robust methods that can account for geological heterogeneities and accommodate relatively small data sets of noisy observations. Because realism demands high-fidelity simulations, computational costs call for global optimization algorithms that operate under parsimonious evaluation budgets. Bayesian optimization approaches are well adapted to such settings, as they allow the exploration of parameter spaces in a principled way, iteratively locating the global optimum while maintaining an approximation of the objective function with an instrumental quantification of prediction uncertainty. Here, we adapt a Bayesian optimization approach to localize a contaminant source in a discretized spatial domain. We thus demonstrate the potential of such a method for hydrogeological applications and also provide test cases for the optimization community. The localization problem is illustrated for cases where the geology is assumed to be perfectly known. Two 2-D synthetic cases that display sharp hydraulic conductivity contrasts and specific connectivity patterns are investigated. These cases generate highly nonlinear objective functions with multiple local minima. A derivative-free global optimization algorithm relying on a Gaussian process model and on the expected improvement criterion is used to efficiently locate the minimum of the objective function, which corresponds to the contaminant source location. Even though the concentration measurements contain a significant level of proportional noise, the algorithm efficiently localizes the contaminant source. The variations of the objective function are driven primarily by the geology, followed by the design of the monitoring well network. The data and scripts used to generate the objective functions are shared to support reproducible research. This contribution is important because the functions present multiple local minima and are inspired by a practical field application. Sharing these complex objective functions provides a source of test cases for global optimization benchmarks and should help with designing new and efficient methods to solve this type of problem.
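
    The expected improvement criterion mentioned above has a standard closed form under a Gaussian process posterior (minimization convention): EI(x) = (y* - mu(x)) * Phi(z) + sigma(x) * phi(z) with z = (y* - mu(x)) / sigma(x), where y* is the best value observed so far and Phi, phi are the standard normal CDF and PDF. A minimal sketch, using scikit-learn's GaussianProcessRegressor as an illustrative stand-in for the paper's GP model (the Matern kernel and the toy objective are assumptions):

        import numpy as np
        from scipy.stats import norm
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import Matern

        def expected_improvement(gp, X_cand, y_best):
            """EI for minimization: (y_best - mu) * Phi(z) + sigma * phi(z),
            with z = (y_best - mu) / sigma."""
            mu, sigma = gp.predict(X_cand, return_std=True)
            sigma = np.maximum(sigma, 1e-12)        # guard against zero std
            z = (y_best - mu) / sigma
            return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

        # toy usage: pick the next sampling location on a 1-D objective
        f = lambda x: np.sin(3 * x) + 0.5 * x
        X = np.array([[0.2], [1.0], [2.5]]); y = f(X).ravel()
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
        X_cand = np.linspace(0.0, 3.0, 301).reshape(-1, 1)
        x_next = X_cand[np.argmax(expected_improvement(gp, X_cand, y.min()))]
        print(x_next)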