741 research outputs found

    Replication or exploration? Sequential design for stochastic simulation experiments

    Full text link
    We investigate the merits of replication, and provide methods for optimal design (including replicates), with the goal of obtaining globally accurate emulation of noisy computer simulation experiments. We first show that replication can be beneficial from both design and computational perspectives, in the context of Gaussian process surrogate modeling. We then develop a lookahead based sequential design scheme that can determine if a new run should be at an existing input location (i.e., replicate) or at a new one (explore). When paired with a newly developed heteroskedastic Gaussian process model, our dynamic design scheme facilitates learning of signal and noise relationships which can vary throughout the input space. We show that it does so efficiently, on both computational and statistical grounds. In addition to illustrative synthetic examples, we demonstrate performance on two challenging real-data simulation experiments, from inventory management and epidemiology.Comment: 34 pages, 9 figure

    Optimal Auctions for Correlated Buyers with Sampling

    Full text link
    Cr\'emer and McLean [1985] showed that, when buyers' valuations are drawn from a correlated distribution, an auction with full knowledge on the distribution can extract the full social surplus. We study whether this phenomenon persists when the auctioneer has only incomplete knowledge of the distribution, represented by a finite family of candidate distributions, and has sample access to the real distribution. We show that the naive approach which uses samples to distinguish candidate distributions may fail, whereas an extended version of the Cr\'emer-McLean auction simultaneously extracts full social surplus under each candidate distribution. With an algebraic argument, we give a tight bound on the number of samples needed by this auction, which is the difference between the number of candidate distributions and the dimension of the linear space they span

    Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded

    Get PDF
    Decision trees usefully represent sparse, high dimensional and noisy data. Having learned a function from this data, we may want to thereafter integrate the function into a larger decision-making problem, e.g., for picking the best chemical process catalyst. We study a large-scale, industrially-relevant mixed-integer nonlinear nonconvex optimization problem involving both gradient-boosted trees and penalty functions mitigating risk. This mixed-integer optimization problem with convex penalty terms broadly applies to optimizing pre-trained regression tree models. Decision makers may wish to optimize discrete models to repurpose legacy predictive models, or they may wish to optimize a discrete model that particularly well-represents a data set. We develop several heuristic methods to find feasible solutions, and an exact, branch-and-bound algorithm leveraging structural properties of the gradient-boosted trees and penalty functions. We computationally test our methods on concrete mixture design instance and a chemical catalysis industrial instance

    Task-based Runtime Optimizations Towards High Performance Computing Applications

    Get PDF
    The last decades have witnessed a rapid improvement of computational capabilities in high-performance computing (HPC) platforms thanks to hardware technology scaling. HPC architectures benefit from mainstream advances on the hardware with many-core systems, deep hierarchical memory subsystem, non-uniform memory access, and an ever-increasing gap between computational power and memory bandwidth. This has necessitated continuous adaptations across the software stack to maintain high hardware utilization. In this HPC landscape of potentially million-way parallelism, task-based programming models associated with dynamic runtime systems are becoming more popular, which fosters developers’ productivity at extreme scale by abstracting the underlying hardware complexity. In this context, this dissertation highlights how a software bundle powered by a task-based programming model can address the heterogeneous workloads engendered by HPC applications., i.e., data redistribution, geospatial modeling and 3D unstructured mesh deformation here. Data redistribution aims to reshuffle data to optimize some objective for an algorithm, whose objective can be multi-dimensional, such as improving computational load balance or decreasing communication volume or cost, with the ultimate goal of increasing the efficiency and therefore reducing the time-to-solution for the algorithm. Geostatistical modeling, one of the prime motivating applications for exascale computing, is a technique for predicting desired quantities from geographically distributed data, based on statistical models and optimization of parameters. Meshing the deformable contour of moving 3D bodies is an expensive operation that can cause huge computational challenges in fluid-structure interaction (FSI) applications. Therefore, in this dissertation, Redistribute-PaRSEC, ExaGeoStat-PaRSEC and HiCMA-PaRSEC are proposed to efficiently tackle these HPC applications respectively at extreme scale, and they are evaluated on multiple HPC clusters, including AMD-based, Intel-based, Arm-based CPU systems and IBM-based multi-GPU system. This multidisciplinary work emphasizes the need for runtime systems to go beyond their primary responsibility of task scheduling on massively parallel hardware system for servicing the next-generation scientific applications

    Multiple-objective sensor management and optimisation

    No full text
    One of the key challenges associated with exploiting modern Autonomous Vehicle technology for military surveillance tasks is the development of Sensor Management strategies which maximise the performance of the on-board Data-Fusion systems. The focus of this thesis is the development of Sensor Management algorithms which aim to optimise target tracking processes. Three principal theoretical and analytical contributions are presented which are related to the manner in which such problems are formulated and subsequently solved.Firstly, the trade-offs between optimising target tracking and other system-level objectives relating to expected operating lifetime are explored in an autonomous ground sensor scenario. This is achieved by modelling the observer trajectory control design as a probabilistic, information-theoretic, multiple-objective optimisation problem. This novel approach explores the relationships between the changes in sensor-target geometry that are induced by tracking performance measures and those relating to power consumption. This culminates in a novel observer trajectory control algorithm based onthe minimax approach.The second contribution is an analysis of the propagation of error through a limited-lookahead sensor control feedback loop. In the last decade, it has been shown that the use of such non-myopic (multiple-step) planning strategies can lead to superior performance in many Sensor Management scenarios. However, relatively little is known about the performance of strategies which use different horizon lengths. It is shown that, in the general case, planning performance is a function of the length of the horizon over which the optimisation is performed. While increasing the horizon maximises the chances of achieving global optimality, by revealing information about the substructureof the decision space, it also increases the impact of any prediction error, approximations, or unforeseen risk present within the scenario. These competing mechanisms aredemonstrated using an example tracking problem. This provides the motivation for a novel sensor control methodology that employs an adaptive length optimisation horizon. A route to selecting the optimal horizon size is proposed, based on a new non-myopic risk equilibrium which identifies the point where the two competing mechanisms are balanced.The third area of contribution concerns the development of a number of novel optimisation algorithms aimed at solving the resulting sequential decision making problems. These problems are typically solved using stochastic search methods such as Genetic Algorithms or Simulated Annealing. The techniques presented in this thesis are extensions of the recently proposed Repeated Weighted Boosting Search algorithm. In its originalform, it is only applicable to continuous, single-objective, ptimisation problems. The extensions facilitate application to mixed search spaces and Pareto multiple-objective problems. The resulting algorithms have performance comparable with Genetic Algorithm variants, and offer a number of advantages such as ease of implementation and limited tuning requirements

    Extensions of Task-based Runtime for High Performance Dense Linear Algebra Applications

    Get PDF
    On the road to exascale computing, the gap between hardware peak performance and application performance is increasing as system scale, chip density and inherent complexity of modern supercomputers are expanding. Even if we put aside the difficulty to express algorithmic parallelism and to efficiently execute applications at large scale, other open questions remain. The ever-growing scale of modern supercomputers induces a fast decline of the Mean Time To Failure. A generic, low-overhead, resilient extension becomes a desired aptitude for any programming paradigm. This dissertation addresses these two critical issues, designing an efficient unified linear algebra development environment using a task-based runtime, and extending a task-based runtime with fault tolerant capabilities to build a generic framework providing both soft and hard error resilience to task-based programming paradigm. To bridge the gap between hardware peak performance and application perfor- mance, a unified programming model is designed to take advantage of a lightweight task-based runtime to manage the resource-specific workload, and to control the data ow and parallel execution of tasks. Under this unified development, linear algebra tasks are abstracted across different underlying heterogeneous resources, including multicore CPUs, GPUs and Intel Xeon Phi coprocessors. Performance portability is guaranteed and this programming model is adapted to a wide range of accelerators, supporting both shared and distributed-memory environments. To solve the resilient challenges on large scale systems, fault tolerant mechanisms are designed for a task-based runtime to protect applications against both soft and hard errors. For soft errors, three additions to a task-based runtime are explored. The first recovers the application by re-executing minimum number of tasks, the second logs intermediary data between tasks to minimize the necessary re-execution, while the last one takes advantage of algorithmic properties to recover the data without re- execution. For hard errors, we propose two generic approaches, which augment the data logging mechanism for soft errors. The first utilizes non-volatile storage device to save logged data, while the second saves local logged data on a remote node to protect against node failure. Experimental results have confirmed that our soft and hard error fault tolerant mechanisms exhibit the expected correctness and efficiency
    corecore