Sliced rotated sphere packing designs
Space-filling designs are popular choices for computer experiments. A sliced
design is a design that can be partitioned into several subdesigns. We propose
a new type of sliced space-filling design called sliced rotated sphere packing
designs. Their full designs and subdesigns are rotated sphere packing designs.
They are constructed by rescaling, rotating, translating and extracting the
points from a sliced lattice. We provide two fast algorithms to generate such
designs. Furthermore, we propose a strategy to use sliced rotated sphere
packing designs adaptively. Under this strategy, initial runs are uniformly
distributed in the design space, follow-up runs are added by incorporating
information gained from initial runs, and the combined design is space-filling
for any local region. Examples are given to illustrate the strategy's potential
applications.
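The rescale-rotate-translate-extract construction can be illustrated with a toy two-dimensional sketch. The angle, scale, and plain integer lattice below are illustrative choices only, not the authors' algorithm, which works with sliced lattices and sphere packings:

```python
import numpy as np

def rotated_lattice_design(n_grid=20, angle=0.5, scale=0.18):
    """Toy 2-D illustration: rescale, rotate, translate, and extract
    lattice points that land in the unit square."""
    g = np.arange(-n_grid, n_grid + 1)
    ii, jj = np.meshgrid(g, g)
    pts = np.column_stack([ii.ravel(), jj.ravel()]).astype(float)
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    pts = scale * pts @ R.T        # rescale and rotate the lattice
    pts += 0.5                     # translate onto the unit square
    keep = np.all((pts >= 0.0) & (pts < 1.0), axis=1)
    return pts[keep]               # extract the interior points

design = rotated_lattice_design()
print(len(design), "points in [0,1)^2")
```

Rotating by an angle incommensurate with the lattice symmetry avoids axis-aligned point columns, which is one reason rotated lattices make good space-filling designs.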
Mixed-Variable Global Sensitivity Analysis For Knowledge Discovery And Efficient Combinatorial Materials Design
Global Sensitivity Analysis (GSA) is the study of the influence of a model's
inputs on its outputs. In the context of engineering design, GSA has
been widely used to understand both individual and collective contributions of
design variables on the design objectives. So far, global sensitivity studies
have often been limited to design spaces with only quantitative (numerical)
design variables. However, many engineering systems contain qualitative
(categorical) design variables, sometimes exclusively, alongside quantitative design
variables. In this paper, we integrate Latent Variable Gaussian Process (LVGP)
with Sobol' analysis to develop the first metamodel-based mixed-variable GSA
method. Through numerical case studies, we validate and demonstrate the
effectiveness of our proposed method for mixed-variable problems. Furthermore,
while the proposed GSA method is general enough to benefit various engineering
design applications, we integrate it with multi-objective Bayesian optimization
(BO) to create a sensitivity-aware design framework that accelerates Pareto
front exploration for metal-organic framework (MOF) materials with
many-level combinatorial design spaces. Although MOFs are constructed only from
qualitative variables that are notoriously difficult to design, our method can
utilize sensitivity analysis to navigate the optimization in the many-level
large combinatorial design space, greatly expediting the exploration of novel
MOF candidates.
Comment: 35 pages, 10 figures, 2 tables.
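For background, a plain Monte Carlo pick-freeze estimator of first-order Sobol' indices on the unit hypercube can be sketched as follows. This is the generic Saltelli-style estimator, not the LVGP-based mixed-variable method of the paper, and the additive test function is an arbitrary illustration:

```python
import numpy as np

def sobol_first_order(f, d, n=50_000, seed=0):
    """Pick-freeze Monte Carlo estimate of first-order Sobol' indices
    for f on the unit hypercube (Saltelli-style estimator)."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, d))
    B = rng.random((n, d))
    fA, fB = f(A), f(B)
    var = fA.var()
    S = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]        # replace only input i with fresh samples
        S[i] = np.mean(fB * (f(ABi) - fA)) / var
    return S

# additive test function: x0 dominates, x2 is inert
f = lambda X: 4.0 * X[:, 0] + X[:, 1] + 0.0 * X[:, 2]
S = sobol_first_order(f, d=3)
print(np.round(S, 2))  # close to the analytic values 16/17, 1/17, 0
```

For this additive function the first-order indices sum to one; interactions would show up as a gap between first-order and total indices.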
Rapid Design of Top-Performing Metal-Organic Frameworks with Qualitative Representations of Building Blocks
Data-driven materials design often encounters challenges where systems
require or possess qualitative (categorical) information. Metal-organic
frameworks (MOFs) are an example of such material systems. The representation
of MOFs through different building blocks makes it a challenge for designers to
incorporate qualitative information into design optimization. Furthermore, the
large number of potential building blocks leads to a combinatorial challenge,
with millions of possible MOFs that could be explored through time-consuming
physics-based approaches. In this work, we integrated Latent Variable Gaussian
Process (LVGP) and Multi-Objective Batch-Bayesian Optimization (MOBBO) to
identify top-performing MOFs adaptively, autonomously, and efficiently without
any human intervention. Our approach provides three main advantages: (i) no
specific physical descriptors are required and only building blocks that
construct the MOFs are used in global optimization through qualitative
representations, (ii) the method is application and property independent, and
(iii) the latent variable approach provides an interpretable model of
qualitative building blocks with physical justification. To demonstrate the
effectiveness of our method, we considered a design space with more than 47,000
MOF candidates. By searching only ~1% of the design space, LVGP-MOBBO was able
to identify all MOFs on the Pareto front and more than 97% of the 50
top-performing designs for the CO2 working capacity and CO2/N2
selectivity properties. Finally, we compared our approach with the Random
Forest algorithm and demonstrated its efficiency, interpretability, and
robustness.
Comment: 35 pages total (29 pages of main manuscript, 6 pages of
supplementary information); 13 figures total (9 in the main manuscript,
4 in the supplementary information); 1 table in the supplementary
information.
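Identifying the Pareto front over a candidate set, as in the working-capacity/selectivity trade-off above, reduces to a dominance check. A minimal sketch, with made-up objective values and both columns maximized:

```python
import numpy as np

def pareto_front(Y):
    """Indices of non-dominated rows when every objective is maximized."""
    n = len(Y)
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            # row i is dominated if some row j is at least as good in
            # every objective and strictly better in at least one
            if i != j and np.all(Y[j] >= Y[i]) and np.any(Y[j] > Y[i]):
                keep[i] = False
                break
    return np.flatnonzero(keep)

# made-up (working capacity, selectivity) pairs for five candidates
Y = np.array([[1, 5], [2, 4], [3, 3], [2, 2], [0, 6]])
print(pareto_front(Y))  # candidate 3 is dominated by candidate 1
```

The quadratic loop is fine for thousands of candidates; sorted-sweep algorithms exist for larger sets.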
Design of Experiments for Screening
The aim of this paper is to review methods of designing screening
experiments, ranging from designs originally developed for physical experiments
to those especially tailored to experiments on numerical models. The strengths
and weaknesses of the various designs for screening variables in numerical
models are discussed. First, classes of factorial designs for experiments to
estimate main effects and interactions through a linear statistical model are
described, specifically regular and nonregular fractional factorial designs,
supersaturated designs and systematic fractional replicate designs. Generic
issues of aliasing, bias and cancellation of factorial effects are discussed.
Second, group screening experiments are considered including factorial group
screening and sequential bifurcation. Third, random sampling plans are
discussed including Latin hypercube sampling and sampling plans to estimate
elementary effects. Fourth, a variety of modelling methods commonly employed
with screening designs are briefly described. Finally, a novel study
demonstrates six screening methods on two frequently-used exemplars, and their
performances are compared.
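The elementary-effects sampling plans mentioned above can be sketched with a Morris-style screening loop. The one-at-a-time design below is a simplified illustration using independent random base points rather than Morris trajectories:

```python
import numpy as np

def elementary_effects(f, d, r=30, delta=0.25, seed=0):
    """mu* screening measure from r one-at-a-time elementary effects
    per input on [0,1]^d."""
    rng = np.random.default_rng(seed)
    mu_star = np.zeros(d)
    for i in range(d):
        x = rng.random((r, d))
        x[:, i] = np.minimum(x[:, i], 1.0 - delta)  # keep x_i + delta in [0,1]
        x_step = x.copy()
        x_step[:, i] += delta
        ee = (f(x_step) - f(x)) / delta             # finite-difference effects
        mu_star[i] = np.mean(np.abs(ee))            # mean absolute effect
    return mu_star

# screening example: x0 is strongly active, x1 mildly, x2 inert
f = lambda X: 10.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + 0.0 * X[:, 2]
mu = elementary_effects(f, d=3)
print(np.round(mu, 2))  # ranks the inputs x0 > x1 > x2
```

Screening on mu* then discards inputs whose effects are negligible before any expensive modelling is attempted.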
Future proofing a building design using history matching inspired level-set techniques
This is the final version. Available on open access from Wiley via the DOI in this record. How can one design a building that will be sufficiently protected against overheating and sufficiently energy efficient, whilst considering the expected increases in temperature due to climate change? We successfully manage to address this question, greatly reducing a large set of initial candidate building designs down to a small set of acceptable buildings. We do this using a complex computer model, statistical models of said computer model (emulators), and a modification to the history matching calibration technique. This modification tackles the problem of level-set estimation (rather than calibration), where the goal is to find input settings which lead to the simulated output being below some threshold. The entire procedure allows us to present a practitioner with a set of acceptable building designs, with the final design chosen based on other requirements (subjective or otherwise).
Engineering and Physical Sciences Research Council (EPSRC)
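The level-set variant of history matching can be caricatured as follows: a candidate design is ruled out only when even an optimistic emulator prediction exceeds the threshold. The numbers and the three-standard-deviation cutoff below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def not_ruled_out(mean, sd, threshold, k=3.0):
    """History-matching-style cut for level-set estimation: a design
    survives if its prediction could plausibly lie below the threshold."""
    # rule out only when even an optimistic prediction (mean minus
    # k standard deviations) still exceeds the threshold
    return mean - k * sd <= threshold

# toy emulator predictions for five candidate buildings (made-up numbers)
mean = np.array([18.0, 24.0, 31.0, 26.5, 21.0])  # e.g. annual overheating hours
sd = np.array([1.0, 2.0, 1.5, 0.5, 3.0])         # emulator uncertainty
print(not_ruled_out(mean, sd, threshold=25.0))   # only the third is ruled out
```

As in calibration-style history matching, the cut is conservative: uncertain candidates are retained until further runs shrink the emulator's standard deviation.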
EXTENDING AND IMPROVING DESIGNS FOR LARGE-SCALE COMPUTER EXPERIMENTS
This research develops methods that increase the inventory of space-filling designs (SFDs) for large-scale computer-based experiments. We present a technique enabling researchers to add sequential blocks of design points effectively and efficiently to existing SFDs. We accomplish this through a quadratically constrained mixed-integer program that augments cataloged or computationally expensive designs by optimally permuting and stacking columns of an initial base design to minimize the maximum absolute pairwise correlation among columns in the new extended design. We extend many classes of SFDs to dimensions that are currently not easily obtainable. Adding new design points provides more degrees of freedom for building metamodels and assessing fit. The resulting extended designs have better correlation and space-filling properties than the original base designs and compare well with other types of SFDs created from scratch in the extended design space. In addition, through massive computer-based experimentation, we compare popular software packages for generating SFDs and provide insight into the methods and relationships among design measures of correlation and space-fillingness. These results provide experimenters with a broad understanding of SFD software packages, algorithms, and optimality criteria. Further, we provide a probability-distribution model for the maximum absolute pairwise correlation among columns in the widely used maximin Latin hypercube designs.
Lieutenant Colonel, United States Marine Corps
Approved for public release. Distribution is unlimited.
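The extension criterion, the maximum absolute pairwise correlation among columns, and the permute-and-stack step can be sketched as follows. The fixed permutation here is an arbitrary choice, whereas the work above optimizes it with a mixed-integer program:

```python
import numpy as np

def max_abs_pairwise_corr(D):
    """Maximum absolute pairwise correlation among design columns."""
    C = np.corrcoef(D, rowvar=False)
    np.fill_diagonal(C, 0.0)   # ignore each column's correlation with itself
    return float(np.max(np.abs(C)))

def stack_permuted(D, perm):
    """Stack a column-permuted copy of the base design beneath it,
    doubling the run count."""
    return np.vstack([D, D[:, perm]])

rng = np.random.default_rng(1)
n, k = 16, 4
# random Latin hypercube: each column is a permutation of n levels
base = np.column_stack([rng.permutation(n) for _ in range(k)]) / (n - 1)
extended = stack_permuted(base, perm=[1, 2, 3, 0])
print(max_abs_pairwise_corr(base), max_abs_pairwise_corr(extended))
```

Searching over permutations for the stacked copy, rather than fixing one, is what lets the optimized extension drive the criterion down instead of merely preserving it.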
Developing Efficient Strategies For Global Sensitivity Analysis Of Complex Environmental Systems Models
Complex Environmental Systems Models (CESMs) have been developed and applied as vital tools to tackle the ecological, water, food, and energy crises that humanity faces, and have been used widely to support decision-making about management of the quality and quantity of Earth's resources. CESMs are often controlled by many interacting and uncertain parameters, and typically integrate data from multiple sources at different spatio-temporal scales, which make them highly complex. Global Sensitivity Analysis (GSA) techniques have proven to be promising for deepening our understanding of the model complexity and interactions between various parameters and providing helpful recommendations for further model development and data acquisition. Aside from the complexity issue, the computationally expensive nature of the CESMs precludes effective application of the existing GSA techniques in quantifying the global influence of each parameter on variability of the CESMs' outputs. This is because a comprehensive sensitivity analysis often requires performing a very large number of model runs. Therefore, there is a need to break down this barrier by the development of more efficient strategies for sensitivity analysis.
The research undertaken in this dissertation is mainly focused on alleviating the computational burden associated with GSA of the computationally expensive CESMs through developing efficiency-increasing strategies for robust sensitivity analysis. This is accomplished by: (1) proposing an efficient sequential sampling strategy for robust sampling-based analysis of CESMs; (2) developing an automated parameter grouping strategy of high-dimensional CESMs, (3) introducing a new robustness measure for convergence assessment of the GSA methods; and (4) investigating time-saving strategies for handling simulation failures/crashes during the sensitivity analysis of computationally expensive CESMs.
This dissertation provides a set of innovative numerical techniques that can be used in conjunction with any GSA algorithm and be integrated in model building and systems analysis procedures in any field where models are used. A range of analytical test functions and environmental models with varying complexity and dimensionality are utilized across this research to test the performance of the proposed methods. These methods, which are embedded in the VARS-TOOL software package, can also provide information useful for diagnostic testing, parameter identifiability analysis, model simplification, model calibration, and experimental design. They can be further applied to address a range of decision-making-related problems such as characterizing the main causes of risk in the context of probabilistic risk assessment and exploring the CESMs' sensitivity to a wide range of plausible future changes (e.g., hydrometeorological conditions) in the context of scenario analysis.
Mining for cosmological information: Simulation-based methods for Redshift Space Distortions and Galaxy Clustering
The standard model of cosmology describes the complex large scale structure of the Universe through fewer than 10 free parameters. However, concordance with observations requires that about 95% of the energy content of the Universe is invisible to us. Most of this energy is postulated to be in the form of a cosmological constant, Λ, which drives the observed accelerated expansion of the Universe. Its nature is, however, unknown. This mystery forces cosmologists to look for inconsistencies between theory and data, searching for clues. But finding statistically significant contradictions requires extremely accurate measurements of the composition of the Universe, which are at present limited by our inability to extract all the information contained in the data, rather than being limited by the data itself. In this Thesis, we study how we can overcome these limitations by i) modelling how galaxies cluster on small scales with simulation-based methods, where perturbation theory fails to provide accurate predictions, and ii) developing summary statistics of the density field that are capable of extracting more information than the commonly used two-point functions. In the first half, we show how the real to redshift space mapping can be modelled accurately by going beyond the Gaussian approximation for the pairwise velocity distribution. We then show that simulation-based models can accurately predict the full shape of galaxy clustering in real space, increasing the constraining power on some of the cosmological parameters by a factor of 2 compared to perturbation theory methods. In the second half, we measure the information content of density dependent clustering. We show that it can improve the constraints on all cosmological parameters by factors between 3 and 8 over the two-point function. In particular, exploiting the environment dependence can constrain the mass of neutrinos a factor of 8 better than the two-point correlation function alone.
We hope that the techniques described in this thesis will contribute to extracting all the cosmological information contained in ongoing and upcoming galaxy surveys, and provide insight into the nature of the accelerated expansion of the Universe.