Sampling Configurational Energy Landscapes
The computational analysis of high-dimensional surfaces is a fundamental problem across a wide range of scientific fields, for example in the study of models of clusters of atoms, glasses, self-assembling systems, and biomolecules, as well as in machine learning and physics more broadly. This work presents a variety of novel methods developed to aid the computational study of the structures, dynamics, and thermodynamics of systems described by these surfaces, traditionally termed energy landscapes.
When studying molecular systems, it is important to be able to quantify measures of similarity or difference between a pair of structures generated from an energy landscape. These measures are needed to predict the properties of a given molecular structure from the known properties of similar ones. Equivalently, a pair of structures can be aligned into similar orientations so that an interpolated pathway can be generated between them; this pathway is used to identify the transition states between the pair, which is a key limiting step in discrete path sampling. The efficiency of the transition state search is strongly dependent on the quality of the initial interpolation, and hence on the alignment methods used. In this work, two novel alignment algorithms are presented and benchmarked against existing algorithms for aligning pairs of structures, for both periodic systems and isolated clusters of atoms. The algorithms demonstrate superior performance for periodic and isolated structures, respectively.
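As a concrete illustration of the alignment problem for isolated clusters, the standard Kabsch algorithm (a common baseline that novel alignment methods are typically benchmarked against, not the thesis's own algorithm) finds the rotation minimising the RMSD between two centred structures. The function name and test data below are illustrative:

```python
import numpy as np

def kabsch_align(P, Q):
    """Rotation R minimising the RMSD between structures P and Q (N x 3 arrays)."""
    P = P - P.mean(axis=0)                    # remove translation by centring
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)         # correlation matrix H = P^T Q
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against improper reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    rmsd = np.sqrt(np.mean(np.sum((P @ R.T - Q) ** 2, axis=1)))
    return R, rmsd
```

Applied to a structure and a rotated, translated copy of itself, the returned RMSD is zero to machine precision, since centring removes the translation and the SVD recovers the rotation exactly.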
The efficient evaluation of the global thermodynamic properties of an in silico system, or analogously of the evidence in Bayesian inference, is a challenge for many high-dimensional systems due to a phenomenon known as broken ergodicity. This problem occurs when the energy barriers between different regions of the energy landscape make it difficult to sample all regions uniformly. In this work, a novel, embarrassingly parallel superposition-based approach, built on the athermal method of nested sampling, is introduced and benchmarked on a model system exhibiting broken ergodicity. It is shown that the method reproduces the key features of the heat capacity.
EPSRC Cambridge NanoDTC, EP/G037221/
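The thesis's superposition-based variant is not reproduced here, but the underlying nested sampling idea can be sketched on a toy two-funnel landscape. The key "athermal" property is that a single run produces a set of dead points and prior-mass weights from which thermodynamic observables such as the heat capacity can be computed at any temperature afterwards (all names and parameter values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(x):
    # toy double-well landscape: two funnels separated by a barrier
    return (x[0] ** 2 - 1.0) ** 2 + 0.5 * x[1] ** 2

n_live, n_iter = 100, 600
live = rng.uniform(-2.0, 2.0, size=(n_live, 2))
E_live = np.array([energy(x) for x in live])

dead = []                                # energies of discarded points
for _ in range(n_iter):
    worst = np.argmax(E_live)
    E_max = E_live[worst]
    dead.append(E_max)
    while True:                          # rejection-sample below the threshold
        x = rng.uniform(-2.0, 2.0, size=2)
        if energy(x) < E_max:
            break
    live[worst], E_live[worst] = x, energy(x)

# prior mass shrinks geometrically per iteration; the run is athermal, so any
# temperature can be queried afterwards by reweighting the same dead points
E = np.array(dead)
X = (n_live / (n_live + 1.0)) ** np.arange(1, n_iter + 1)
w = -np.diff(np.concatenate(([1.0], X)))

def heat_capacity(beta):
    p = w * np.exp(-beta * (E - E.min()))    # shift for numerical stability
    p /= p.sum()
    mean, mean2 = p @ E, p @ E ** 2
    return beta ** 2 * (mean2 - mean ** 2)
```

The rejection step is only viable for this toy box prior; realistic landscapes require constrained random walks or, as in the superposition approach, decomposition over known minima.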
Cosmology in the Presence of Non-Gaussianity
Modern observational cosmology relies on statistical inference, which models measurable quantities (including their systematic and statistical uncertainties) as random variates; examples include the model parameters (`cosmological parameters') to be estimated via regression, as well as the observable data itself. In various contexts, these exhibit non-Gaussian distribution properties, e.g., the Bayesian joint posterior distribution of cosmological parameters from different data sets, or random fields affected by late-time nonlinear structure formation, such as the weak gravitational lensing convergence or the galaxy density contrast. Gaussianisation provides a powerful toolbox for modelling this non-Gaussian structure: a non-linear transformation from the original non-Gaussian random variate to an auxiliary random variate with an (approximately) Gaussian distribution allows one to capture the full distribution structure in the first and second moments of the auxiliary variate. We consider parametric families of non-linear transformations, in particular Box-Cox transformations and generalisations thereof. We develop a framework for choosing the optimally-Gaussianising transformation by optimising a loss function, and propose methods to assess the quality of the optimal transform a posteriori.
First, we apply our maximum-likelihood framework to the posterior distribution of Planck data and demonstrate how to reproduce the contours of credible regions without bias; our method significantly outperforms the current gold standard, kernel density estimation. Next, we use Gaussianisation to compute the model evidence for a combination of CFHTLenS and BOSS data, and compare to standard techniques. Third, we find Gaussianising transformations for simulated weak lensing convergence maps. This increases the information content accessible to two-point statistics (e.g., the power spectrum) and potentially allows for the rapid production of independent mock maps with a non-Gaussian correlation structure. With these examples, we demonstrate how Gaussianisation expands our current inference toolbox and permits us to accurately extract information in non-Gaussian contexts.
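A minimal example of the Gaussianisation idea, using the one-parameter Box-Cox family with scipy's built-in maximum-likelihood profile over the transformation parameter. The lognormal toy data stands in for a skewed cosmological variate and is not the thesis's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.0, sigma=0.5, size=5000)  # skewed stand-in variate

# maximum-likelihood Box-Cox: scipy profiles the Gaussian log-likelihood
# of the transformed data over lambda and returns the optimum
y, lam = stats.boxcox(x)

# a-posteriori quality check: the sample skewness should collapse towards zero
print(f"lambda = {lam:.3f}, skew before = {stats.skew(x):.2f}, "
      f"after = {stats.skew(y):.2f}")
```

For lognormal data the optimal lambda approaches zero, where the Box-Cox transform reduces to the logarithm, so the transformed sample is very nearly Gaussian.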
Exploring QCD matter in extreme conditions with Machine Learning
In recent years, machine learning has emerged as a powerful computational
tool and novel problem-solving perspective for physics, offering new avenues
for studying strongly interacting QCD matter properties under extreme
conditions. This review article aims to provide an overview of the current
state of this intersection of fields, focusing on the application of machine
learning to theoretical studies in high energy nuclear physics. It covers
diverse aspects, including heavy ion collisions, lattice field theory, and
neutron stars, and discusses how machine learning can be used to explore and
facilitate the physics goals of understanding QCD matter. The review also
surveys common methodological themes, ranging from data-driven to
physics-driven perspectives. We conclude by
discussing the challenges and future prospects of machine learning applications
in high energy nuclear physics, also underscoring the importance of
incorporating physics priors into the purely data-driven learning toolbox. This
review highlights the critical role of machine learning as a valuable
computational paradigm for advancing physics exploration in high energy nuclear
physics.
Comment: 146 pages, 53 figures
Out of equilibrium Statistical Physics of learning
In the study of hard optimization problems, it is often infeasible to achieve
a full analytic control on the dynamics of the algorithmic processes that
find solutions efficiently. In many cases, a static approach is able to provide
considerable insight into the dynamical properties of these algorithms: in fact,
the geometrical structures found in the energetic landscape can strongly affect
the stationary states and the optimal configurations reached by the solvers.
In this context, a classical Statistical Mechanics approach, relying on the
assumption of the asymptotic realization of a Boltzmann Gibbs equilibrium,
can yield misleading predictions when the studied algorithms comprise some
stochastic components that effectively drive these processes out of equilibrium.
Thus, it becomes necessary to develop some intuition on the relevant features
of the studied phenomena and to build an ad hoc Large Deviation analysis,
providing a more targeted and richer description of the geometrical properties
of the landscape. The present thesis focuses on the study of learning processes
in Artificial Neural Networks, with the aim of introducing an out of equilibrium
statistical physics framework, based on the introduction of a local entropy
potential, for supporting and inspiring algorithmic improvements in the field
of Deep Learning, and for developing models of neural computation that can
carry both biological and engineering interest.
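The local-entropy idea can be illustrated on a toy one-dimensional loss (hypothetical, not from the thesis): the local free entropy log E over x' ~ N(x, sigma^2) of exp(-beta L(x')) is higher at a wide minimum than at a sharp minimum of equal depth, which is what biases local-entropy-driven algorithms towards flat, dense regions of solutions:

```python
import numpy as np

rng = np.random.default_rng(3)

def loss(x):
    # two equally deep minima: a sharp one at x = -2, a wide one at x = +2
    return np.minimum(50.0 * (x + 2.0) ** 2, 0.5 * (x - 2.0) ** 2)

def local_entropy(x, beta=5.0, sigma=0.5, n=20000):
    # Monte Carlo estimate of log E_{x' ~ N(x, sigma^2)}[exp(-beta * loss(x'))]
    xp = x + sigma * rng.standard_normal(n)
    return float(np.log(np.mean(np.exp(-beta * loss(xp)))))

wide, sharp = local_entropy(2.0), local_entropy(-2.0)
```

Both minima have the same loss value, yet the wide one has markedly higher local entropy; an algorithm optimising the local-entropy potential instead of the bare loss would therefore prefer it.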
Constraining the anisotropic expansion of the Universe with Type Ia supernovae and improving the treatment of selection effects within Bayesian hierarchical models
In this thesis, I apply advanced Bayesian statistical modelling methods to Type Ia supernova (SNIa) data to determine tighter constraints on the fiducial Lambda-Cold-Dark-Matter (LCDM) cosmology and to improve the modelling of systematic uncertainties in the data. The body of work covered herein can be broadly classified into two main topics:
I re-examine the contentious question of constraints on anisotropic expansion from SNIa in the light of a novel determination of peculiar velocities, which are crucial for testing isotropy with SNe, out to distances < 200/h Mpc. The Bayesian hierarchical model BAHAMAS is adopted to constrain a dipole in the distance modulus in the context of the LCDM model, and the deceleration parameter in a phenomenological Cosmographic expansion. I find no evidence for anisotropic expansion and place a tight upper bound on the amplitude of a dipole in both the LCDM setting and the Cosmographic expansion approach. Using Bayesian model comparison, I obtain posterior odds in excess of 900:1 (640:1) against a constant-in-redshift dipole for LCDM (Cosmographic expansion).
One of the modern problems of supernova cosmology is accounting in a principled way for selection effects caused by Malmquist bias. Here, I present a complete formalism for handling selection effects in Type Ia supernova (SNIa) cosmology in the context of Bayesian hierarchical modelling. I demonstrate the method on simulated data sets in which selection cuts are made on the apparent magnitude, and show that the previous results of Rubin et al. (2015) are incorrect and can lead to biased reconstruction of the cosmological parameters. I show how this formalism is easily extended to include the Phillips corrections used to standardise SNe. The formalism presented exhibits better statistical properties, in terms of bias and mean squared error, relative to a traditional ad hoc style correction and to the model of Rubin et al. (2015).
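The core of such a selection-effect correction can be sketched on a toy one-parameter problem (the full hierarchical model is far richer; all numbers and names below are illustrative): apparent magnitudes are drawn from a normal distribution, a hard cut m < m_cut mimics Malmquist bias, and renormalising each likelihood term by the selection probability removes the bias of the naive estimator:

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(2)
mu_true, sigma, m_cut = 24.0, 0.4, 24.2
m = rng.normal(mu_true, sigma, size=20000)
m_obs = m[m < m_cut]              # hard Malmquist-style cut: faint SNe are lost

naive = m_obs.mean()              # biased low: the truncation removes faint objects

def nll(mu):
    # truncated-normal likelihood: each pdf term is renormalised by the
    # selection probability P(m < m_cut | mu), which corrects the bias
    return -(stats.norm.logpdf(m_obs, mu, sigma)
             - stats.norm.logcdf(m_cut, mu, sigma)).sum()

res = optimize.minimize_scalar(nll, bounds=(22.0, 26.0), method="bounded")
corrected = res.x
```

In this toy setting the naive mean is pulled brightward by roughly 0.2 mag, while the selection-corrected maximum-likelihood estimate recovers the true value.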
Novel sampling techniques for reservoir history matching optimisation and uncertainty quantification in flow prediction
Modern reservoir management has an increasing focus on accurately predicting the likely range of field recoveries. A variety of assisted history matching techniques has been developed across the research community concerned with this topic. These techniques are based on obtaining multiple models that closely reproduce the historical flow behaviour of a reservoir. The resulting set of history-matched models is then used to quantify uncertainty in predicting the future performance of the reservoir and to provide economic evaluations for different field development strategies. The key step in this workflow is to employ algorithms that sample the parameter space in an efficient and appropriate manner. The choice of algorithm affects how quickly a model is obtained and how well the model fits the production data. The sampling techniques that have been developed to date include, among others, gradient-based methods, evolutionary algorithms, and the ensemble Kalman filter (EnKF).
This thesis has investigated and further developed the following sampling and inference techniques: Particle Swarm Optimisation (PSO), Hamiltonian Monte Carlo, and Population Markov Chain Monte Carlo. The inspected techniques are capable of navigating the parameter space and producing history-matched models that can be used to quantify the uncertainty in the forecasts in a faster and more reliable way. The analysis of these techniques, compared with the Neighbourhood Algorithm (NA), has shown how the different techniques affect the predicted recovery from petroleum systems, and the benefits of the developed methods over the NA.
The history matching problem is multi-objective in nature, with the production data possibly consisting of multiple types, coming from different wells, and collected at different times. Multiple objectives can be constructed from these data and explicitly optimised in a multi-objective scheme. The thesis has extended PSO to handle multi-objective history matching problems in which a number of possibly conflicting objectives must be satisfied simultaneously. The benefits and efficiency of the novel multi-objective particle swarm scheme (MOPSO) are demonstrated for synthetic reservoirs. It is shown that the MOPSO procedure can provide a substantial improvement in finding a diverse set of good-fitting models with fewer of the very costly forward simulation runs than the standard single-objective case, depending on how the objectives are constructed.
The thesis has also shown how to tackle a large number of unknown parameters by coupling high-performance global optimisation algorithms, such as PSO, with model reduction techniques such as kernel principal component analysis (kernel PCA) for parameterising spatially correlated random fields. The results of the PSO-PCA coupling applied to a recent SPE benchmark history matching problem demonstrate that the approach is applicable to practical problems. A comparison of PSO with the EnKF data assimilation method has been carried out and has concluded that both methods obtain comparable results on the example case. This reinforces the need to use a range of assisted history matching algorithms for more confidence in predictions.
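A minimal single-objective PSO of the kind extended in this thesis can be sketched as follows. The misfit here is a stand-in quadratic mismatch to synthetic "production data"; a real history matching application would instead wrap a costly reservoir simulator, and the parameter values shown are common defaults rather than the thesis's tuning:

```python
import numpy as np

def pso(f, bounds, n_particles=30, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimiser for a scalar misfit f over box bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), np.array([f(p) for p in x])
    g = pbest[np.argmin(pbest_f)].copy()          # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random((2,) + x.shape)
        # inertia + cognitive pull to personal best + social pull to global best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        g = pbest[np.argmin(pbest_f)].copy()
    return g, pbest_f.min()

# toy "history-matching" misfit: squared mismatch to synthetic observations
target = np.array([0.3, -1.2, 2.0])
misfit = lambda p: float(np.sum((p - target) ** 2))
best, best_f = pso(misfit, (np.full(3, -5.0), np.full(3, 5.0)))
```

The multi-objective extension (MOPSO) replaces the single global best with an archive of non-dominated solutions, but the velocity-update structure above is the common core.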