Ensemble Transport Adaptive Importance Sampling
Markov chain Monte Carlo methods are a powerful and commonly used family of
numerical methods for sampling from complex probability distributions. As
applications of these methods increase in size and complexity, the need for
efficient methods increases. In this paper, we present a particle ensemble
algorithm. At each iteration, an importance sampling proposal distribution is
formed using an ensemble of particles. A stratified sample is taken from this
distribution and weighted under the posterior; a state-of-the-art ensemble
transport resampling method is then used to create an evenly weighted sample
ready for the next iteration. We demonstrate that this ensemble transport
adaptive importance sampling (ETAIS) method outperforms MCMC methods with
equivalent proposal distributions for low dimensional problems, and in fact
shows better than linear improvements in convergence rates with respect to the
number of ensemble members. We also introduce a new resampling strategy,
multinomial transformation (MT), which while not as accurate as the ensemble
transport resampler, is substantially less costly for large ensemble sizes, and
can then be used in conjunction with ETAIS for complex problems. We also focus
on how algorithmic parameters regarding the mixture proposal can be quickly
tuned to optimise performance. In particular, we demonstrate this methodology's
superior sampling for multimodal problems, such as those arising from inference
for mixture models, and for problems with expensive likelihoods requiring the
solution of a differential equation, for which speed-ups of orders of magnitude
are demonstrated. Likelihood evaluations of the ensemble could be computed in a
distributed manner, suggesting that this methodology is a good candidate for
parallel Bayesian computations.
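The iteration the abstract describes (ensemble mixture proposal, importance weighting under the posterior, resampling back to even weights) can be sketched as follows. This is a minimal illustration only: it uses a Gaussian mixture proposal with one draw per component and plain multinomial resampling in place of the paper's ensemble transport resampler, and the names `etais_step` and `log_post` are illustrative, not the authors' code.

```python
import numpy as np

def etais_step(particles, log_post, rng, scale=0.5):
    """One iteration of ensemble-based adaptive importance sampling.

    A Gaussian mixture proposal is centred on the current ensemble; a new
    sample is drawn (one draw per mixture component, a crude stand-in for
    stratification), importance-weighted under the posterior, and resampled
    to an evenly weighted ensemble."""
    n, d = particles.shape
    proposals = particles + scale * rng.standard_normal((n, d))
    # Mixture density q(x) = (1/n) sum_j N(x; particle_j, scale^2 I)
    diff = proposals[:, None, :] - particles[None, :, :]
    log_kernels = (-0.5 * np.sum(diff**2, axis=-1) / scale**2
                   - 0.5 * d * np.log(2 * np.pi * scale**2))
    log_q = np.logaddexp.reduce(log_kernels, axis=1) - np.log(n)
    # Importance weights under the (unnormalised) posterior.
    log_w = np.array([log_post(x) for x in proposals]) - log_q
    w = np.exp(log_w - np.max(log_w))
    w /= w.sum()
    # Multinomial resampling: evenly weighted ensemble for the next step.
    idx = rng.choice(n, size=n, p=w)
    return proposals[idx]

rng = np.random.default_rng(0)
log_post = lambda x: -0.5 * np.sum(x**2)   # standard-normal toy target
ens = 3.0 * rng.standard_normal((200, 2))  # deliberately over-dispersed start
for _ in range(50):
    ens = etais_step(ens, log_post, rng)
```

The per-particle likelihood evaluations in the list comprehension are embarrassingly parallel, which is the property the abstract highlights for distributed computation.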
The Physics of the Colloidal Glass Transition
As one increases the concentration of a colloidal suspension, the system
exhibits a dramatic increase in viscosity. Structurally, the system resembles a
liquid, yet motions within the suspension are slow enough that it can be
considered essentially frozen. This kinetic arrest is the colloidal glass
transition. For several decades, colloids have served as a valuable model
system for understanding the glass transition in molecular systems. The spatial
and temporal scales involved allow these systems to be studied by a wide
variety of experimental techniques. The focus of this review is the current
state of understanding of the colloidal glass transition. A brief introduction
is given to important experimental techniques used to study the glass
transition in colloids. We describe features of colloidal systems near and in
glassy states, including tremendous increases in viscosity and relaxation
times, dynamical heterogeneity, and ageing, among others. We also compare and
contrast the glass transition in colloids to that in molecular liquids. Other
glassy systems are briefly discussed, as well as recently developed synthesis
techniques that will keep these systems rich with interesting physics for years
to come.
Comment: 56 pages, 18 figures, Review
Parameter Tuning Using Gaussian Processes
Most machine learning algorithms require us to set up their parameter values before applying these algorithms to solve problems. Appropriate parameter settings will bring good performance while inappropriate parameter settings generally result in poor modelling. Hence, it is necessary to acquire the “best” parameter values for a particular algorithm before building the model. The “best” model not only reflects the “real” function and is well fitted to existing points, but also gives good performance when making predictions for new points with previously unseen values.
A number of methods have been proposed to optimize parameter values. The basic idea of all such methods is a trial-and-error process, whereas the work presented in this thesis employs Gaussian process (GP) regression to optimize the parameter values of a given machine learning algorithm. In this thesis, we consider the optimization of only two-parameter learning algorithms. All the possible parameter values are specified in a 2-dimensional grid. To avoid brute-force search, Gaussian Process Optimization (GPO) makes use of “expected improvement” to pick useful points rather than validating every point of the grid step by step. The point with the highest expected improvement is evaluated using cross-validation and the resulting data point is added to the training set for the Gaussian process model. This process is repeated until a stopping criterion is met. The final model is built using the learning algorithm based on the best parameter values identified in this process.
In order to test the effectiveness of this optimization method on regression and classification problems, we use it to optimize parameters of some well-known machine learning algorithms, such as decision tree learning, support vector machines and boosting with trees. Through the analysis of experimental results obtained on datasets from the UCI repository, we find that the GPO algorithm yields competitive performance compared with a brute-force approach, while exhibiting a distinct advantage in terms of training time and number of cross-validation runs. Overall, GPO is a promising approach to the optimization of parameter values in machine learning.
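The expected-improvement loop described above can be sketched in a few lines. This is a self-contained illustration under stated assumptions: a zero-mean GP with a squared-exponential kernel (targets centred before fitting), a 21×21 parameter grid, and a hypothetical quadratic `cv_error` standing in for a real cross-validation run; none of these names come from the thesis.

```python
import math
import numpy as np

def rbf(A, B, ls):
    """Squared-exponential kernel matrix between point sets A and B."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def gp_posterior(X, y, Xs, ls=0.3, noise=1e-6):
    """GP regression posterior mean and stddev at test points Xs."""
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs, ls)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    sd = np.sqrt(np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None))
    return mu, sd

def expected_improvement(mu, sd, best):
    """EI for minimisation: E[max(best - f, 0)] under the GP posterior."""
    z = (best - mu) / sd
    cdf = np.array([0.5 * (1.0 + math.erf(zi / math.sqrt(2.0))) for zi in z])
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sd * pdf

def cv_error(p):
    # Hypothetical stand-in for a cross-validated error surface
    # over two algorithm parameters (true optimum at (0.3, 0.7)).
    return (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2

grid = np.array([[a, b] for a in np.linspace(0, 1, 21)
                 for b in np.linspace(0, 1, 21)])
X = grid[[0, 220, 440]]                    # small initial design
y = np.array([cv_error(p) for p in X])
for _ in range(20):
    yc = y - y.mean()                      # centre targets for zero-mean GP
    mu, sd = gp_posterior(X, yc, grid)
    ei = expected_improvement(mu, sd, yc.min())
    nxt = grid[int(np.argmax(ei))]         # most promising untried point
    X = np.vstack([X, nxt])
    y = np.append(y, cv_error(nxt))        # "cross-validate" it
best_params = X[int(np.argmin(y))]
```

Only 23 of the 441 grid points are ever evaluated, which is exactly the saving over brute-force search that the abstract reports in cross-validation runs.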
Revealing modified gravity signal in matter and halo hierarchical clustering
We use a set of N-body simulations employing a modified gravity (MG) model
with Vainshtein screening to study matter and halo hierarchical clustering. As
test-case scenarios we consider two normal branch Dvali-Gabadadze-Porrati
(nDGP) gravity models with mild and strong growth rate enhancement. We study
higher-order correlation functions and the associated hierarchical
amplitudes. We find that
the matter PDFs are strongly affected by the fifth force on scales up to
Mpc, and the deviations from GR are maximised. For reduced
cumulants , we find that at small scales Mpc the MG is
characterised by lower values, with the deviation growing from in the
reduced skewness up to even in . To study the halo clustering we
use a simple abundance matching and divide haloes into three fixed number
density samples. The halo two-point functions are weakly affected, with a
relative boost of the order of a few percent appearing only at the smallest
pair separations (Mpc). In contrast, we find a strong MG signal
in 's, which are enhanced compared to GR. The strong model exhibits a
level signal at various scales for all halo samples and in all
cumulants. In this context, we find the reduced kurtosis to be an
especially promising cosmological probe of MG. Even the mild nDGP model leaves
a imprint at small scales Mpc, while the stronger model
deviates from a GR-signature at nearly all scales with a significance of
. Since the signal is persistent in all halo samples and over a range
of scales, we advocate that the reduced kurtosis estimated from galaxy
catalogues can potentially constitute a strong MG-model discriminator as well
as a GR self-consistency test.
Comment: 19 pages, 11 figures, comments are welcome
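The hierarchical amplitudes in question are conventionally the reduced cumulants of the density contrast, S_n = ⟨δⁿ⟩_c / ⟨δ²⟩^(n-1). A minimal estimator for the reduced skewness S3 and kurtosis S4, applied to a lognormal mock field, might look like this; the definitions are the standard ones, but the mock field and every name below are illustrative, not the paper's pipeline.

```python
import numpy as np

def reduced_cumulants(delta):
    """Reduced skewness S3 and kurtosis S4 of a density-contrast sample,
    built from connected moments:
        S3 = <d^3> / <d^2>^2
        S4 = (<d^4> - 3<d^2>^2) / <d^2>^3
    """
    d = delta - delta.mean()
    m2 = np.mean(d ** 2)
    m3 = np.mean(d ** 3)
    k4 = np.mean(d ** 4) - 3.0 * m2 ** 2   # connected fourth moment
    return m3 / m2 ** 2, k4 / m2 ** 3

# Lognormal mock field: a common toy model for the mildly non-linear
# density PDF (S3 -> 3 and S4 -> 16 in the low-variance limit).
rng = np.random.default_rng(1)
sigma = 0.4
g = sigma * rng.standard_normal(200_000)
delta = np.exp(g - 0.5 * sigma ** 2) - 1.0   # <delta> = 0 by construction
S3, S4 = reduced_cumulants(delta)
```

Because S4 involves a fourth moment, its sample noise is much larger than that of the two-point variance, which is one practical caveat to using the reduced kurtosis as the discriminating statistic the abstract advocates.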
Space Warps: I. Crowd-sourcing the Discovery of Gravitational Lenses
We describe Space Warps, a novel gravitational lens discovery service that
yields samples of high purity and completeness through crowd-sourced visual
inspection. Carefully produced colour composite images are displayed to
volunteers via a web-based classification interface, which records their
estimates of the positions of candidate lensed features. Images of simulated
lenses, as well as real images which lack lenses, are inserted into the image
stream at random intervals; this training set is used to give the volunteers
instantaneous feedback on their performance, as well as to calibrate a model of
the system that provides dynamical updates to the probability that a classified
image contains a lens. Low probability systems are retired from the site
periodically, concentrating the sample towards a set of lens candidates. Having
divided 160 square degrees of Canada-France-Hawaii Telescope Legacy Survey
(CFHTLS) imaging into some 430,000 overlapping 82 by 82 arcsecond tiles and
displaying them on the site, we were joined by around 37,000 volunteers who
contributed 11 million image classifications over the course of 8 months. This
Stage 1 search reduced the sample to 3381 images containing candidates; these
were then refined in Stage 2 to yield a sample that we expect to be over 90%
complete and 30% pure, based on our analysis of the volunteers' performance on
training images. We comment on the scalability of the SpaceWarps system to the
wide field survey era, based on our projection that searches of 10 images
could be performed by a crowd of 10 volunteers in 6 days.
Comment: 21 pages, 13 figures, MNRAS accepted, minor to moderate changes in
this version
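The dynamical probability update described above (training images calibrate each volunteer, whose classifications then raise or lower the probability that an image contains a lens) reduces, per classification, to a single Bayes update. The function below is a schematic sketch of that one step, not the actual Space Warps model; the rates and prior are invented for illustration.

```python
def update_lens_probability(p, said_lens, p_true_pos, p_false_pos):
    """One Bayes update of P(image contains a lens) after a classification.

    p_true_pos  = P(volunteer marks "lens" | image really has a lens)
    p_false_pos = P(volunteer marks "lens" | image has no lens)
    Both rates would be estimated from the volunteer's record on the
    training images inserted into the image stream."""
    like_lens = p_true_pos if said_lens else 1.0 - p_true_pos
    like_dud = p_false_pos if said_lens else 1.0 - p_false_pos
    num = like_lens * p
    return num / (num + like_dud * (1.0 - p))

# A skilled volunteer (90% hit rate, 10% false-positive rate) marking an
# image as a lens moves a 50/50 prior to 0.9; "no lens" moves it to 0.1.
p = update_lens_probability(0.5, True, 0.9, 0.1)    # -> 0.9
```

Iterating this update over many classifications is what drives low-probability images below the retirement threshold, concentrating the stream on candidate lenses.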
Accelerated Parameter Estimation with DALE
We consider methods for improving the estimation of constraints on a
high-dimensional parameter space with a computationally expensive likelihood
function. In such cases Markov chain Monte Carlo (MCMC) can take a long time to
converge and concentrates on finding the maxima rather than the often-desired
confidence contours for accurate error estimation. We employ DALE (Direct
Analysis of Limits via the Exterior of χ²) for determining confidence
contours by minimizing a cost function parametrized to incentivize points in
parameter space which are both on the confidence limit and far from previously
sampled points. We compare DALE to the nested sampling algorithm
implemented in MultiNest on a toy likelihood function that is highly
non-Gaussian and non-linear in the mapping between parameter values and
χ². We find that in high-dimensional cases DALE finds the same
confidence limit as MultiNest using roughly an order of magnitude fewer
evaluations of the likelihood function. DALE is open-source and available
at https://github.com/danielsf/Dalex.git
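The cost-function idea (reward points that lie on the confidence limit while staying far from previously sampled points) can be illustrated on a toy 2-D problem. Both the functional form of `contour_cost` and the batch random-candidate minimiser below are assumptions made for illustration; they are not Dalex's actual implementation.

```python
import numpy as np

def contour_cost(x, chi2, chi2_lim, samples, alpha=1.0):
    """Illustrative cost in the spirit of the abstract: small when x sits
    on the target chi^2 level surface, with a bonus (negative term) for
    being far from previously sampled points."""
    miss = (chi2(x) - chi2_lim) ** 2
    dist = np.min(np.linalg.norm(samples - x, axis=1)) if len(samples) else 0.0
    return miss - alpha * dist

# Toy problem: trace the chi^2 = 5.99 (~95%) contour of a 2-D unit
# Gaussian by repeatedly keeping the best of a batch of random candidates.
chi2 = lambda x: float(x @ x)
rng = np.random.default_rng(2)
pts = np.empty((0, 2))
for _ in range(200):
    cands = rng.uniform(-4.0, 4.0, size=(256, 2))
    costs = [contour_cost(c, chi2, 5.99, pts) for c in cands]
    pts = np.vstack([pts, cands[int(np.argmin(costs))]])
radii = np.linalg.norm(pts, axis=1)   # should cluster near sqrt(5.99)
```

The distance bonus spreads successive points around the level surface instead of piling them up at one spot, which is how a scheme like this can map a confidence contour with far fewer likelihood evaluations than sampling the full interior.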