Efficient Methods for Unsupervised Learning of Probabilistic Models
In this thesis I develop a variety of techniques to train, evaluate, and
sample from intractable and high dimensional probabilistic models. Abstract
exceeds arXiv space limitations -- see PDF
Mixing times of lozenge tiling and card shuffling Markov chains
We show how to combine Fourier analysis with coupling arguments to bound the
mixing times of a variety of Markov chains. The mixing time is the number of
steps a Markov chain takes to approach its equilibrium distribution. One
application is to a class of Markov chains introduced by Luby, Randall, and
Sinclair to generate random tilings of regions by lozenges. For an L X L region
we bound the mixing time by O(L^4 log L), which improves on the previous bound
of O(L^7), and we show the new bound to be essentially tight. In another
application we resolve a few questions raised by Diaconis and Saloff-Coste, by
lower bounding the mixing time of various card-shuffling Markov chains. Our
lower bounds are within a constant factor of their upper bounds. When we use
our methods to modify a path-coupling analysis of Bubley and Dyer, we obtain an
O(n^3 log n) upper bound on the mixing time of the Karzanov-Khachiyan Markov
chain for linear extensions. Comment: 39 pages, 8 figures
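As a small illustration of the mixing-time definition used in the abstract above, the following sketch computes the exact total variation distance to uniform for a lazy adjacent-transposition shuffle on a tiny deck. The choice of chain, the threshold of 1/4, and all function names are assumptions made for this example. It is a brute-force enumeration, not the Fourier or coupling analysis of the paper, and it is only feasible for very small n.

```python
from itertools import permutations


def step_distribution(dist, n):
    """Push a distribution over permutations through one shuffle step.

    One step: pick an adjacent pair of positions uniformly at random,
    then swap the two cards with probability 1/2 (otherwise do nothing).
    """
    new = {perm: 0.0 for perm in dist}
    for perm, p in dist.items():
        if p == 0.0:
            continue
        weight = p / (n - 1)  # probability of choosing each adjacent pair
        for i in range(n - 1):
            new[perm] += weight / 2  # lazy move: keep the deck as it is
            swapped = list(perm)
            swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
            new[tuple(swapped)] += weight / 2
    return new


def tv_to_uniform(dist):
    """Total variation distance between dist and the uniform distribution."""
    uniform = 1.0 / len(dist)
    return 0.5 * sum(abs(p - uniform) for p in dist.values())


def mixing_time(n, eps=0.25, max_steps=10_000):
    """First t at which the TV distance to uniform drops to eps or below.

    The chain starts from the sorted deck. Because this is a random walk
    on a group, the distance is the same from every starting deck.
    """
    dist = {perm: 0.0 for perm in permutations(range(n))}
    dist[tuple(range(n))] = 1.0
    for t in range(max_steps + 1):
        if tv_to_uniform(dist) <= eps:
            return t
        dist = step_distribution(dist, n)
    raise RuntimeError("chain did not mix within max_steps")


for n in range(2, 6):
    print(n, mixing_time(n))
```

The state space has n! elements, so exact enumeration like this stops being practical almost immediately, which is why analytic bounds of the kind described in the abstract are needed.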
Separating Gravitational Wave Signals from Instrument Artifacts
Central to the gravitational wave detection problem is the challenge of
separating features in the data produced by astrophysical sources from features
produced by the detector. Matched filtering provides an optimal solution for
Gaussian noise, but in practice, transient noise excursions or ``glitches''
complicate the analysis. Detector diagnostics and coincidence tests can be used
to veto many glitches which may otherwise be misinterpreted as gravitational
wave signals. The glitches that remain can lead to long tails in the matched
filter search statistics and drive up the detection threshold. Here we describe
a Bayesian approach that incorporates a more realistic model for the instrument
noise allowing for fluctuating noise levels that vary independently across
frequency bands, and deterministic ``glitch fitting'' using wavelets as
``glitch templates'', the number of which is determined by a trans-dimensional
Markov chain Monte Carlo algorithm. We demonstrate the method's effectiveness
on simulated data containing low amplitude gravitational wave signals from
inspiraling binary black hole systems, and simulated non-stationary and
non-Gaussian noise comprised of a Gaussian component with the standard
LIGO/Virgo spectrum, and injected glitches of various amplitude, prevalence,
and variety. Glitch fitting allows us to detect significantly weaker signals
than standard techniques. Comment: 21 pages, 18 figures
Randomised algorithms for counting and generating combinatorial structures
SIGLE. Available from British Library Document Supply Centre, DSC:D85048 / BLDSC - British Library Document Supply Centre. GB. United Kingdom
Exploring Probability Measures with Markov Processes
In many domains where mathematical modelling is applied, a deterministic description of the system at hand is insufficient, and so it is useful to model systems as being in some way stochastic. This is often achieved by modelling the state of the system as being drawn from a probability measure, which is usually given algebraically, i.e. as a formula. While this representation can be useful for deriving certain characteristics of the system, it is by now well appreciated that many questions about stochastic systems are best answered by looking at samples from the associated probability measure. In this thesis, we seek to develop and analyse efficient techniques for generating samples from a given probability measure, with a focus on algorithms which simulate a Markov process with the desired invariant measure.
The first work presented in this thesis considers the use of Piecewise-Deterministic Markov Processes (PDMPs) for generating samples. In contrast to usual approaches, PDMPs i) are defined as continuous-time processes, and ii) are typically non-reversible with respect to their invariant measure. These distinctions pose computational and theoretical challenges for the design, analysis, and implementation of PDMP-based samplers. The key contribution of this work is to develop a transparent characterisation of how one can construct a PDMP (within the class of trajectorially-reversible processes) which admits the desired invariant measure, and to offer actionable recommendations on how these processes should be designed in practice.
The second work presented in this thesis considers the task of sampling from a probability measure on a discrete space. While work in recent years has made it possible to apply sampling algorithms to probability measures with differentiable densities on continuous spaces in a reasonably generic way, samplers on discrete spaces are still largely derived on a case-by-case basis. The contention of this work is that this is not necessary, and that one can in fact define quite generally-applicable algorithms which can sample efficiently from discrete probability measures. The contributions are then to propose a small collection of algorithms for this task, and verify their efficiency empirically. Building on the previous chapter’s work, our samplers are again defined in continuous time and non-reversible, each of which offers noticeable benefits in efficiency.
The third work presented in this thesis concerns a theoretical study of a particular class of Markov Chain-based sampling algorithms which make use of parallel computing resources. The Markov Chains which are produced by this algorithm are mathematically equivalent to a standard Metropolis-Hastings chain, but their real-time convergence properties are affected nontrivially by the application of parallelism. The contribution of this work is to analyse the convergence behaviour of these chains, and to use the ‘optimal scaling’ framework (as developed by Roberts, Rosenthal, and others) to make recommendations concerning the tuning of such algorithms in practice.
The introductory chapters provide a general overview of the task of generating samples from a probability measure, with particular focus on methods involving Markov processes. There is also an interlude on the relative benefits of i) continuous-time and ii) non-reversible Markov processes for sampling, which is intended to provide additional context for the reading of the first two works. PhD Studentship paid for by Cantab Capital Institute for the Mathematics of Information
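For readers unfamiliar with the 'optimal scaling' idea mentioned in the abstract above, here is a minimal sketch of a plain random-walk Metropolis sampler on a standard Gaussian target. The step size 2.38/sqrt(d) is the classical rule of thumb for this idealised setting, where the acceptance rate approaches roughly 0.234 as the dimension grows. The target, the function names, and the parameter values are assumptions for illustration. This is the standard serial algorithm, not the parallel scheme analysed in the thesis.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of a standard Gaussian in len(x) dimensions."""
    return -0.5 * sum(xi * xi for xi in x)


def random_walk_metropolis(dim, n_steps, step_size, seed=0):
    """Run random-walk Metropolis and return (samples, acceptance_rate)."""
    rng = random.Random(seed)
    x = [0.0] * dim
    log_p = log_target(x)
    accepted = 0
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = [xi + step_size * rng.gauss(0.0, 1.0) for xi in x]
        log_p_new = log_target(proposal)
        # Metropolis accept/reject; min(0, .) keeps exp() from overflowing.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_steps


dim = 10
for step in (0.01, 2.38 / math.sqrt(dim), 10.0):
    _, rate = random_walk_metropolis(dim, 5000, step)
    print(f"step={step:.3f} acceptance={rate:.3f}")
```

Very small steps are accepted almost always but explore slowly, and very large steps are almost always rejected. Tuning the step size trades these two effects off against each other.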
Multilevel Hierarchical Decomposition of Finite Element White Noise with Application to Multilevel Markov Chain Monte Carlo
In this work we develop a new hierarchical multilevel approach to generate
Gaussian random field realizations in an algorithmically scalable manner that
is well-suited to incorporate into multilevel Markov chain Monte Carlo (MCMC)
algorithms. This approach builds off of other partial differential equation
(PDE) approaches for generating Gaussian random field realizations; in
particular, a single field realization may be formed by solving a
reaction-diffusion PDE with a spatial white noise source function as the
right-hand side. While these approaches have been explored to accelerate forward
uncertainty quantification tasks, e.g. multilevel Monte Carlo, the previous
constructions are not directly applicable to multilevel MCMC frameworks which
build fine scale random fields in a hierarchical fashion from coarse scale
random fields. Our new hierarchical multilevel method relies on a hierarchical
decomposition of the white noise source function, which allows us to
form Gaussian random field realizations across multiple levels of
discretization in a way that fits into multilevel MCMC algorithmic frameworks.
After presenting our main theoretical results and numerical scaling results to
showcase the utility of this new hierarchical PDE method for generating
Gaussian random field realizations, this method is tested on a four-level MCMC
algorithm to explore its feasibility.
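The idea in the abstract above, building a fine-scale field from a coarse-scale one by decomposing the white noise source, can be sketched in a much simpler one-dimensional setting. The sketch below assumes piecewise-constant (cell-averaged) white noise, a finite-difference discretisation of (kappa^2 - d^2/dx^2) u = W on the unit interval with crude homogeneous Dirichlet boundaries, and hypothetical function names. It is not the finite element construction of the paper. Each coarse cell is split in two, and the fine noise is the coarse value plus or minus an independent Gaussian complement, so that the fine cells average back to the coarse value and keep the correct white noise variance.

```python
import random


def sample_white_noise(n_cells, h, rng):
    """Cell-averaged white noise: independent N(0, 1/h) values per cell."""
    sd = (1.0 / h) ** 0.5
    return [rng.gauss(0.0, sd) for _ in range(n_cells)]


def refine_white_noise(coarse, h_coarse, rng):
    """Split each coarse cell in two, adding an independent complement.

    With d ~ N(0, 1/h_coarse), the values w + d and w - d are independent,
    each has variance 2/h_coarse = 1/h_fine, and they average back to w.
    """
    sd = (1.0 / h_coarse) ** 0.5
    fine = []
    for w in coarse:
        d = rng.gauss(0.0, sd)
        fine.extend([w + d, w - d])
    return fine


def solve_reaction_diffusion(rhs, h, kappa):
    """Solve (kappa^2 - d^2/dx^2) u = rhs with the Thomas algorithm."""
    n = len(rhs)
    diag = kappa ** 2 + 2.0 / h ** 2
    off = -1.0 / h ** 2
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


rng = random.Random(0)
n_coarse, kappa = 8, 5.0
h_coarse = 1.0 / n_coarse
w_coarse = sample_white_noise(n_coarse, h_coarse, rng)
w_fine = refine_white_noise(w_coarse, h_coarse, rng)
u_coarse = solve_reaction_diffusion(w_coarse, h_coarse, kappa)
u_fine = solve_reaction_diffusion(w_fine, h_coarse / 2.0, kappa)
print(len(u_coarse), len(u_fine))
```

Because the fine noise is conditioned on the coarse noise, the coarse and fine field realisations are coupled, which is the property a multilevel MCMC hierarchy needs.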
Methods for Reconstructing Networks with Incomplete Information.
Network representations of complex systems are widespread and reconstructing unknown networks from data has been intensively researched in statistical and scientific communities more broadly. Two challenges in network reconstruction problems include having insufficient data to illuminate the full structure of the network and needing to combine information from different data sources. Addressing these challenges, this thesis contributes methodology for network reconstruction in three respects.
First, we consider sequentially choosing interventions to discover structure in directed networks focusing on learning a partial order over the nodes. This focus leads to a new model for intervention data under which nodal variables depend on the lengths of paths separating them from intervention targets rather than on parent sets. Taking a Bayesian approach, we present partial-order based priors and develop a novel Markov-Chain Monte Carlo (MCMC) method for computing posterior expectations over directed acyclic graphs. The utility of the MCMC approach comes from designing new proposals for the Metropolis algorithm that move locally among partial orders while independently sampling graphs from each partial order. The resulting Markov Chains mix rapidly and are ergodic. We also adapt an existing strategy for active structure learning, develop an efficient Monte Carlo procedure for estimating the resulting decision function, and evaluate the proposed methods numerically using simulations and benchmark datasets.
We next study penalized likelihood methods using incomplete order information as arising from intervention data. To make the notion of incomplete information precise, we introduce and formally define incomplete partial orders, which subsume the important special case of a known total ordering of the nodes. This special case lies along an information lattice and we study the reconstruction performance of penalized likelihood methods at different points along this lattice.
Finally, we present a method for ranking a network's potential edges using time-course data. The novelty is our development of a nonparametric gradient-matching procedure and a related summary statistic for measuring the strength of relationships among components in dynamic systems. Simulation studies demonstrate that, given sufficient signal, using this procedure to move from linear to additive approximations leads to improved rankings of potential edges. PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113316/1/jbhender_1.pd