Partition Functions from Rao-Blackwellized Tempered Sampling
Partition functions of probability distributions are important quantities for model evaluation and comparison. We present a new method to compute partition functions of complex and multimodal distributions. Such distributions are often sampled using simulated tempering, which augments the target space with an auxiliary inverse temperature variable. Our method exploits the multinomial probability law of the inverse temperatures, and provides estimates of the partition function in terms of a simple quotient of Rao-Blackwellized marginal inverse temperature probability estimates, which are updated while sampling. We show that the method has interesting connections with several alternative popular methods, and offers some significant advantages. In particular, we empirically find that the new method provides more accurate estimates than Annealed Importance Sampling when calculating partition functions of large Restricted Boltzmann Machines (RBMs); moreover, the method is sufficiently accurate to track training and validation log-likelihoods during learning of RBMs, at minimal computational cost.
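The quotient estimator described in this abstract can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: the tempered family f_k(x) = exp(-beta_k x^2 / 2), the uniform prior over temperatures, the initial partition function guesses fixed at 1 (the abstract describes updating the estimates while sampling, which is omitted here), and the function name `rts_log_ratios` are all hypothetical choices made for illustration. Under these assumptions the marginal probability of temperature k is proportional to Z_k, so the ratio Z_k / Z_0 is estimated by the quotient of the averaged conditional probabilities q(k | x).

```python
import math
import random

def rts_log_ratios(betas, n_iter=20000, seed=0):
    """Estimate log(Z_k / Z_0) for f_k(x) = exp(-beta_k * x^2 / 2).

    Assumptions (illustrative only): uniform prior over temperatures and
    initial partition-function guesses equal to 1, never refined.
    """
    rng = random.Random(seed)
    K = len(betas)
    c = [0.0] * K  # Rao-Blackwellized estimates of the marginals q(k)
    k = 0          # current temperature index
    for _ in range(n_iter):
        # Sample x given the current temperature (exact for this toy Gaussian).
        x = rng.gauss(0.0, 1.0 / math.sqrt(betas[k]))
        # Conditional law of the temperature index given x: q(k | x) ∝ f_k(x).
        w = [math.exp(-0.5 * b * x * x) for b in betas]
        total = sum(w)
        q = [wi / total for wi in w]
        # Average the full conditional instead of counting visits to each k.
        for j in range(K):
            c[j] += q[j] / n_iter
        # Sample the next temperature index from q(k | x).
        k = rng.choices(range(K), weights=q)[0]
    return [math.log(c[j] / c[0]) for j in range(K)]
```

For this toy family the exact answer is log(Z_k / Z_0) = 0.5 * log(beta_0 / beta_k), which gives a direct check on the estimate.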
Generative Modeling and Inference in Directed and Undirected Neural Networks
Generative modeling and inference are two broad categories in unsupervised learning whose goal is to answer the following questions, respectively: 1. Given a dataset, how do we (either implicitly or explicitly) model the underlying probability distribution from which the data came and draw samples from that distribution? 2. How can we learn an underlying abstract representation of the data? In this dissertation we provide three studies that each, in a different way, improve upon specific generative modeling and inference techniques. First, we develop a state-of-the-art estimator of a generic probability distribution's partition function, or normalizing constant, during simulated tempering. We then apply our estimator to the specific case of training undirected probabilistic graphical models and find our method able to track log-likelihoods during training at essentially no extra computational cost. We then shift our focus to variational inference in directed probabilistic graphical models (Bayesian networks) for generative modeling and inference. In the second study, we generalize the aggregate prior distribution to decouple the variational and generative models, which gives the model greater flexibility, and find improvements in the model's log-likelihood of test data as well as a better latent representation. Finally, we study the variational loss function and argue that, under a typical architecture, the data-dependent term of the gradient decays to zero as the latent space dimensionality increases. We use this result to propose a simple modification to random weight initialization and show that in certain models the modification gives rise to substantial improvement in training convergence time. Together, these results improve the quantitative performance of popular generative modeling and inference models in addition to furthering our understanding of them.
Integrative species delimitation and taxonomic status of the scorpion genus Vaejovis Koch, 1836 (Vaejovidae) in the Santa Catalina Mountains, Arizona
Scorpions belonging to the Vaejovis vorhiesi species complex are widely distributed throughout the southwestern United States and northern Mexico. Most species are endemic to single mountain ranges, but two species, Vaejovis deboerae Ayrey, 2009 and V. brysoni Ayrey & Webber, 2013, have been documented from the Santa Catalina Mountains in Arizona. We reevaluated the taxonomic diversity of these scorpions by integrating data from several different sources. Phylogenetic analyses indicate that scorpions in the Santa Catalina Mountains are monophyletic but comprise two divergent mitochondrial lineages that overlap at the type locality of V. deboerae. We failed to detect congruence between these lineages and the remaining datasets, which suggests that there is a single species that we refer to as V. deboerae (=V. brysoni syn. nov.). Our inability to gather molecular data from the female holotype of V. deboerae could be the basis for future nomenclatural volatility if later studies find that the mitochondrial lineages are validated by other forms of data (e.g., male morphology). Results from this study underscore the importance of integrative methods for delimiting species in morphologically cryptic groups. Furthermore, we recommend generating DNA barcodes for holotypes as part of the description process to reduce future nomenclatural quagmires.
Autonomous Probabilistic Coprocessing with Petaflips per Second
In this paper we present a concrete design for a probabilistic (p-) computer
based on a network of p-bits, robust classical entities fluctuating between -1
and +1, with probabilities that are controlled through an input constructed
from the outputs of other p-bits. The architecture of this probabilistic
computer is similar to a stochastic neural network with the p-bit playing the
role of a binary stochastic neuron, but with one key difference: there is no
sequencer used to enforce an ordering of p-bit updates, as is typically
required. Instead, we explore sequencerless designs where all p-bits
are allowed to flip autonomously and demonstrate that such designs can allow
ultrafast operation unconstrained by available clock speeds without
compromising the solution's fidelity. Based on experimental results from a
hardware benchmark of the autonomous design and benchmarked device models, we
project that a nanomagnetic implementation can scale to achieve petaflips per
second with millions of neurons. A key contribution of this paper is the focus
on a hardware metric, flips per second, as a problem- and
substrate-independent figure-of-merit for an emerging class of hardware
annealers known as Ising Machines. Much like the shrinking feature sizes of
transistors that have continually driven Moore's Law, we believe that flips per
second can be continually improved in later technology generations of a wide
class of probabilistic, domain-specific hardware.
Comment: 13 pages, 8 figures, 1 tabl
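The p-bit behaviour summarized in this abstract can be mimicked in software. The sketch below is illustrative only and rests on assumptions not stated in the abstract: the tanh activation, the linear input built from couplings `J` and biases `h`, the inverse temperature `beta`, and the function names are all hypothetical. Picking the next p-bit uniformly at random is used here as a software stand-in for the absence of a sequencer; it does not model real hardware timing.

```python
import math
import random

def pbit_update(i, m, J, h, beta, rng):
    """Update p-bit i in place from the current outputs of the other p-bits."""
    # Input constructed from the outputs of the other p-bits (assumed linear).
    inp = beta * (h[i] + sum(J[i][j] * m[j] for j in range(len(m)) if j != i))
    # Assumed activation: output is +1 with probability (1 + tanh(inp)) / 2.
    m[i] = 1 if rng.uniform(-1.0, 1.0) < math.tanh(inp) else -1

def aligned_fraction(J, h, beta=1.0, n_flips=20000, seed=0):
    """Run unordered updates and return how often p-bits 0 and 1 agree."""
    rng = random.Random(seed)
    n = len(h)
    m = [rng.choice((-1, 1)) for _ in range(n)]
    aligned = 0
    for _ in range(n_flips):
        # No fixed update order: the next p-bit to attempt a flip is random.
        i = rng.randrange(n)
        pbit_update(i, m, J, h, beta, rng)
        aligned += (m[0] == m[1])
    return aligned / n_flips
```

For two p-bits with a positive coupling, for example `aligned_fraction([[0, 1], [1, 0]], [0, 0], beta=2.0)`, the outputs should agree most of the time, since under these assumptions the update is a Gibbs step for the corresponding Ising energy.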