Partition Functions from Rao-Blackwellized Tempered Sampling

Author: Ari Pakman
David E Carlson
Liam Paninski
Patrick Stinson
Publication date: 11/04/2020

Abstract: Partition functions of probability distributions are important quantities for model evaluation and comparisons. We present a new method to compute partition functions of complex and multimodal distributions. Such distributions are often sampled using simulated tempering, which augments the target space with an auxiliary inverse temperature variable. Our method exploits the multinomial probability law of the inverse temperatures, and provides estimates of the partition function in terms of a simple quotient of Rao-Blackwellized marginal inverse temperature probability estimates, which are updated while sampling. We show that the method has interesting connections with several alternative popular methods, and offers some significant advantages. In particular, we empirically find that the new method provides more accurate estimates than Annealed Importance Sampling when calculating partition functions of large Restricted Boltzmann Machines (RBM); moreover, the method is sufficiently accurate to track training and validation log-likelihoods during learning of RBMs, at minimal computational cost.

CiteSeerX

Generative Modeling and Inference in Directed and Undirected Neural Networks

Author: Stinson Patrick
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020

Generative modeling and inference are two broad categories in unsupervised learning whose goal is to answer the following questions, respectively: 1. Given a dataset, how do we (either implicitly or explicitly) model the underlying probability distribution from which the data came and draw samples from that distribution? 2. How can we learn an underlying abstract representation of the data? In this dissertation we provide three studies that each, in a different way, improve upon specific generative modeling and inference techniques. First, we develop a state-of-the-art estimator of a generic probability distribution's partition function, or normalizing constant, during simulated tempering. We then apply our estimator to the specific case of training undirected probabilistic graphical models and find our method able to track log-likelihoods during training at essentially no extra computational cost. We then shift our focus to variational inference in directed probabilistic graphical models (Bayesian networks) for generative modeling and inference. First, we generalize the aggregate prior distribution to decouple the variational and generative models, which provides the model with greater flexibility, and we find improvements in the model's log-likelihood of test data as well as a better latent representation. Finally, we study the variational loss function and argue that, under a typical architecture, the data-dependent term of the gradient decays to zero as the latent space dimensionality increases. We use this result to propose a simple modification to random weight initialization and show that in certain models the modification gives rise to substantial improvement in training convergence time. Together, these results improve quantitative performance of popular generative modeling and inference models in addition to furthering our understanding of them.

Columbia University Academic Commons

Integrative species delimitation and taxonomic status of the scorpion genus Vaejovis Koch, 1836 (Vaejovidae) in the Santa Catalina Mountains, Arizona

Author: Broussard Lillian-Lee M.
Hendrixson Brent E.
Jochim Emma E.
Publication venue: Marshall Digital Scholar
Publication date: 12/08/2020

Scorpions belonging to the Vaejovis vorhiesi species complex are widely distributed throughout the southwestern United States and northern Mexico. Most species are endemic to single mountain ranges but two species, Vaejovis deboerae Ayrey, 2009 and V. brysoni Ayrey & Webber, 2013, have been documented from the Santa Catalina Mountains in Arizona. We reevaluated the taxonomic diversity of these scorpions by integrating data from several different sources. Phylogenetic analyses indicate that scorpions in the Santa Catalina Mountains are monophyletic but comprise two divergent mitochondrial lineages that overlap at the type locality of V. deboerae. We failed to detect congruence between these lineages and the remaining datasets, which suggests that there is a single species that we refer to as V. deboerae (=V. brysoni syn. nov.). Our inability to gather molecular data from the female holotype of V. deboerae could be the basis for future nomenclatural volatility if future studies find that the mitochondrial lineages are validated by other forms of data (e.g., male morphology). Results from this study underscore the importance of integrative methods for delimiting species in morphologically cryptic groups. Furthermore, we recommend generating DNA barcodes for holotypes as part of the description process to reduce future nomenclatural quagmires.

Marshall University

Autonomous Probabilistic Coprocessing with Petaflips per Second

Author: Camsari Kerem Y.
Datta Supriyo
Faria Rafatul
Ghantasala Lakshmi A.
Jaiswal Risi
Sutton Brian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

In this paper we present a concrete design for a probabilistic (p-) computer based on a network of p-bits, robust classical entities fluctuating between -1 and +1, with probabilities that are controlled through an input constructed from the outputs of other p-bits. The architecture of this probabilistic computer is similar to a stochastic neural network with the p-bit playing the role of a binary stochastic neuron, but with one key difference: there is no sequencer used to enforce an ordering of p-bit updates, as is typically required. Instead, we explore sequencerless designs where all p-bits are allowed to flip autonomously and demonstrate that such designs can allow ultrafast operation unconstrained by available clock speeds without compromising the solution's fidelity. Based on experimental results from a hardware benchmark of the autonomous design and benchmarked device models, we project that a nanomagnetic implementation can scale to achieve petaflips per second with millions of neurons. A key contribution of this paper is the focus on a hardware metric, flips per second, as a problem- and substrate-independent figure-of-merit for an emerging class of hardware annealers known as Ising Machines. Much like the shrinking feature sizes of transistors that have continually driven Moore's Law, we believe that flips per second can be continually improved in later technology generations of a wide class of probabilistic, domain-specific hardware. Comment: 13 pages, 8 figures, 1 table

arXiv.org e-Print Archive

eScholarship - University of California