11,555 research outputs found
Applications of Neural Networks in Hadron Physics
The Bayesian approach to feed-forward neural networks is reviewed. Its
potential for usage in hadron physics is discussed. As an example of the
application, the study of the two-photon exchange effect is presented. We
focus on the model comparison, the estimation of the systematic uncertainties
due to the choice of the model, and the over-fitting. As an illustration the
predictions of the cross-section ratio are given, together with the estimate of the uncertainty due to the parametrization choice.
Comment: 16 pages, 9 figures. Invited contribution to the Journal of Physics G: Nuclear and Particle Physics focus section entitled "Enhancing the interaction between nuclear experiment and theory through information and statistics", in press.
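The parametrization-uncertainty idea can be illustrated with a minimal sketch (an assumption of this note, not the paper's code): fit several competing models to synthetic data and take the spread of their predictions as a crude systematic-uncertainty estimate. Simple polynomials stand in for the neural-network parametrizations.

```python
import numpy as np

# Toy sketch (not the paper's method): estimate the systematic uncertainty
# from the choice of parametrization by fitting several competing models
# and comparing their predictions at a common point.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = 1.0 + 0.5 * x + rng.normal(0.0, 0.05, x.size)  # synthetic "ratio" data

preds = []
for deg in (1, 2, 3):                    # competing parametrizations
    coef = np.polyfit(x, y, deg)
    preds.append(np.polyval(coef, 0.5))  # prediction at x = 0.5
spread = max(preds) - min(preds)         # crude parametrization uncertainty
```

The spread across model choices plays the role of the systematic uncertainty that the Bayesian treatment quantifies more rigorously via model evidence.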
Proximity Variational Inference
Variational inference is a powerful approach for approximate posterior
inference. However, it is sensitive to initialization and can be subject to
poor local optima. In this paper, we develop proximity variational inference
(PVI). PVI is a new method for optimizing the variational objective that
constrains subsequent iterates of the variational parameters to robustify the
optimization path. Consequently, PVI is less sensitive to initialization and
optimization quirks and finds better local optima. We demonstrate our method on
three proximity statistics. We study PVI on a Bernoulli factor model and
sigmoid belief network with both real and synthetic data and compare to
deterministic annealing (Katahira et al., 2008). We highlight the flexibility
of PVI by designing a proximity statistic for Bayesian deep learning models
such as the variational autoencoder (Kingma and Welling, 2014; Rezende et al.,
2014). Empirically, we show that PVI consistently finds better local optima and
gives better predictive performance.
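The core idea can be sketched on a toy problem (a hypothetical squared-distance proximity statistic against an exponential moving average of past iterates; not the paper's implementation): the penalty keeps consecutive variational parameters close, yet the iterates still reach the optimum.

```python
import numpy as np

# Toy PVI-style update. Target posterior: N(3, 1); variational family
# q = N(mu, 1), so the gradient of the variational objective w.r.t. mu
# is simply (3 - mu). The proximity term pulls mu toward a moving
# average of its own history, robustifying the optimization path.

def pvi_step(mu, mu_ema, lr=0.1, k=0.5, decay=0.9):
    elbo_grad = 3.0 - mu              # gradient of the variational objective
    prox_grad = -k * (mu - mu_ema)    # proximity penalty gradient
    mu = mu + lr * (elbo_grad + prox_grad)
    mu_ema = decay * mu_ema + (1 - decay) * mu
    return mu, mu_ema

mu, mu_ema = -5.0, -5.0               # deliberately poor initialization
for _ in range(500):
    mu, mu_ema = pvi_step(mu, mu_ema)
```

At the fixed point the moving average equals the iterate, so the proximity gradient vanishes and the optimum of the unconstrained objective is recovered.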
Single-trial estimation of stimulus and spike-history effects on time-varying ensemble spiking activity of multiple neurons: a simulation study
Neurons in cortical circuits exhibit coordinated spiking activity, and can
produce correlated synchronous spikes during behavior and cognition. We
recently developed a method for estimating the dynamics of correlated ensemble
activity by combining a model of simultaneous neuronal interactions (e.g., a
spin-glass model) with a state-space method (Shimazaki et al. 2012 PLoS Comput
Biol 8 e1002385). This method allows us to estimate stimulus-evoked dynamics of
neuronal interactions that are reproducible in repeated trials under identical
experimental conditions. However, the method may not be suitable for detecting
stimulus responses if the neuronal dynamics exhibits significant variability
across trials. In addition, the previous model does not include effects of past
spiking activity of the neurons on the current state of ensemble activity. In
this study, we develop a parametric method for simultaneously estimating the
stimulus and spike-history effects on the ensemble activity from single-trial
data even if the neurons exhibit dynamics that is largely unrelated to these
effects. For this goal, we model ensemble neuronal activity as a latent process
and include the stimulus and spike-history effects as exogenous inputs to the
latent process. We develop an expectation-maximization algorithm that
simultaneously achieves estimation of the latent process, stimulus responses,
and spike-history effects. The proposed method is useful for analyzing the interaction between internal cortical states and sensory-evoked activity.
Comment: 12 pages, 3 figures.
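The generative structure described above can be sketched as follows (illustrative coefficients, a single neuron, and a crude observation model; not the authors' exact model): a latent process receives stimulus and spike-history terms as exogenous inputs, and spikes are drawn from the resulting firing probability.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 300
stim = (np.arange(T) >= 100) & (np.arange(T) < 150)  # stimulus pulse
theta = np.zeros(T)                                  # latent state
spikes = np.zeros(T, dtype=int)
rho, a, b_hist, sig = 0.8, 1.0, -0.8, 0.2            # illustrative coefficients
for t in range(1, T):
    # latent state driven by the stimulus and the neuron's previous spike
    theta[t] = (rho * theta[t - 1] + a * stim[t]
                + b_hist * spikes[t - 1] + rng.normal(0, sig))
    p = 1 / (1 + np.exp(-(theta[t] - 2.0)))          # sparse baseline firing
    spikes[t] = rng.random() < p
```

In the paper's setting an expectation-maximization algorithm would invert this generative process, jointly estimating the latent trajectory and the input coefficients from the observed spikes.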
Mix-nets: Factored Mixtures of Gaussians in Bayesian Networks With Mixed Continuous And Discrete Variables
Recently developed techniques have made it possible to quickly learn accurate
probability density functions from data in low-dimensional continuous space. In
particular, mixtures of Gaussians can be fitted to data very quickly using an
accelerated EM algorithm that employs multiresolution kd-trees (Moore, 1999).
In this paper, we propose a kind of Bayesian network in which low-dimensional
mixtures of Gaussians over different subsets of the domain's variables are
combined into a coherent joint probability model over the entire domain. The
network is also capable of modeling complex dependencies between discrete
variables and continuous variables without requiring discretization of the
continuous variables. We present efficient heuristic algorithms for
automatically learning these networks from data, and perform comparative experiments illustrating how well these networks model real scientific data and synthetic data. We also briefly discuss some possible improvements to the networks, as well as possible applications.
Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI 2000).
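The mixture-fitting building block can be sketched with plain EM in one dimension (the kd-tree acceleration from the paper is not reproduced here):

```python
import numpy as np

# Plain EM for a two-component 1-D Gaussian mixture on synthetic data.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 1, 300)])

mu = np.array([-1.0, 6.0])            # rough initial means
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])
for _ in range(50):
    # E-step: responsibilities of each component for each point
    dens = pi * np.exp(-(x[:, None] - mu)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    r = dens / dens.sum(1, keepdims=True)
    # M-step: weighted maximum-likelihood updates
    n = r.sum(0)
    mu = (r * x[:, None]).sum(0) / n
    var = (r * (x[:, None] - mu)**2).sum(0) / n
    pi = n / len(x)
```

The mix-net idea then combines many such low-dimensional mixtures over variable subsets into one coherent joint model.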
Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs
Building on the success of deep learning, two modern approaches to learn a
probability model from the data are Generative Adversarial Networks (GANs) and
Variational AutoEncoders (VAEs). VAEs consider an explicit probability model
for the data and compute a generative distribution by maximizing a variational
lower-bound on the log-likelihood function. GANs, however, compute a generative
model by minimizing a distance between observed and generated probability
distributions, without considering an explicit model for the observed data. The lack of an explicit probability model in GANs prevents the computation of sample likelihoods in their frameworks and limits their use in statistical
inference problems. In this work, we resolve this issue by constructing an
explicit probability model that can be used to compute sample likelihood
statistics in GANs. In particular, we prove that under this probability model,
a family of Wasserstein GANs with an entropy regularization can be viewed as a
generative model that maximizes a variational lower-bound on average sample log
likelihoods, an approach that VAEs are based on. This result makes a principled
connection between two modern generative models, namely GANs and VAEs. In
addition to the aforementioned theoretical results, we compute likelihood
statistics for GANs trained on Gaussian, MNIST, SVHN, CIFAR-10 and LSUN
datasets. Our numerical results validate the proposed theory.
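The entropy-regularized optimal transport at the heart of this connection can be sketched with Sinkhorn iterations on a toy discrete problem (illustrative only; not the paper's GAN training loop):

```python
import numpy as np

# Entropic OT between two small discrete distributions via Sinkhorn scaling.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
a = np.full(3, 1 / 3)                  # source weights
b = np.full(2, 1 / 2)                  # target weights
C = (x[:, None] - y[None, :])**2       # squared-distance cost matrix
eps = 0.1                              # entropy regularization strength
K = np.exp(-C / eps)
u = np.ones(3)
for _ in range(500):                   # alternating marginal scalings
    v = b / (K.T @ u)
    u = a / (K @ v)
P = u[:, None] * K * v[None, :]        # entropic transport plan
```

The entropy term smooths the plan and is exactly what makes the Wasserstein objective admit a variational-lower-bound reading.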
Gradient Estimation Using Stochastic Computation Graphs
In a variety of problems originating in supervised, unsupervised, and
reinforcement learning, the loss function is defined by an expectation over a
collection of random variables, which might be part of a probabilistic model or
the external world. Estimating the gradient of this loss function, using
samples, lies at the core of gradient-based learning algorithms for these
problems. We introduce the formalism of stochastic computation
graphs---directed acyclic graphs that include both deterministic functions and
conditional probability distributions---and describe how to easily and
automatically derive an unbiased estimator of the loss function's gradient. The
resulting algorithm for computing the gradient estimator is a simple
modification of the standard backpropagation algorithm. The generic scheme we
propose unifies estimators derived in a variety of prior work, along with
variance-reduction techniques therein. It could assist researchers in
developing intricate models involving a combination of stochastic and
deterministic operations, enabling, for example, attention, memory, and control
actions.
Comment: Advances in Neural Information Processing Systems 28 (NIPS 2015).
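The score-function (likelihood-ratio) estimator that such graphs generalize can be sketched on a single Bernoulli node (a textbook example, assumed here rather than taken from the paper):

```python
import numpy as np

# Score-function gradient estimator for d/dp E_{x~Bernoulli(p)}[f(x)].
rng = np.random.default_rng(0)
p = 0.6
f = lambda x: 2.0 * x + 1.0            # downstream loss
x = rng.random(200_000) < p            # samples from Bernoulli(p)
score = np.where(x, 1 / p, -1 / (1 - p))  # d/dp log p(x)
grad_est = np.mean(f(x) * score)       # unbiased gradient estimate
# analytic gradient for comparison: f(1) - f(0) = 2
```

A stochastic computation graph automates exactly this construction: deterministic nodes get ordinary backpropagation, stochastic nodes contribute score-function terms.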
A Nonlinear Spectral Method for Core--Periphery Detection in Networks
We derive and analyse a new iterative algorithm for detecting network
core--periphery structure. Using techniques in nonlinear Perron-Frobenius
theory, we prove global convergence to the unique solution of a relaxed version
of a natural discrete optimization problem. On sparse networks, the cost of
each iteration scales linearly with the number of nodes, making the algorithm
feasible for large-scale problems. We give an alternative interpretation of the
algorithm from the perspective of maximum likelihood reordering of a new
logistic core--periphery random graph model. This viewpoint also gives a new
basis for quantitatively judging a core--periphery detection algorithm. We
illustrate the algorithm on a range of synthetic and real networks, and show
that it offers advantages over the current state of the art.
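The flavor of the method can be sketched with a generic nonlinear power iteration on a star graph (the elementwise map below is an illustrative choice, not the paper's exact update):

```python
import numpy as np

# Nonlinear power-method sketch: iterate x <- normalize(g(A x)) and read
# off core scores. On a star graph the hub should score highest.
n = 6
A = np.zeros((n, n))
A[0, 1:] = 1
A[1:, 0] = 1                       # star: node 0 is the core/hub
x = np.ones(n)
for _ in range(100):
    y = np.sqrt(A @ x + 1e-12)     # elementwise concave map (illustrative)
    x = y / np.linalg.norm(y)
```

Each iteration costs one sparse matrix-vector product, which is the linear-per-iteration scaling the abstract refers to.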
Topological Bayesian Optimization with Persistence Diagrams
Finding an optimal parameter of a black-box function is important for
searching stable material structures and finding optimal neural network
structures, and Bayesian optimization algorithms are widely used for the
purpose. However, most existing Bayesian optimization algorithms can only handle vector data and cannot handle complex structured data. In this paper, we propose topological Bayesian optimization, which can efficiently find an
optimal solution from structured data using \emph{topological information}.
More specifically, in order to apply Bayesian optimization to structured data,
we extract useful topological information from a structure and measure the
proper similarity between structures. To this end, we utilize persistent
homology, which is a topological data analysis method that was recently applied
in machine learning. Moreover, we propose a Bayesian optimization algorithm
that can handle multiple types of topological information by using a linear
combination of kernels for persistence diagrams. Through experiments, we show
that topological information extracted by persistent homology contributes to a
more efficient search for optimal structures compared to the random search
baseline and the graph Bayesian optimization algorithm.
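Combining kernels linearly, as the proposed algorithm does for persistence diagrams, preserves positive semi-definiteness. A toy sketch with vectorized stand-ins for diagram features (an assumption of this note, not the paper's persistence kernels):

```python
import numpy as np

# Nonnegative linear combinations of valid kernels are again valid kernels.
rng = np.random.default_rng(0)
F = rng.normal(size=(8, 4))   # 8 structures, 4 toy "topological features" each

def rbf(F, gamma):
    # Gaussian (RBF) kernel Gram matrix over the feature rows.
    d2 = ((F[:, None, :] - F[None, :, :])**2).sum(-1)
    return np.exp(-gamma * d2)

K = 0.7 * rbf(F, 0.5) + 0.3 * rbf(F, 2.0)  # linear combination of kernels
eigs = np.linalg.eigvalsh(K)               # all eigenvalues should be >= 0
```

This closure property is what lets multiple types of topological information be merged into one Gaussian-process surrogate.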
Graph Embedding with Rich Information through Heterogeneous Network
Graph embedding has attracted increasing attention due to its critical
application in social network analysis. Most existing algorithms for graph embedding rely only on the topology information and fail to use the copious information in nodes as well as edges. As a result, their performance on many tasks may not be satisfactory. In this paper, we propose a novel and general
framework of representation learning for graph with rich text information
through constructing a bipartite heterogeneous network. Specifically, we design
a biased random walk to explore the constructed heterogeneous network with the
notion of flexible neighborhood. The efficacy of our method is demonstrated by
extensive comparison experiments with several baselines on various datasets. It
improves the Micro-F1 and Macro-F1 of node classification by 10% and 7% on the Cora dataset.
Comment: 9 pages, 7 figures, 4 tables.
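A minimal sketch of a biased walk over a bipartite structure-attribute network (the tiny graph and the bias rule are hypothetical illustrations, not the authors' exact scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical bipartite graph: structural nodes 'n*' linked to attribute
# nodes 'w*' (e.g. words from node text).
edges = {
    "n1": ["n2", "w1"], "n2": ["n1", "n3", "w1"], "n3": ["n2", "w2"],
    "w1": ["n1", "n2"], "w2": ["n3"],
}

def biased_walk(start, length, p_attr=0.3):
    """Walk that jumps to an attribute ('w') neighbour with prob p_attr
    when one exists, otherwise to a structural ('n') neighbour."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = edges[walk[-1]]
        attr = [v for v in nbrs if v.startswith("w")]
        struct = [v for v in nbrs if v.startswith("n")]
        pool = attr if (attr and rng.random() < p_attr) else (struct or attr)
        walk.append(pool[rng.integers(len(pool))])
    return walk

w = biased_walk("n1", 10)
```

Walks like these would then feed a skip-gram-style embedding model, so the bias parameter controls how much attribute information mixes into each node's context.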
A Novel Experimental Platform for In-Vessel Multi-Chemical Molecular Communications
This work presents a new multi-chemical experimental platform for molecular
communication where the transmitter can release different chemicals. This
platform is designed to be inexpensive and accessible, and it can be expanded
to simulate different environments including the cardiovascular system and
complex network of pipes in industrial complexes and city infrastructures. To
demonstrate the capabilities of the platform, we implement a time-slotted
binary communication system where a bit-0 is represented by an acid pulse, a
bit-1 by a base pulse, and information is carried via pH signals. The channel
model for this system, which is nonlinear and has long memory, is unknown.
Therefore, we devise novel detection algorithms that use techniques from
machine learning and deep learning to train a maximum-likelihood detector.
Using these algorithms, the bit error rate improves by an order of magnitude
relative to the approach used in previous works. Moreover, our system achieves
a data rate that is an order of magnitude higher than any of the previous
molecular communication platforms.
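The detection idea can be sketched with simulated pH observations and a plain logistic-regression detector (the channel coefficients and noise level below are invented for illustration; the paper trains far richer learned detectors):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
bits = rng.integers(0, 2, N)
level = np.where(bits == 1, 1.0, -1.0)               # base raises pH, acid lowers it
memory = np.concatenate([[0.0], 0.5 * level[:-1]])   # crude one-symbol channel memory
obs = level + memory + rng.normal(0, 0.4, N)         # observed pH change per slot
X = np.column_stack([obs, np.concatenate([[0.0], obs[:-1]])])  # current + previous

# Plain logistic-regression detector trained by gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    g = p - bits
    w -= 0.1 * X.T @ g / N
    b -= 0.1 * g.mean()
acc = np.mean((X @ w + b > 0) == (bits == 1))
```

Feeding the previous sample alongside the current one lets even this simple detector partially compensate for the channel's memory, which is the same motivation behind the learned detectors in the paper.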
…