Non-parametric Bayesian modeling of complex networks
Modeling structure in complex networks using Bayesian non-parametrics makes
it possible to specify flexible model structures and infer the adequate model
complexity from the observed data. This paper provides a gentle introduction to
non-parametric Bayesian modeling of complex networks: using an infinite mixture
model as a running example, we go through the steps of deriving the model as an
infinite limit of a finite parametric model, inferring the model parameters by
Markov chain Monte Carlo, and checking the model's fit and predictive
performance. We explain how advanced non-parametric models for complex networks
can be derived and point out relevant literature.
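As an illustration of the infinite-limit step: for a finite mixture with a symmetric Dirichlet prior on the cluster proportions, letting the number of clusters grow without bound gives the Chinese restaurant process as the prior over partitions. The abstract does not name the prior, so treating it as a Chinese restaurant process is an assumption here. The function name and the concentration parameter `alpha` are hypothetical. This is a minimal sketch of the prior only, not of the paper's inference procedure.

```python
import random


def sample_crp_partition(n_nodes, alpha, seed=0):
    """Draw cluster assignments for n_nodes from a Chinese restaurant process.

    alpha is the concentration parameter (assumed > 0); larger values
    tend to produce more clusters.
    """
    rng = random.Random(seed)
    assignments = []
    cluster_sizes = []
    for _ in range(n_nodes):
        # An existing cluster is chosen with weight equal to its current size;
        # a new cluster is opened with weight alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)
        else:
            cluster_sizes[k] += 1
        assignments.append(k)
    return assignments, cluster_sizes


assignments, cluster_sizes = sample_crp_partition(n_nodes=50, alpha=1.0)
```

The number of clusters is not fixed in advance: it is whatever `len(cluster_sizes)` turns out to be for the draw, which is the sense in which model complexity is inferred rather than specified.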
Bayesian Dropout
Dropout has recently emerged as a powerful and simple method for training
neural networks, preventing co-adaptation by stochastically omitting neurons.
Dropout is currently not grounded in explicit modelling assumptions, which has
so far precluded its adoption in Bayesian modelling. Using Bayesian entropic
reasoning we show that dropout can be interpreted as optimal inference under
constraints. We demonstrate this on an analytically tractable regression model
providing a Bayesian interpretation of its mechanism for regularizing and
preventing co-adaptation as well as its connection to other Bayesian
techniques. We also discuss two general approximate techniques for applying
Bayesian dropout for general models, one based on an analytical approximation
and the other on stochastic variational techniques. These techniques are then
applied to a Bayesian logistic regression problem and are shown to improve
performance as the model becomes more misspecified. Our framework roots dropout
as a theoretically justified and practical tool for statistical modelling
allowing Bayesians to tap into the benefits of dropout training.
Comment: 21 pages, 3 figures. Manuscript prepared 2014 and awaiting submissio
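For readers unfamiliar with the mechanism being reinterpreted, "stochastically omitting neurons" can be sketched as below. This shows ordinary dropout in its common "inverted" form, where kept activations are rescaled; it is not the Bayesian formulation derived in the paper. The function name, the `drop_prob` parameter, and the example values are hypothetical.

```python
import random


def dropout(activations, drop_prob, rng):
    """Zero each activation with probability drop_prob (assumed < 1).

    Kept activations are divided by the keep probability so that the
    expected value of each output equals the original activation.
    """
    keep_prob = 1.0 - drop_prob
    out = []
    for a in activations:
        if rng.random() < keep_prob:
            out.append(a / keep_prob)
        else:
            out.append(0.0)
    return out


rng = random.Random(0)
hidden = [0.5, -1.2, 3.0, 0.8]  # illustrative hidden-unit activations
dropped = dropout(hidden, drop_prob=0.5, rng=rng)
```

Because a different random subset of units is omitted on every call, no unit can rely on a specific other unit being present, which is the co-adaptation argument the abstract refers to.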
The Infinite Degree Corrected Stochastic Block Model
In stochastic blockmodels, which are among the most prominent statistical
models for cluster analysis of complex networks, clusters are defined as groups
of nodes with statistically similar link probabilities within and between
groups. A recent extension by Karrer and Newman incorporates a node degree
correction to model degree heterogeneity within each group. Although this
demonstrably leads to better performance on several networks, it is not obvious
whether modelling node degree is always appropriate or necessary. We formulate
the degree corrected stochastic blockmodel as a non-parametric Bayesian model,
incorporating a parameter to control the amount of degree correction which can
then be inferred from data. Additionally, our formulation yields principled
ways of inferring the number of groups as well as predicting missing links in
the network which can be used to quantify the model's predictive performance.
On synthetic data we demonstrate that including the degree correction yields
better performance both on recovering the true group structure and predicting
missing links when degree heterogeneity is present, whereas performance is on
par for data with no degree heterogeneity within clusters. On seven real
networks (with no ground truth group structure available) we show that
predictive performance is about equal whether or not degree correction is
included; however, for some networks significantly fewer clusters are
discovered when correcting for degree, indicating that the data can be more
compactly explained by clusters of heterogeneous degree nodes.
Comment: Originally presented at the Complex Networks workshop NIPS 201
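The role of the degree correction can be sketched with the Karrer and Newman form of the model, in which the expected number of links between nodes i and j is the product of two node-specific degree parameters and a group-pair rate. The variable names, the example values, and the normalization (degree parameters averaging to 1 within each group) are assumptions for illustration; the paper's non-parametric formulation and its parameter controlling the amount of correction are not reproduced here.

```python
def edge_rate(i, j, groups, theta, omega):
    """Expected number of links between nodes i and j.

    groups[i] is the group of node i, theta[i] its degree parameter,
    and omega[r][s] the base link rate between groups r and s.
    """
    return theta[i] * theta[j] * omega[groups[i]][groups[j]]


groups = [0, 0, 1, 1]
omega = [[0.8, 0.1],
         [0.1, 0.6]]

# With all degree parameters equal to 1 the model reduces to a plain
# stochastic blockmodel: the rate depends only on group membership.
theta_flat = [1.0, 1.0, 1.0, 1.0]

# Degree heterogeneity inside group 0: node 0 is a hub, node 1 is sparse.
theta_hub = [1.5, 0.5, 1.0, 1.0]

plain_rate = edge_rate(0, 2, groups, theta_flat, omega)
hub_rate = edge_rate(0, 2, groups, theta_hub, omega)
sparse_rate = edge_rate(1, 2, groups, theta_hub, omega)
```

Nodes 0 and 1 belong to the same group yet link to node 2 at different rates once the degree parameters differ, which is the within-group heterogeneity the correction is meant to capture.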
Constraints on the relative sizes of intervening Mg II-absorbing clouds and quasar emitting regions
Context: A significantly higher incidence of strong (rest equivalent width
W_r > 1 Å) intervening Mg II absorption is observed along gamma-ray burst
(GRB) sight-lines relative to quasar sight-lines. A geometrical
explanation for this discrepancy has been suggested: the ratio of the beam size
of the source to the characteristic size of a Mg II absorption system can
influence the observed Mg II equivalent width, if these two sizes are
comparable. Aims: We investigate whether the differing beam sizes of the
continuum source and broad-line region of Sloan Digital Sky Survey (SDSS)
quasars produce a discrepancy between the incidence of strong Mg II absorbers
illuminated by the quasar continuum region and those of absorbers illuminated
by both continuum and broad-line region light. Methods: We perform a
semi-automated search for strong Mg II absorbers in the SDSS Data Release 7
quasar sample. The resulting strong Mg II absorber catalog is available online.
We measure the sight-line number density of strong Mg II absorbers superimposed
on and off the quasar C IV 1550 Å and C III] 1909 Å emission lines.
Results: We see no difference in the sight-line number density of strong Mg II
absorbers superimposed on quasar broad emission lines compared to those
superimposed on continuum-dominated spectral regions. This suggests that the Mg
II-absorbing clouds typically observed as intervening absorbers in quasar
spectra are larger than the beam sizes of both the continuum-emitting regions
and broad line-emitting regions in the centers of quasars, corresponding to a
lower limit of the order of 10^17 cm for the characteristic size of a Mg II
absorbing cloud.
Comment: 10 pages, 5 figures. Edit: fixed a missing cross-referenc