NAS-X: Neural Adaptive Smoothing via Twisting
We present Neural Adaptive Smoothing via Twisting (NAS-X), a method for
learning and inference in sequential latent variable models based on reweighted
wake-sleep (RWS). NAS-X works with both discrete and continuous latent
variables, and leverages smoothing SMC to fit a broader range of models than
traditional RWS methods. We test NAS-X on discrete and continuous tasks and
find that it substantially outperforms previous variational and RWS-based
methods in inference and parameter recovery.
Relation between stress heterogeneity and aftershock rate in the rate-and-state model
We estimate the rate of aftershocks triggered by a heterogeneous stress
change, using the rate-and-state model of Dieterich [1994]. We show that an
exponential stress distribution P(tau) ~ exp(-tau/tau_0) gives an Omori law decay
of aftershocks with time ~1/t^p, with an exponent p = 1 - A sigma_n/tau_0, where A
is a parameter of the rate-and-state friction law, and sigma_n the normal
stress. Omori exponent p thus decreases if the stress "heterogeneity" tau_0
decreases. We also invert the stress distribution P(tau) from the seismicity
rate R(t), assuming that the stress does not change with time. We apply this
method to a synthetic stress map, using the (modified) scale invariant "k^2"
slip model [Herrero and Bernard, 1994]. We generate synthetic aftershock
catalogs from this stress change. The seismicity rate on the rupture area shows
a huge increase at short times, even if the stress decreases on average.
Aftershocks are clustered in the regions of low slip, but the spatial
distribution is more diffuse than for a simple slip dislocation. Because the
stress field is very heterogeneous, there are many patches of positive stress
changes everywhere on the fault. This stochastic slip model gives a Gaussian
stress distribution, but nevertheless produces an aftershock rate which is very
close to Omori's law, with an effective p<=1, which increases slowly with time.
We obtain a good estimation of the stress distribution for realistic catalogs,
when we constrain the shape of the distribution. However, there are probably
other factors which also affect the temporal decay of aftershocks. In
particular, heterogeneity of A sigma_n can also modify the parameters p and c
of Omori's law. Finally, we show that stress shadows are very difficult to
observe in a heterogeneous stress context. Comment: In press in JG
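As a rough numerical illustration of the relation p = 1 - A sigma_n/tau_0 stated in this abstract, the sketch below averages the standard Dieterich [1994] rate after a stress step over an exponential distribution of positive stress changes, then measures the log-log slope of the averaged rate at times much shorter than the aftershock duration t_a. The function names, the restriction to tau >= 0, the integration cutoff, and the fitting window are assumptions made for this illustration, not details taken from the paper.

```python
import math


def dieterich_rate(t, tau, a_sigma, t_a, r=1.0):
    """Seismicity rate at time t after a stress step tau (background rate r)."""
    # gamma = (exp(-tau / A sigma_n) - 1) * exp(-t / t_a) + 1, rewritten with
    # expm1 so that it stays accurate when t is much smaller than t_a.
    gamma = math.exp(-tau / a_sigma - t / t_a) - math.expm1(-t / t_a)
    return r / gamma


def mean_rate(t, tau_0, a_sigma, t_a, n_steps=60_000):
    """Average the rate over P(tau) = exp(-tau / tau_0) / tau_0 for tau >= 0."""
    tau_max = 60.0 * tau_0  # assumed cutoff; the neglected tail is tiny here
    d_tau = tau_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        tau = i * d_tau
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end points
        density = math.exp(-tau / tau_0) / tau_0
        total += weight * density * dieterich_rate(t, tau, a_sigma, t_a)
    return total * d_tau


def omori_exponent(tau_0, a_sigma, t_a=1.0, t1=1e-8, t2=1e-5):
    """Estimate p from the log-log slope of the mean rate between t1 and t2."""
    r1 = mean_rate(t1 * t_a, tau_0, a_sigma, t_a)
    r2 = mean_rate(t2 * t_a, tau_0, a_sigma, t_a)
    return -(math.log(r2) - math.log(r1)) / (math.log(t2) - math.log(t1))


if __name__ == "__main__":
    a_sigma, tau_0 = 1.0, 2.0
    print("estimated p:", omori_exponent(tau_0, a_sigma))
    print("predicted p:", 1.0 - a_sigma / tau_0)
```

With tau_0 = 2 A sigma_n the estimated slope should come out close to the predicted p = 0.5. Reducing tau_0 toward A sigma_n lowers the predicted exponent, in line with the statement that p decreases as tau_0 decreases.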
Learning Hard Alignments with Variational Inference
There has recently been significant interest in hard attention models for
tasks such as object recognition, visual captioning and speech recognition.
Hard attention can offer benefits over soft attention such as decreased
computational cost, but training hard attention models can be difficult because
of the discrete latent variables they introduce. Previous work used REINFORCE
and Q-learning to approach these issues, but those methods can provide
high-variance gradient estimates and be slow to train. In this paper, we tackle
the problem of learning hard attention for a sequential task using variational
inference methods, specifically the recently introduced VIMCO and NVIL.
Furthermore, we propose a novel baseline that adapts VIMCO to this setting. We
demonstrate our method on a phoneme recognition task in clean and noisy
environments and show that our method outperforms REINFORCE, with the
difference being greater for a more complicated task.
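For context, the generic VIMCO learning signal that this abstract builds on can be sketched as follows: each of the K samples is scored against a leave-one-out baseline in which its log importance weight is replaced by the mean of the other log weights. This shows standard VIMCO only. The novel baseline proposed in the paper is not described in the abstract and is not reproduced here, and the function names are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def vimco_learning_signals(log_weights):
    """Return the multi-sample bound and one learning signal per sample."""
    log_weights = list(log_weights)
    k = len(log_weights)
    if k < 2:
        raise ValueError("VIMCO needs at least two samples")
    bound = logsumexp(log_weights) - math.log(k)
    signals = []
    for j in range(k):
        others = log_weights[:j] + log_weights[j + 1:]
        # Geometric mean of the other weights = arithmetic mean of their logs.
        stand_in = sum(others) / (k - 1)
        baseline = logsumexp(others + [stand_in]) - math.log(k)
        signals.append(bound - baseline)
    return bound, signals


if __name__ == "__main__":
    bound, signals = vimco_learning_signals([0.0, -5.0, -5.0])
    print(bound, signals)
```

A sample whose weight is well above the others receives a positive signal, while the remaining samples receive small negative ones, which is what keeps the variance lower than a single shared REINFORCE reward.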
The Neural Testbed: Evaluating Joint Predictions
Predictive distributions quantify uncertainties ignored by point estimates.
This paper introduces The Neural Testbed: an open-source benchmark for
controlled and principled evaluation of agents that generate such predictions.
Crucially, the testbed assesses agents not only on the quality of their
marginal predictions per input, but also on their joint predictions across many
inputs. We evaluate a range of agents using a simple neural network data
generating process. Our results indicate that some popular Bayesian deep
learning agents do not fare well with joint predictions, even when they can
produce accurate marginal predictions. We also show that the quality of joint
predictions drives performance in downstream decision tasks. We find these
results are robust across a wide range of generative models, and
highlight the practical importance of joint predictions to the community.
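The distinction between marginal and joint predictions can be illustrated with a toy example. The sketch below assumes an agent represented as an equally weighted ensemble of models, and scores it by log-likelihood either one input at a time (marginal) or on all inputs together (joint). The data layout, function names, and the use of log-likelihood as the score are assumptions for this illustration, not the testbed's actual API or metric.

```python
import math


def marginal_log_likelihood(probs, labels):
    """Sum over inputs of the log of the ensemble-averaged label probability.

    probs[m][n][c] is the probability model m assigns to class c on input n.
    """
    n_models = len(probs)
    total = 0.0
    for n, y in enumerate(labels):
        total += math.log(sum(model[n][y] for model in probs) / n_models)
    return total


def joint_log_likelihood(probs, labels):
    """Log of the ensemble-averaged probability of all labels together."""
    n_models = len(probs)
    per_model = [
        math.prod(model[n][y] for n, y in enumerate(labels)) for model in probs
    ]
    return math.log(sum(per_model) / n_models)


if __name__ == "__main__":
    labels = [1, 1]
    # Two models that disagree with each other but are each self-consistent.
    ensemble = [
        [[0.1, 0.9], [0.1, 0.9]],
        [[0.9, 0.1], [0.9, 0.1]],
    ]
    # A single model that is maximally uncertain on every input.
    flat = [[[0.5, 0.5], [0.5, 0.5]]]
    for name, agent in (("ensemble", ensemble), ("flat", flat)):
        print(
            name,
            marginal_log_likelihood(agent, labels),
            joint_log_likelihood(agent, labels),
        )
```

Both toy agents assign probability 0.5 to each label individually, so their marginal scores match, but the ensemble captures the correlation between the two inputs and scores higher on the joint.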