Automatic Differentiation Variational Inference
Probabilistic modeling is iterative. A scientist posits a simple model, fits
it to her data, refines it according to her analysis, and repeats. However,
fitting complex models to large data is a bottleneck in this process. Deriving
algorithms for new models can be both mathematically and computationally
challenging, which makes it difficult to efficiently cycle through the steps.
To this end, we develop automatic differentiation variational inference (ADVI).
Using our method, the scientist only provides a probabilistic model and a
dataset, nothing else. ADVI automatically derives an efficient variational
inference algorithm, freeing the scientist to refine and explore many models.
ADVI supports a broad class of models; no conjugacy assumptions are required. We
study ADVI across ten different models and apply it to a dataset with millions
of observations. ADVI is integrated into Stan, a probabilistic programming
system; it is available for immediate use.
Automatic Variational Inference in Stan
Variational inference is a scalable technique for approximate Bayesian
inference. Deriving variational inference algorithms requires tedious
model-specific calculations; this makes it difficult to automate. We propose an
automatic variational inference algorithm, automatic differentiation
variational inference (ADVI). The user only provides a Bayesian model and a
dataset; nothing else. We make no conjugacy assumptions and support a broad
class of models. The algorithm automatically determines an appropriate
variational family and optimizes the variational objective. We implement ADVI
in Stan (code available now), a probabilistic programming framework. We compare
ADVI to MCMC sampling across hierarchical generalized linear models,
nonconjugate matrix factorization, and a mixture model. We train the mixture
model on a quarter million images. With ADVI we can use variational inference
on any model we write in Stan.
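The core idea the two abstracts above describe can be illustrated in a short numpy sketch: fit a mean-field Gaussian variational family by stochastic gradient ascent on the ELBO, using the reparameterization trick to get gradients. This is a hypothetical toy (a conjugate Normal model with known exact posterior, invented here for checking), not Stan's implementation, which additionally transforms constrained parameters to the real line automatically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (hypothetical, for illustration): y_i ~ Normal(theta, 1)
# with conjugate prior theta ~ Normal(0, 10^2), so the exact posterior
# is available to check the variational answer against.
y = rng.normal(2.0, 1.0, size=200)
n = y.size
post_prec = n + 1.0 / 100.0
post_mean = y.sum() / post_prec
post_sd = (1.0 / post_prec) ** 0.5

def grad_log_joint(theta):
    """d/d theta of log p(y, theta) for the model above."""
    return (y - theta).sum() - theta / 100.0

# Mean-field Gaussian q(theta) = Normal(mu, exp(omega)^2).
# ADVI's key move: write theta = mu + exp(omega) * eta with eta ~ N(0, 1),
# so the ELBO gradient becomes an expectation estimated by sampling
# (the reparameterization trick), then do stochastic gradient ascent.
mu, omega = y.mean(), -1.0  # initialize near the data for stability
lr = 1e-3
for _ in range(10_000):
    eta = rng.standard_normal()
    sigma = np.exp(omega)
    theta = mu + sigma * eta
    g = grad_log_joint(theta)
    mu += lr * g                           # d theta / d mu = 1
    omega += lr * (g * sigma * eta + 1.0)  # chain rule + entropy gradient

print(f"ADVI : mean={mu:.3f} sd={np.exp(omega):.4f}")
print(f"exact: mean={post_mean:.3f} sd={post_sd:.4f}")
```

The single-sample gradient estimate is noisy but unbiased; with a small step size the iterates settle close to the exact posterior mean and standard deviation, which is what makes the scheme automatic: only `grad_log_joint` depends on the model, and automatic differentiation supplies it.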
Halorubrum chaoviator sp. nov., a haloarchaeon isolated from sea salt in Baja California, Mexico, Western Australia and Naxos, Greece
Three halophilic isolates, strains Halo-G*T, AUS-1 and Naxos II, were compared. Halo-G* was isolated from an evaporitic salt crystal from Baja California, Mexico, whereas AUS-1 and Naxos II were isolated from salt pools in Western Australia and the Greek island of Naxos, respectively. Halo-G*T had been exposed previously to conditions of outer space and survived 2 weeks on the Biopan facility. Chemotaxonomic and molecular comparisons suggested high similarity between the three strains. Phylogenetic analysis based on the 16S rRNA gene sequences revealed that the strains clustered with Halorubrum species, showing sequence similarities of 99.2–97.1 %. The DNA–DNA hybridization values of strain Halo-G*T and strains AUS-1 and Naxos II are 73 and 75 %, respectively, indicating that they constitute a single species. The DNA relatedness between strain Halo-G*T and the type strains of 13 closely related species of the genus Halorubrum ranged from 39 to 2 %, suggesting that the three isolates constitute a different genospecies. The G+C content of the DNA of the three strains was 65.5–66.5 mol%. All three strains contained C20C20 diether derivatives of phosphatidylglycerol, phosphatidylglyceromethylphosphate and phosphatidylglycerolsulfate, together with a sulfated glycolipid. On the basis of these results, a novel species that includes the three strains is proposed, with the name Halorubrum chaoviator sp. nov. The type strain is strain Halo-G*T (=DSM 19316T =NCIMB 14426T =ATCC BAA-1602T).
bayesvl: Visually Learning the Graphical Structure of Bayesian Networks and Performing MCMC with 'Stan'
The 'bayesvl' R Package
Bayesian leave-one-out cross-validation for large data
Model inference, such as model comparison, model checking, and model
selection, is an important part of model development. Leave-one-out
cross-validation (LOO) is a general approach for assessing the generalizability
of a model, but unfortunately, LOO does not scale well to large datasets. We
propose a combination of using approximate inference techniques and
probability-proportional-to-size-sampling (PPS) for fast LOO model evaluation
for large datasets. We provide both theoretical and empirical results showing
good properties for large data.
Comment: Accepted to ICML 2019. This version is the submitted paper.
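The subsampling idea in the abstract above can be sketched in numpy: draw a small probability-proportional-to-size (PPS) sample of observations, evaluate the expensive per-observation LOO quantity only for those, and estimate the total with the Hansen-Hurwitz estimator. The toy conjugate model below is hypothetical and gives closed-form LOO densities; the paper instead pairs PPS with approximate posteriors and importance sampling, which this sketch replaces with exact formulas to keep it self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy conjugate model (hypothetical): y_i ~ Normal(theta, 1),
# theta ~ Normal(0, 10^2), so every leave-one-out predictive density
# has a closed form and we can focus on the subsampling step.
n = 100_000
y = rng.normal(1.0, 1.0, size=n)

def log_norm_pdf(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

# Exact LOO log predictive density for each observation (the "expensive"
# quantity; closed-form here thanks to conjugacy).
prec_loo = (n - 1) + 1.0 / 100.0
mean_loo = (y.sum() - y) / prec_loo
elpd_i = log_norm_pdf(y, mean_loo, 1.0 + 1.0 / prec_loo)
elpd_exact = elpd_i.sum()

# Cheap proxy for each observation's contribution: log predictive
# density under the full-data posterior.
prec_full = n + 1.0 / 100.0
mean_full = y.sum() / prec_full
proxy = log_norm_pdf(y, mean_full, 1.0 + 1.0 / prec_full)

# PPS subsampling: draw m indices with probability proportional to
# |proxy_i|, evaluate elpd_i only there, and form the Hansen-Hurwitz
# estimate of the total, sum_j elpd_{i_j} / p_{i_j} / m.
m = 1_000
p = np.abs(proxy) / np.abs(proxy).sum()
idx = rng.choice(n, size=m, replace=True, p=p)
elpd_est = np.mean(elpd_i[idx] / p[idx])

print(f"exact elpd_loo: {elpd_exact:.1f}")
print(f"PPS estimate  : {elpd_est:.1f}")
```

Because the proxy tracks the exact per-observation values closely, the ratios `elpd_i / p_i` are nearly constant and the estimator's variance is small, so m = 1,000 evaluations stand in for n = 100,000.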