    Automatic Differentiation Variational Inference

    Probabilistic modeling is iterative. A scientist posits a simple model, fits it to her data, refines it according to her analysis, and repeats. However, fitting complex models to large data is a bottleneck in this process. Deriving algorithms for new models can be both mathematically and computationally challenging, which makes it difficult to efficiently cycle through the steps. To this end, we develop automatic differentiation variational inference (ADVI). Using our method, the scientist provides only a probabilistic model and a dataset, nothing else. ADVI automatically derives an efficient variational inference algorithm, freeing the scientist to refine and explore many models. ADVI supports a broad class of models; no conjugacy assumptions are required. We study ADVI across ten different models and apply it to a dataset with millions of observations. ADVI is integrated into Stan, a probabilistic programming system; it is available for immediate use.
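
    As a concrete illustration of the workflow described above, here is a minimal sketch of fitting a model with ADVI through CmdStanPy, one of Stan's interfaces. The model, file name, and data below are invented for the example; only the general pattern (supply a model and a dataset, get a variational approximation back) comes from the abstract.

```python
# Minimal ADVI sketch via CmdStanPy. The model and data are illustrative
# assumptions; only the workflow (model + data in, approximation out)
# reflects the abstract.
from cmdstanpy import CmdStanModel

stan_code = """
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  y ~ normal(mu, sigma);
}
"""

with open("model.stan", "w") as f:
    f.write(stan_code)

model = CmdStanModel(stan_file="model.stan")

# ADVI: no conjugacy assumptions and no model-specific derivations;
# Stan's autodiff supplies the gradients of the variational objective.
fit = model.variational(data={"N": 5, "y": [2.1, -0.3, 1.5, 0.7, -1.2]})
print(fit.variational_params_dict["mu"], fit.variational_params_dict["sigma"])
```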

    Automatic Variational Inference in Stan

    Variational inference is a scalable technique for approximate Bayesian inference. Deriving variational inference algorithms requires tedious model-specific calculations; this makes it difficult to automate. We propose an automatic variational inference algorithm, automatic differentiation variational inference (ADVI). The user provides only a Bayesian model and a dataset; nothing else. We make no conjugacy assumptions and support a broad class of models. The algorithm automatically determines an appropriate variational family and optimizes the variational objective. We implement ADVI in Stan (code available now), a probabilistic programming framework. We compare ADVI to MCMC sampling across hierarchical generalized linear models, nonconjugate matrix factorization, and a mixture model. We train the mixture model on a quarter million images. With ADVI we can use variational inference on any model we write in Stan.
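
    Since this abstract stresses the comparison with MCMC, the sketch below runs both ADVI and NUTS on the same Stan program via CmdStanPy, reusing the hypothetical model.stan from the previous example. The data and timing comparison are purely illustrative, not results from the paper.

```python
# Hedged sketch: ADVI vs. MCMC (NUTS) on the same Stan program.
# Reuses the illustrative model.stan written in the previous example.
import time
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="model.stan")
data = {"N": 5, "y": [2.1, -0.3, 1.5, 0.7, -1.2]}

t0 = time.time()
vb = model.variational(data=data)         # ADVI: optimizes a variational objective
t_advi = time.time() - t0

t0 = time.time()
mcmc = model.sample(data=data, chains=4)  # NUTS: asymptotically exact sampling
t_mcmc = time.time() - t0

print(f"ADVI  mu = {vb.variational_params_dict['mu']:.3f}  ({t_advi:.1f}s)")
print(f"MCMC  mu = {mcmc.stan_variable('mu').mean():.3f}  ({t_mcmc:.1f}s)")
```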

    Halorubrum chaoviator sp. nov., a haloarchaeon isolated from sea salt in Baja California, Mexico, Western Australia and Naxos, Greece

    Three halophilic isolates, strains Halo-G*T, AUS-1 and Naxos II, were compared. Halo-G* was isolated from an evaporitic salt crystal from Baja California, Mexico, whereas AUS-1 and Naxos II were isolated from salt pools in Western Australia and the Greek island of Naxos, respectively. Halo-G*T had previously been exposed to conditions of outer space and survived 2 weeks on the Biopan facility. Chemotaxonomic and molecular comparisons suggested high similarity between the three strains. Phylogenetic analysis based on the 16S rRNA gene sequences revealed that the strains clustered with Halorubrum species, showing sequence similarities of 97.1–99.2 %. The DNA–DNA hybridization values between strain Halo-G*T and strains AUS-1 and Naxos II were 73 and 75 %, respectively, indicating that they constitute a single species. The DNA relatedness between strain Halo-G*T and the type strains of 13 closely related species of the genus Halorubrum ranged from 2 to 39 %, suggesting that the three isolates constitute a distinct genospecies. The G+C content of the DNA of the three strains was 65.5–66.5 mol%. All three strains contained C20C20 diether derivatives of phosphatidylglycerol, phosphatidylglyceromethylphosphate and phosphatidylglycerolsulfate, together with a sulfated glycolipid. On the basis of these results, a novel species that includes the three strains is proposed, with the name Halorubrum chaoviator sp. nov. The type strain is strain Halo-G*T (=DSM 19316T =NCIMB 14426T =ATCC BAA-1602T).

    Bayesian leave-one-out cross-validation for large data

    Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose a combination of approximate inference techniques and probability-proportional-to-size (PPS) sampling for fast LOO model evaluation on large datasets. We provide both theoretical and empirical results showing good properties for large data.
    Comment: Accepted to ICML 2019; this version is the submitted paper.
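
    A minimal sketch of the PPS idea under stated assumptions: pointwise elpd contributions are subsampled with probability proportional to a cheap "size" measure and recombined with a Hansen-Hurwitz estimator. All names and the size measure below are illustrative, and the paper's PSIS-based corrections are omitted.

```python
# Hedged sketch: PPS subsampling for LOO on large data.
# Assumptions: a cheap approximation loo_hat of each pointwise elpd is
# available, and exact_loo_fn computes the expensive exact value for one
# index. This is a generic Hansen-Hurwitz estimator, not the paper's
# exact procedure.
import numpy as np

rng = np.random.default_rng(0)

def pps_elpd_loo(loo_hat, exact_loo_fn, m):
    """Estimate elpd_loo = sum_i elpd_loo_i from an m-point PPS subsample."""
    p = np.abs(loo_hat)
    p = p / p.sum()                     # sampling probabilities proportional to "size"
    idx = rng.choice(len(loo_hat), size=m, replace=True, p=p)
    # Hansen-Hurwitz: the mean of inverse-probability-weighted draws
    # estimates the population total.
    return np.mean([exact_loo_fn(i) / p[i] for i in idx])

# Toy check with fake pointwise elpds standing in for the expensive part:
n = 100_000
true_loo = -np.abs(rng.normal(1.0, 0.3, size=n))
approx_loo = true_loo + rng.normal(0.0, 0.05, size=n)   # cheap approximation
est = pps_elpd_loo(approx_loo, lambda i: true_loo[i], m=1_000)
print(est, true_loo.sum())   # the two totals should be close
```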