Combining Models of Approximation with Partial Learning
In Gold's framework of inductive inference, the model of partial learning
requires the learner to output exactly one correct index for the target object
and only the target object infinitely often. Since infinitely many of the
learner's hypotheses may be incorrect, it is not obvious whether a partial
learner can be modified to "approximate" the target object.
Fulk and Jain (Approximate inference and scientific method. Information and
Computation 114(2):179--191, 1994) introduced a model of approximate learning
of recursive functions. The present work extends their research and solves an
open problem of Fulk and Jain by showing that there is a learner which
approximates and partially identifies every recursive function by outputting a
sequence of hypotheses which, in addition, are almost all finite variants
of the target function.
The subsequent study is dedicated to the question of how these findings
generalise to the learning of r.e. languages from positive data. Here three
variants of approximate learning will be introduced and investigated with
respect to whether they can be combined with partial learning.
Following the line of Fulk and Jain's research, further investigations provide
conditions under which partial language learners can eventually output only
finite variants of the target language. The combinability of other partial
learning criteria will also be briefly studied.
Scalable Recommendation with Poisson Factorization
We develop a Bayesian Poisson matrix factorization model for forming
recommendations from sparse user behavior data. These data are large user/item
matrices where each user has provided feedback on only a small subset of items,
either explicitly (e.g., through star ratings) or implicitly (e.g., through
views or purchases). In contrast to traditional matrix factorization
approaches, Poisson factorization implicitly models each user's limited
attention to consume items. Moreover, because of the mathematical form of the
Poisson likelihood, the model needs only to explicitly consider the observed
entries in the matrix, leading to both scalable computation and good predictive
performance. We develop a variational inference algorithm for approximate
posterior inference that scales up to massive data sets. This is an efficient
algorithm that iterates over the observed entries and adjusts an approximate
posterior over the user/item representations. We apply our method to large
real-world user data containing users rating movies, users listening to songs,
and users reading scientific papers. In all these settings, Bayesian Poisson
factorization outperforms state-of-the-art matrix factorization methods.
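As a rough illustration of why only the observed entries enter the computation, the sketch below runs multiplicative (KL-NMF style) point-estimate updates for a Poisson factorization model. The paper itself develops a Bayesian variational algorithm with Gamma priors, so the function name and update scheme here are illustrative assumptions, not the authors' method; the point is that the numerators sum over observed entries alone while the denominators factorize into loading sums.

```python
import numpy as np

# Minimal MAP-style sketch of Poisson matrix factorization on sparse
# feedback data, using multiplicative (KL-NMF) updates. This is NOT the
# paper's variational algorithm; it only illustrates that the numerators
# touch observed entries alone, while the denominators factorize.

def poisson_factorize(rows, cols, vals, n_users, n_items, k=20, n_iters=50):
    """rows/cols/vals encode the observed (user, item, count) triples."""
    rng = np.random.default_rng(0)
    theta = rng.gamma(1.0, 1.0, size=(n_users, k))  # user preferences
    beta = rng.gamma(1.0, 1.0, size=(n_items, k))   # item attributes
    for _ in range(n_iters):
        # Poisson rates are needed only at the observed entries.
        rates = np.sum(theta[rows] * beta[cols], axis=1)
        ratio = vals / np.maximum(rates, 1e-10)
        num = np.zeros_like(theta)
        np.add.at(num, rows, ratio[:, None] * beta[cols])
        theta *= num / np.maximum(beta.sum(axis=0), 1e-10)
        rates = np.sum(theta[rows] * beta[cols], axis=1)
        ratio = vals / np.maximum(rates, 1e-10)
        num = np.zeros_like(beta)
        np.add.at(num, cols, ratio[:, None] * theta[rows])
        beta *= num / np.maximum(theta.sum(axis=0), 1e-10)
    return theta, beta
```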
Group equivariant neural posterior estimation
Simulation-based inference with conditional neural density estimators is a
powerful approach to solving inverse problems in science. However, these
methods typically treat the underlying forward model as a black box, with no
way to exploit geometric properties such as equivariances. Equivariances are
common in scientific models; however, integrating them directly into expressive
inference networks (such as normalizing flows) is not straightforward. Here we
describe an alternative method to incorporate equivariances under joint
transformations of parameters and data. Our method -- called group equivariant
neural posterior estimation (GNPE) -- is based on self-consistently
standardizing the "pose" of the data while estimating the posterior over
parameters. It is architecture-independent, and applies both to exact and
approximate equivariances. As a real-world application, we use GNPE for
amortized inference of astrophysical binary black hole systems from
gravitational-wave observations. We show that GNPE achieves state-of-the-art
accuracy while reducing inference times by three orders of magnitude.
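A schematic of the pose-standardization loop may help. Everything named below (npe_sample, transform, pose_of) is a hypothetical placeholder for, respectively, a trained conditional density estimator, the group action on the data, and the pose coordinates of a parameter draw; the additive pose update assumes a simple translation group such as a time shift. It is a sketch of the idea, not the authors' implementation.

```python
# Schematic sketch of a GNPE-style fixed-point iteration. `npe_sample`,
# `transform`, and `pose_of` are hypothetical placeholders supplied by
# the user. The additive pose update assumes a translation group (e.g. a
# time shift); other groups compose differently.

def gnpe_sample(x, npe_sample, transform, pose_of, n_rounds=10):
    pose = 0.0                          # initial pose proxy
    theta = None
    for _ in range(n_rounds):
        x_std = transform(x, pose)      # move data toward the standard pose
        theta = npe_sample(x_std)       # posterior draw in the standard frame
        pose = pose + pose_of(theta)    # fold the residual pose back in
    return theta, pose
```

Because the network only ever sees data in (approximately) standard pose, it never has to learn the equivariance itself, which is why the scheme is architecture-independent.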
Whither PQL?
Generalized linear mixed models (GLMMs) are generalized linear models with normally distributed random effects in the linear predictor. Penalized quasi-likelihood (PQL), an approximate method of inference in GLMMs, involves repeated fitting of linear mixed models with “working” dependent variables and iterative weights that depend on parameter estimates from the previous cycle of iteration. The generality of PQL, and its implementation in commercially available software, have encouraged the application of GLMMs in many scientific fields. Caution is needed, however, since PQL may sometimes yield badly biased estimates of variance components, especially with binary outcomes.
Recent developments in numerical integration, including adaptive Gaussian quadrature, higher order Laplace expansions, stochastic integration and Markov chain Monte Carlo (MCMC) algorithms, provide attractive alternatives to PQL for approximate likelihood inference in GLMMs. Analyses of some well-known datasets, and simulations based on these analyses, suggest that PQL still performs remarkably well in comparison with more elaborate procedures in many practical situations. Adaptive Gaussian quadrature is a viable alternative for nested designs where the numerical integration is limited to a small number of dimensions. Higher order Laplace approximations hold the promise of accurate inference more generally. MCMC is likely the method of choice for the most complex problems that involve high dimensional integrals.
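The PQL iteration described above is compact enough to sketch. In the code below, fit_weighted_lmm is a hypothetical stand-in for any weighted linear mixed-model fitter that returns the fitted linear predictor; the sketch assumes a binomial GLMM with logit link and the dispersion fixed at one.

```python
import numpy as np

# Sketch of the PQL iteration for a binomial GLMM with logit link.
# `fit_weighted_lmm` is a hypothetical stand-in for a weighted linear
# mixed-model fitter that returns the fitted linear predictor given the
# working response z, fixed-effect design X, random-effect design Z, and
# the iterative weights.

def pql(y, X, Z, fit_weighted_lmm, n_iters=25, tol=1e-6):
    eta = np.zeros(len(y))                      # linear predictor
    for _ in range(n_iters):
        mu = 1.0 / (1.0 + np.exp(-eta))         # inverse logit
        w = np.maximum(mu * (1.0 - mu), 1e-10)  # iterative weights
        z = eta + (y - mu) / w                  # "working" dependent variable
        eta_new = fit_weighted_lmm(z, X, Z, weights=w)
        if np.max(np.abs(eta_new - eta)) < tol:
            break
        eta = eta_new
    return eta
```

The bias problem with binary outcomes arises because the Laplace-style approximation behind the working response is poorest when each observation carries so little information, which is exactly the regime the quadrature and MCMC alternatives target.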
Approximate Bayesian inference for individual-based models with emergent dynamics
Individual-based models are used in a variety of scientific domains to study systems composed of multiple agents that interact
with one another and lead to complex emergent dynamics at the macroscale. A standard approach in the analysis of these systems is
to specify the microscale interaction rules in a simulation model, run simulations, and then qualitatively compare outputs to empirical
observations. Recently, more robust methods of inference for these types of models have been
introduced, notably approximate Bayesian computation; however, major challenges remain due to the
computational cost of simulations and the nonlinear nature of many complex systems. Here, we
compare two methods of approximate inference in a classic individual-based model of group dynamics
with well-studied nonlinear macroscale behaviour; we employ a Gaussian-process-accelerated ABC
method, both with an approximated likelihood and with a synthetic likelihood. We compare the
accuracy of results when re-inferring parameters using a measure of macroscale disorder (the
order parameter) as a summary statistic. Our findings reveal that for a canonical simple model of
animal collective movement, parameter inference is accurate and computationally efficient, even
when the model is poised at the critical transition between order and disorder.
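For concreteness, a bare-bones Gaussian synthetic log-likelihood looks as follows. Here simulate and summarize are user-supplied placeholders (for example, the collective-motion simulator and the order-parameter summary statistic), and the ridge term is a numerical convenience rather than part of the method.

```python
import numpy as np

# Bare-bones Gaussian synthetic log-likelihood (in the spirit of Wood,
# 2010). `simulate` and `summarize` are user-supplied placeholders, e.g.
# the collective-motion simulator and the order-parameter summary.

def synthetic_loglik(theta, simulate, summarize, s_obs, n_reps=100):
    sims = np.array([summarize(simulate(theta)) for _ in range(n_reps)])
    sims = sims.reshape(n_reps, -1)             # allow scalar summaries
    mean = sims.mean(axis=0)
    d = sims.shape[1]
    cov = np.cov(sims, rowvar=False).reshape(d, d)
    cov += 1e-8 * np.eye(d)                     # ridge for stability
    diff = np.atleast_1d(s_obs) - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (quad + logdet + d * np.log(2.0 * np.pi))
```

Plugging this log-likelihood into a standard MCMC or importance sampler over the parameters then yields the approximate posterior; the Gaussian-process acceleration in the paper replaces repeated simulation calls with a cheap surrogate.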
Estimating False Discovery Proportion Under Arbitrary Covariance Dependence
Multiple hypothesis testing is a fundamental problem in high dimensional
inference, with wide applications in many scientific fields. In genome-wide
association studies, tens of thousands of tests are performed simultaneously to
find if any SNPs are associated with some traits and those tests are
correlated. When test statistics are correlated, false discovery control
becomes very challenging under arbitrary dependence. In the current paper, we
propose a novel method based on principal factor approximation, which
subtracts the common dependence and significantly weakens the correlation
structure, allowing us to handle an arbitrary dependence structure. We
derive an approximate expression for false discovery proportion (FDP) in large
scale multiple testing when a common threshold is used and provide a consistent
estimate of realized FDP. This result has important applications in controlling
FDR and FDP. Our estimate of realized FDP compares favorably with Efron's
(2007) approach, as demonstrated in the simulated examples. Our approach is
further illustrated by some real data applications. We also propose a
dependence-adjusted procedure, which is more powerful than the fixed threshold
procedure.
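The core computation can be sketched as follows. The least-squares estimate of the realized factors and the 90% trimming rule below are crude stand-ins for the paper's robust procedure, the number of factors k is taken as given, and summing over all tests upper-bounds the sum over true nulls.

```python
import numpy as np
from scipy.stats import norm

# Sketch of a principal-factor FDP estimate in the spirit of this paper.
# `Sigma` is the (known or estimated) correlation matrix of the z-values.

def estimate_fdp(z, Sigma, t, k=3):
    """Approximate realized FDP at two-sided p-value threshold t."""
    p = len(z)
    vals, vecs = np.linalg.eigh(Sigma)
    B = vecs[:, -k:] * np.sqrt(np.maximum(vals[-k:], 0.0))  # factor loadings
    keep = np.argsort(np.abs(z))[: int(0.9 * p)]   # trim the largest |z|
    W, *_ = np.linalg.lstsq(B[keep], z[keep], rcond=None)  # realized factors
    eta = B @ W                                    # common dependence
    a = 1.0 / np.sqrt(np.clip(1.0 - np.sum(B**2, axis=1), 1e-10, None))
    zt = norm.ppf(t / 2.0)                         # negative cutoff
    # Expected false discoveries given the realized factors, summed over
    # all tests (an upper bound on the sum over true nulls).
    V = np.sum(norm.cdf(a * (zt + eta)) + norm.cdf(a * (zt - eta)))
    R = np.sum(2.0 * norm.sf(np.abs(z)) <= t)      # total rejections
    return V / max(R, 1)
```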