
    Statistical inference for generative models with maximum mean discrepancy

    While likelihood-based inference and its variants provide a statistically efficient and widely applicable approach to parametric inference, their application to models involving intractable likelihoods poses challenges. In this work, we study a class of minimum distance estimators for intractable generative models, that is, statistical models for which the likelihood is intractable, but simulation is cheap. The distance considered, maximum mean discrepancy (MMD), is defined through the embedding of probability measures into a reproducing kernel Hilbert space. We study the theoretical properties of these estimators, showing that they are consistent, asymptotically normal and robust to model misspecification. A main advantage of these estimators is the flexibility offered by the choice of kernel, which can be used to trade off statistical efficiency and robustness. On the algorithmic side, we study the geometry induced by MMD on the parameter space and use this to introduce a novel natural gradient descent-like algorithm for efficient implementation of these estimators. We illustrate the relevance of our theoretical results on several classes of models including a discrete-time latent Markov process and two multivariate stochastic differential equation models.
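
    The idea of a minimum MMD estimator can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy location model in `simulate`, the Gaussian kernel with unit lengthscale, the biased V-statistic estimate of the squared MMD, and a plain grid search standing in for the natural gradient descent-like algorithm the abstract describes.

```python
import math
import random


def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian (RBF) kernel on the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def mean_kernel(xs, ys, lengthscale):
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, lengthscale) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, lengthscale=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, lengthscale)
            + mean_kernel(ys, ys, lengthscale)
            - 2.0 * mean_kernel(xs, ys, lengthscale))


def simulate(theta, noise):
    """Hypothetical generative model: shift fixed noise draws by theta."""
    return [theta + z for z in noise]


def minimum_mmd_estimate(data, noise, grid, lengthscale=1.0):
    """Return the grid value whose simulated sample is closest to the data in MMD."""
    return min(grid,
               key=lambda t: mmd_squared(data, simulate(t, noise), lengthscale))


rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(200)]   # observations, true theta = 2
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]  # reused for every candidate theta
grid = [0.25 * i for i in range(17)]               # candidates 0.0, 0.25, ..., 4.0
theta_hat = minimum_mmd_estimate(data, noise, grid)
print("minimum MMD estimate of theta:", theta_hat)
```

    Reusing the same noise draws for every candidate keeps the objective a deterministic function of theta, which makes the grid search well behaved. The lengthscale is the knob the abstract refers to when it mentions trading off efficiency and robustness through the kernel.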

    Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization

    Control variates are a well-established tool to reduce the variance of Monte Carlo estimators. However, for large-scale problems including high-dimensional and large-sample settings, their advantages can be outweighed by a substantial computational cost. This paper considers control variates based on Stein operators, presenting a framework that encompasses and generalizes existing approaches that use polynomials, kernels and neural networks. A learning strategy based on minimizing a variational objective through stochastic optimization is proposed, leading to scalable and effective control variates. Novel theoretical results are presented to provide insight into the variance reduction that can be achieved, and an empirical assessment, including applications to Bayesian inference, is provided in support.
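
    A minimal sketch of a Stein-based control variate fitted by stochastic gradient descent is given below. It is an illustration under stated assumptions, not the paper's method: the target is a standard normal, the integrand `f` is an arbitrary choice with known mean 1, the function class is a two-term polynomial, and the objective is the squared residual of `f(x) + (A g)(x) - c`. The names `stein_features`, `fit_control_variate` and `controlled` are hypothetical.

```python
import math
import random
import statistics


def f(x):
    """Illustrative integrand; E[f(X)] = 1 when X ~ N(0, 1)."""
    return x * x + math.sin(x)


def stein_features(x):
    """Stein operator for N(0, 1), (A g)(x) = g'(x) - x * g(x),
    applied to g(x) = 1 and g(x) = x. Each feature has mean zero."""
    return (-x, 1.0 - x * x)


def fit_control_variate(samples, lr=0.002):
    """One SGD pass minimizing (f(x) + theta . phi(x) - c)^2 over theta and c."""
    theta = [0.0, 0.0]
    c = 0.0
    for x in samples:
        phi = stein_features(x)
        residual = f(x) + theta[0] * phi[0] + theta[1] * phi[1] - c
        theta[0] -= lr * 2.0 * residual * phi[0]
        theta[1] -= lr * 2.0 * residual * phi[1]
        c += lr * 2.0 * residual  # gradient with respect to c has the opposite sign
    return theta, c


def controlled(x, theta):
    """Integrand plus the fitted zero-mean Stein term."""
    phi = stein_features(x)
    return f(x) + theta[0] * phi[0] + theta[1] * phi[1]


rng = random.Random(1)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
held_out = [rng.gauss(0.0, 1.0) for _ in range(2000)]

theta, c = fit_control_variate(train)
plain = [f(x) for x in held_out]
with_cv = [controlled(x, theta) for x in held_out]

print("plain estimator:  mean", statistics.mean(plain),
      "variance", statistics.variance(plain))
print("with control var: mean", statistics.mean(with_cv),
      "variance", statistics.variance(with_cv))
```

    Because the Stein term has mean zero under the target, adding it leaves the estimator unbiased on held-out samples whatever the fitted coefficients are; the fit only affects how much variance is removed.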

    A Comparison of Measured and Self-Reported Blood Pressure Status among Low-Income Housing Residents in New York City

    Self-report is widely used to measure hypertension prevalence in population-based studies, but there is little research comparing self-report with measured blood pressure among low-income populations. The objective of this study was to compare self-reported and measured blood pressure status among a sample of low-income housing residents in New York City (n=118). We completed a cross-sectional analysis comparing self-report with measured blood pressure status. We determined the sensitivity, specificity, and positive predictive value (PPV) of each self-report metric. Of the sample, 68.1% was Black, 71.1% had a household income under $25,000/year, and 28.5% did not complete high school. In our study, there was a discrepancy in the prevalence of hypertension by self-report (30.5%) versus measurement (39.8%). PPV of self-report was 94.4%. Specificity was 97.2%. Hypertension awareness (sensitivity) was 72.3%. Of individuals not reporting hypertension, 15.9% had measurements in the hypertensive range and 43.9% had measurements in the borderline hypertensive range. Our findings suggest that self-reported and objective measures of hypertension are incongruent among low-income housing residents and may have important implications for population-based research among low-income populations.
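
    The relationship between the reported metrics can be made concrete with a 2x2 table. The cell counts below are an assumption: they are not given in the abstract and were back-calculated as one set of counts that sums to n=118 and is consistent with the reported percentages. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Agreement metrics for self-report (test) against measurement (reference).

    tp: self-reported and measured hypertensive
    fp: self-reported but not measured hypertensive
    fn: not self-reported but measured hypertensive
    tn: neither self-reported nor measured hypertensive
    """
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "self_report_prevalence": (tp + fp) / n,
        "measured_prevalence": (tp + fn) / n,
        "missed_among_non_reporters": fn / (fn + tn),
    }


# Assumed counts, reconstructed from the percentages in the abstract (n = 118).
metrics = diagnostic_metrics(tp=34, fp=2, fn=13, tn=69)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

    With these assumed counts, the high PPV and specificity alongside a lower sensitivity reflect few false positives (2) but a larger number of measured-hypertensive participants who did not report hypertension (13).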

    Note on A. Barbour’s paper on Stein’s method for diffusion approximations

    In [2] foundations for diffusion approximation via Stein’s method are laid. This paper has been cited more than 130 times and is a cornerstone in the area of Stein’s method (see, for example, its use in [1] or [7]). A semigroup argument is used in [2] to solve a Stein equation for Gaussian diffusion approximation. We prove that, contrary to the claim in [2], the semigroup considered therein is not strongly continuous on the Banach space of continuous, real-valued functions on D[0,1] growing slower than a cubic, equipped with an appropriate norm. We also provide a proof of the exact formulation of the solution to the Stein equation of interest, which does not require the aforementioned strong continuity. This shows that the main results of [2] hold true.

    Hierarchical Bayesian modeling for knowledge transfer across engineering fleets via multitask learning

    A population-level analysis is proposed to address data sparsity when building predictive models for engineering infrastructure. Utilizing an interpretable hierarchical Bayesian approach and operational fleet data, domain expertise is naturally encoded (and appropriately shared) between different subgroups, representing (1) use-type, (2) component, or (3) operating condition. Specifically, domain expertise is exploited to constrain the model via assumptions (and prior distributions) allowing the methodology to automatically share information between similar assets, improving the survival analysis of a truck fleet (15% and 13% increases in predictive log-likelihood of hazard) and power prediction in a wind farm (up to 82% reduction in the standard deviation of maximum output prediction). In each asset management example, a set of correlated functions is learnt over the fleet, in a combined inference, to learn a population model. Parameter estimation is improved when subfleets are allowed to share correlated information at different levels in the hierarchy; the (averaged) reduction in standard deviation for interpretable parameters in the survival analysis is 70%, alongside 32% in wind farm power models. In turn, groups with incomplete data automatically borrow statistical strength from those that are data-rich. The statistical correlations enable knowledge transfer via Bayesian transfer learning, and the correlations can be inspected to inform which assets share information for which effect (i.e., parameter). Successes in both case studies demonstrate the wide applicability in practical infrastructure monitoring, since the approach is naturally adapted between interpretable fleet models of different in situ examples.

    Enemies with benefits: parasitic endoliths protect mussels against heat stress

    Positive and negative aspects of species interactions can be context dependent and strongly affected by environmental conditions. We tested the hypothesis that, during periods of intense heat stress, parasitic phototrophic endoliths that fatally degrade mollusc shells can benefit their mussel hosts. Endolithic infestation significantly reduced body temperatures of sun-exposed mussels and, during unusually extreme heat stress, parasitised individuals suffered lower mortality rates than nonparasitised hosts. This beneficial effect was related to the white discolouration caused by the excavation activity of endoliths. Under climate warming, species relationships may be drastically realigned and conditional benefits of phototrophic endolithic parasites may become more important than the costs of infestation.

    Interactions Between Policy Effects, Population Characteristics and the Tax-Benefit System: An Illustration Using Child Poverty and Child Related Policies in Romania and the Czech Republic

    We investigate the impact of the Romanian and Czech family policy systems on the poverty risk of families with children. We focus on separating out the effects of policy design itself and the size of benefits from the interaction between policies and population characteristics. We find that interactions between population characteristics, the wider tax-benefit system and child-related policies are pervasive and large. Both population characteristics and the wider tax-benefit environment can dramatically alter the antipoverty effect of a given set of policies.