19 research outputs found

    Statistical post-processing of hydrological forecasts using Bayesian model averaging

    Accurate and reliable probabilistic forecasts of hydrological quantities like runoff or water level are beneficial to various areas of society. Probabilistic state-of-the-art hydrological ensemble prediction models are usually driven with meteorological ensemble forecasts. Hence, biases and dispersion errors of the meteorological forecasts cascade down to the hydrological predictions and add to the errors of the hydrological models. The systematic parts of these errors can be reduced by applying statistical post-processing. For a sound estimation of predictive uncertainty and an optimal correction of systematic errors, statistical post-processing methods should be tailored to the particular forecast variable at hand. Previous studies have shown that it can make sense to treat hydrological quantities as bounded variables. In this paper, a doubly truncated Bayesian model averaging (BMA) method, which allows for flexible post-processing of (multi-model) ensemble forecasts of water level, is introduced. A case study based on water levels at a gauge on the river Rhine reveals good predictive skill of doubly truncated BMA compared with both the raw ensemble and the reference ensemble model output statistics approach. Comment: 19 pages, 6 figures
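As a rough illustration of what a doubly truncated BMA predictive density can look like, the sketch below mixes normal kernels that are truncated to a bounded interval. It assumes one kernel per ensemble member, centred at that member's (already bias-corrected) forecast, with a shared spread; the function names, the shared `sigma`, and the example bounds are hypothetical and not taken from the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncated_normal_pdf(x, mu, sigma, lower, upper):
    """Normal density renormalised to the interval [lower, upper]."""
    if x < lower or x > upper:
        return 0.0
    mass = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
    return normal_pdf(x, mu, sigma) / mass


def bma_density(x, weights, means, sigma, lower, upper):
    """Weighted mixture of doubly truncated normal kernels (weights sum to 1)."""
    return sum(
        w * truncated_normal_pdf(x, m, sigma, lower, upper)
        for w, m in zip(weights, means)
    )
```

Because every kernel is renormalised on the same interval, the mixture assigns no probability outside the bounds and still integrates to one inside them.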

    Data augmentation for models based on rejection sampling

    We present a data augmentation scheme to perform Markov chain Monte Carlo inference for models where data generation involves a rejection sampling algorithm. Our idea, which seems to be missing in the literature, is a simple scheme to instantiate the rejected proposals preceding each data point. The resulting joint probability over observed and rejected variables can be much simpler than the marginal distribution over the observed variables, which often involves intractable integrals. We consider three problems, the first being the modeling of flow-cytometry measurements subject to truncation. The second is a Bayesian analysis of the matrix Langevin distribution on the Stiefel manifold, and the third, Bayesian inference for a nonparametric Gaussian process density model. The latter two are instances of problems where Markov chain Monte Carlo inference is doubly-intractable. Our experiments demonstrate superior performance over state-of-the-art sampling algorithms for such problems. Comment: 6 figures. arXiv admin note: text overlap with arXiv:1311.090
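The core idea of keeping the rejected proposals that precede each accepted data point can be illustrated with a toy truncated normal. The sketch below is only the generative step, not the paper's MCMC scheme; the function name and the choice of a normal proposal with interval truncation are assumptions made for illustration.

```python
import random


def sample_with_rejections(rng, mu, sigma, lower, upper):
    """Draw one point from N(mu, sigma^2) truncated to [lower, upper].

    Returns the accepted draw together with the list of proposals that
    were rejected before it, in the order they were generated.
    """
    rejected = []
    while True:
        proposal = rng.gauss(mu, sigma)
        if lower <= proposal <= upper:
            return proposal, rejected
        rejected.append(proposal)
```

Conditioned on the rejected proposals, the joint density of one accepted point and its rejections is a product of untruncated normal densities with indicator constraints, so the truncation normalising constant no longer appears explicitly.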

    Evaluation of microarray-based DNA methylation measurement using technical replicates: the Atherosclerosis Risk In Communities (ARIC) Study

    Background: DNA methylation is a widely studied epigenetic phenomenon; alterations in methylation patterns influence human phenotypes and risk of disease. As part of the Atherosclerosis Risk in Communities (ARIC) study, the Illumina Infinium HumanMethylation450 (HM450) BeadChip was used to measure DNA methylation in peripheral blood obtained from ~3000 African American study participants. Over 480,000 cytosine-guanine (CpG) dinucleotide sites were surveyed on the HM450 BeadChip. To evaluate the impact of technical variation, 265 technical replicates from 130 participants were included in the study. Results: For each CpG site, we calculated the intraclass correlation coefficient (ICC), which ranges between 0 and 1, to compare the variation of methylation levels within and between replicate pairs. We modeled the distribution of ICC as a mixture of censored or truncated normal and normal distributions using an EM algorithm. The CpG sites were clustered into low- and high-reliability groups according to the calculated posterior probabilities. We also demonstrated the performance of this clustering when applied to a study of association between methylation levels and smoking status of individuals. Of the CpG sites showing genome-wide significant association with smoking status, most (~96%) were in the high-reliability cluster. Conclusions: We suggest that CpG sites with low ICC may be excluded from subsequent association analyses, or that extra caution be taken with associations at such sites.
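A per-site ICC from replicate pairs might be computed along the following lines. This is a minimal sketch using the one-way random-effects estimator for two replicates per participant; the abstract does not say which ICC estimator was used, so the formula and the function name are assumptions. This raw estimator can fall below 0, so the 0 to 1 range reported in the abstract presumably reflects a different estimator or a truncation step.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for a list of (replicate_1, replicate_2) pairs."""
    n = len(pairs)  # number of participants with a replicate pair
    k = 2           # replicates per participant
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    pair_means = [(a + b) / k for a, b in pairs]
    # Between-participant mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    # Within-participant (technical) mean square.
    msw = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, pair_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

Sites whose replicates agree closely relative to the spread across participants score near 1, while sites dominated by technical noise score near or below 0.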