
    Bayesian Conditional Density Filtering

    We propose a Conditional Density Filtering (C-DF) algorithm for efficient online Bayesian inference. C-DF adapts MCMC sampling to the online setting, sampling from approximations to conditional posterior distributions obtained by propagating surrogate conditional sufficient statistics (a function of data and parameter estimates) as new data arrive. These quantities eliminate the need to store or process the entire dataset simultaneously and offer a number of desirable features. Often, these include a reduction in memory requirements and runtime and improved mixing, along with state-of-the-art parameter inference and prediction. These improvements are demonstrated through several illustrative examples, including an application to high-dimensional compressed regression. Finally, we show that C-DF samples converge to the target posterior distribution asymptotically as sampling proceeds and more data arrive. Comment: 41 pages, 7 figures, 12 tables
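
The idea of propagating sufficient statistics instead of storing the data can be illustrated with a much simpler case than C-DF itself. The sketch below is not the C-DF algorithm: it assumes a conjugate normal model with known noise variance, where the exact sufficient statistics are just a count and a sum. The names `RunningStats` and `posterior_mean_var` and all default values are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Sufficient statistics for a normal mean with known noise variance."""
    n: int = 0          # number of observations seen so far
    total: float = 0.0  # running sum of the observations

    def update(self, batch):
        # Fold a new batch into the statistics; the batch can then be discarded.
        self.n += len(batch)
        self.total += sum(batch)


def posterior_mean_var(stats, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    # Standard conjugate update for a normal prior and normal likelihood.
    precision = 1.0 / prior_var + stats.n / noise_var
    mean = (prior_mean / prior_var + stats.total / noise_var) / precision
    return mean, 1.0 / precision


stats = RunningStats()
for batch in ([1.0, 2.0], [3.0]):  # data arriving in two batches
    stats.update(batch)
print(posterior_mean_var(stats))
```

Memory use stays constant in the number of observations because only `n` and `total` are kept. C-DF, as described in the abstract, extends this idea to surrogate statistics that also depend on current parameter estimates.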

    On Obtaining Stable Rankings

    Decision making is challenging when there is more than one criterion to consider. In such cases, it is common to assign a goodness score to each item as a weighted sum of its attribute values and rank them accordingly. Clearly, the ranking obtained depends on the weights used for this summation. Ideally, one would want the ranked order not to change if the weights are changed slightly. We call this property stability of the ranking. A consumer of a ranked list may trust the ranking more if it has high stability. A producer of a ranked list prefers to choose weights that result in a stable ranking, both to earn the trust of potential consumers and because a stable ranking is intrinsically likely to be more meaningful. In this paper, we develop a framework that can be used to assess the stability of a provided ranking and to obtain a stable ranking within an "acceptable" range of weight values (called "the region of interest"). We address the case where the user cares about the rank order of the entire set of items, and also the case where the user cares only about the top-k items. Using a geometric interpretation, we propose algorithms that produce stable rankings. In addition to theoretical analyses, we conduct extensive experiments on real datasets that validate our proposal.
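
The notion of stability described above can be approximated numerically. The sketch below is a rough Monte Carlo illustration, not the geometric algorithms proposed in the paper: it perturbs each weight uniformly within an assumed `radius` and reports the fraction of perturbations that leave the ranking unchanged. The function names, the perturbation scheme, and the sample data are all assumptions made for illustration.

```python
import random


def rank_items(items, weights):
    # Score each item as a weighted sum of its attributes, highest score first.
    scores = [sum(w * a for w, a in zip(weights, attrs)) for attrs in items]
    return sorted(range(len(items)), key=lambda i: -scores[i])


def estimate_stability(items, weights, radius=0.05, trials=1000, seed=0):
    # Fraction of small random weight perturbations that keep the ranking intact.
    rng = random.Random(seed)
    base = rank_items(items, weights)
    unchanged = 0
    for _ in range(trials):
        perturbed = [max(0.0, w + rng.uniform(-radius, radius)) for w in weights]
        if rank_items(items, perturbed) == base:
            unchanged += 1
    return unchanged / trials


dominated = [(0.9, 0.9), (0.5, 0.5), (0.1, 0.1)]  # clear order under any positive weights
balanced = [(1.0, 0.0), (0.0, 1.0)]               # order flips with the weights
print(estimate_stability(dominated, (0.5, 0.5)))
print(estimate_stability(balanced, (0.5, 0.5)))
```

A ranking in which one item dominates the next on every attribute is unaffected by small weight changes, while a ranking that depends on an exact balance between weights is fragile.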

    Functional foods: a conceptual model for assessing their safety and effectiveness

    This report shows that the product-diet dilemma can be solved by developing a predictive model. The model integrates food intake data, dynamic consumption patterns and the production chain model and combines them with a risk-benefit approach.

    Bayesian Markov-chain-Monte-Carlo inversion of time-lapse cross hole ground-penetrating radar data to characterize the vadose zone at the Arrenaes field site, Denmark

    The ground-penetrating radar (GPR) geophysical method has the potential to provide valuable information on the hydraulic properties of the vadose zone because of its strong sensitivity to soil water content. In particular, recent evidence has suggested that the stochastic inversion of crosshole GPR traveltime data can allow for a significant reduction in uncertainty regarding subsurface van Genuchten-Mualem (VGM) parameters. Much of the previous work on the stochastic estimation of VGM parameters from crosshole GPR data has considered the case of steady-state infiltration conditions, which represent only a small fraction of practically relevant scenarios. We explored in detail the dynamic infiltration case, specifically examining to what extent time-lapse crosshole GPR traveltimes, measured during a forced infiltration experiment at the Arrenaes field site in Denmark, could help to quantify VGM parameters and their uncertainties in a layered medium, as well as the corresponding soil hydraulic properties. We used a Bayesian Markov-chain-Monte-Carlo inversion approach. We first explored the advantages and limitations of this approach with regard to a realistic synthetic example before applying it to field measurements. In our analysis, we also considered different degrees of prior information. Our findings indicate that the stochastic inversion of the time-lapse GPR data does indeed allow for a substantial refinement in the inferred posterior VGM parameter distributions compared with the corresponding priors, which in turn significantly improves knowledge of soil hydraulic properties. Overall, the results obtained clearly demonstrate the value of the information contained in time-lapse GPR data for characterizing vadose zone dynamics.

    Data inaccuracy quantification and uncertainty propagation for bibliometric indicators

    This study introduces an approach to estimate the uncertainty in bibliometric indicator values that is caused by data errors. This approach utilizes Bayesian regression models, estimated from empirical data samples, which are used to predict error-free data. Through direct Monte Carlo simulation (drawing predicted data from the estimated regression models a large number of times for the same input data), probability distributions for indicator values can be obtained, which provide information on their uncertainty due to data errors. It is demonstrated how uncertainty in base quantities, such as the number of publications of a unit of certain document types and the number of citations of a publication, can be propagated along a measurement model into final indicator values. This method can be used to estimate the uncertainty of indicator values due to sources of errors with known error distributions. The approach is demonstrated with simple synthetic examples for instructive purposes and with real bibliometric research evaluation data to show its possible application in practice. Comment: 31 pages, 5 figures
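
The Monte Carlo propagation step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Bayesian regression models from the study: it assumes a hypothetical error model in which each recorded citation is spurious with a fixed probability `error_rate`, and it uses mean citations per publication as the indicator. All names, the error model, and the sample counts are invented for illustration.

```python
import random


def draw_true_count(recorded, error_rate, rng):
    # Hypothetical error model: each recorded citation is genuine with
    # probability 1 - error_rate, independently of the others.
    return sum(1 for _ in range(recorded) if rng.random() >= error_rate)


def indicator_distribution(recorded_counts, error_rate=0.05, draws=2000, seed=0):
    # Repeatedly draw "error-free" data and recompute the indicator
    # (mean citations per publication) for each draw.
    rng = random.Random(seed)
    values = []
    for _ in range(draws):
        corrected = [draw_true_count(c, error_rate, rng) for c in recorded_counts]
        values.append(sum(corrected) / len(corrected))
    return values


def interval(values, level=0.95):
    # Simple empirical interval taken from the sorted simulated values.
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(tail * (len(ordered) - 1))]
    hi = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lo, hi


recorded = [10, 20, 30]  # made-up citation counts for three publications
print(interval(indicator_distribution(recorded)))
```

The spread of the simulated indicator values is what expresses the uncertainty due to data errors; a more faithful version would replace `draw_true_count` with draws from a fitted error model.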

    Modeling methane emissions from US natural gas operations: national gathering station emission factor development and facility/regional-scale top-down to bottom-up reconciliations

    2017 Summer. Includes bibliographical references. United States natural gas dry production increased by 47% between 2005 and 2015 due to the widespread use of horizontal drilling and hydraulic fracturing to extract gas from shale and other tight formations. Natural gas production and consumption are projected to continue to increase for the foreseeable future. In 2016, the natural gas supply chain delivered 29% of the energy used in the U.S., and natural gas surpassed coal as the leading electricity generating source for the first time in U.S. history. When combusted, natural gas produces less CO2 per unit energy released compared to coal or petroleum. However, uncombusted methane (the primary component of natural gas) has a global warming potential 30 times higher than CO2 on a 100-year time horizon (including oxidation to CO2, but excluding climate-carbon feedbacks). Therefore, the net greenhouse gas impacts resulting from displacement of coal and petroleum by natural gas depend on the emission rate of uncombusted natural gas. Short-term climate benefits resulting from coal substitution, for example, are lost if the net rate of methane (CH4) emission from the natural gas supply chain exceeds 3–4%. Three studies were conducted to quantify CH4 emissions from the natural gas industry. In particular, these studies focused on quantifying emissions from the gathering and processing sector and reconciling emissions estimates developed using top-down (tracer flux and aircraft) vs. bottom-up (on-site component-level) measurement approaches. In the first study, facility-level CH4 emissions measurements were made at 114 natural gas gathering facilities and 16 processing plants in 13 U.S. states during a 20-week field campaign conducted from October 2013 through April 2014. Measurement results were combined with facility counts obtained from state air permit databases and national inventories in a Monte Carlo simulation to estimate CH4 emissions from U.S. natural gas gathering and processing operations. Annual CH4 emissions from normal operations at gathering facilities totaled 1699 Gg (95% CI = 1539–1863 Gg), while normal operations at processing plants totaled 505 Gg (95% CI = 459–548 Gg). CH4 emissions from abnormal operations at gathering facilities were estimated in a separate Monte Carlo simulation based on field observations and a subset of field measurements. These emissions totaled 169 Gg (+426%/-96%). In the second study, coordinated dual-tracer, aircraft-based, and direct component-level measurements were made at midstream natural gas gathering and boosting stations in the Fayetteville shale in Arkansas, USA. On-site component-level measurements were combined with engineering estimates to generate comprehensive facility-level CH4 emission rate estimates ("study on-site estimates (SOE)") comparable to tracer and aircraft measurements. Concurrent measurements at 14 normally operating facilities showed a strong correlation between tracer and SOE, but indicated that tracer measurements estimated lower emissions (regression of tracer to SOE = 0.91; 95% CI = 0.83–0.99; R2 = 0.89). Tracer and SOE 95% confidence intervals overlapped at 11/14 facilities. Contemporaneous measurements at six facilities suggested that aircraft measurements estimated higher emissions than SOE. Aircraft and SOE 95% confidence intervals overlapped at 3/6 facilities. In the third study, a detailed spatiotemporal inventory model was developed and used to reconcile top-down and bottom-up CH4 emission estimates from natural gas infrastructure and other sources in the Fayetteville shale on two consecutive days. On Thursday, October 1, 2015, 13:00–15:00 CDT, top-down aircraft mass balance flights estimated 28.7 Mg/h (95% CI = 20.1–37.3 Mg/h) from the study area, while the bottom-up ground-level area estimate predicted 23.9 Mg/h (95% CI = 20.9–27.3 Mg/h). On Friday, October 2, 2015, 14:30–16:30 CDT, top-down estimated 36.7 Mg/h (95% CI = 21.3–52.1 Mg/h), while bottom-up estimated 21.1 Mg/h (95% CI = 18.4–24.2 Mg/h). Production and gathering activities were the largest contributors to modeled CH4 emissions. In contrast to prior studies, comparisons on two consecutive days indicated overlapping confidence intervals between top-down aircraft estimates and bottom-up inventory-driven estimates. Operator participation and extensive activity data proved critical in understanding emissions as observed by aircraft. In particular, the agreement obtained was possible only because bottom-up models included the variability in production maintenance activities, which showed substantially higher emissions during daytime hours when aircraft-based measurements were performed. Results indicated that poor activity estimates (counts and timing) for large episodic events likely drive divergence in CH4 emission estimates from production basins, and that even more precise activity data would be required to improve agreement between these two approaches.