961 research outputs found

    Merging expert and empirical data for rare event frequency estimation : pool homogenisation for empirical Bayes models

    Empirical Bayes provides one approach to estimating the frequency of rare events as a weighted average of the frequencies of an event and a pool of events. The pool will draw upon, for example, events with similar precursors. The higher the degree of homogeneity of the pool, the more accurate the Empirical Bayes estimator will be. We propose and evaluate a new method using homogenisation factors under the assumption that events are generated from a Homogeneous Poisson Process. The homogenisation factors are scaling constants, which can be elicited through structured expert judgement and used to align the frequencies of different events, hence homogenising the pool. The estimation error relative to the homogeneity of the pool is examined theoretically, indicating that reduced error is associated with greater pool homogeneity. The effects of misspecified expert assessments of the homogenisation factors are examined theoretically and through simulation experiments. Our results show that the proposed Empirical Bayes method using homogenisation factors is robust under different degrees of misspecification.
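
The weighted-average idea in this abstract can be illustrated with a minimal sketch. It assumes a Gamma-Poisson model in which each event's count is Poisson with mean `factor * mu * exposure`, and it fits the Gamma prior with a crude moment estimator. The function name `eb_rates`, the moment estimator, and the sample data are all illustrative assumptions, not the authors' method.

```python
from statistics import mean, variance

def eb_rates(counts, exposures, factors):
    """Empirical Bayes rates on a pool homogenised by scaling factors.

    Assumed model (hypothetical): count_i ~ Poisson(factor_i * mu_i * exposure_i),
    with mu_i ~ Gamma(shape=alpha, rate=beta) shared across the pool.
    """
    # Effective exposure after applying each homogenisation factor
    eff = [h * t for h, t in zip(factors, exposures)]
    raw = [n / e for n, e in zip(counts, eff)]   # homogenised raw rates
    pool_mean = sum(counts) / sum(eff)           # pooled homogenised rate
    # Crude moment estimate of the between-event variance of mu_i
    noise = pool_mean * mean(1.0 / e for e in eff)
    between = max(variance(raw) - noise, 1e-12)
    alpha = pool_mean ** 2 / between
    beta = pool_mean / between
    results = []
    for n, e, h in zip(counts, eff, factors):
        weight = beta / (beta + e)               # weight given to the pool mean
        homogenised = (alpha + n) / (beta + e)   # posterior mean of mu_i
        results.append((weight, h * homogenised))  # rescale to the event's own rate
    return pool_mean, results

# Hypothetical data: four events, with made-up expert scaling factors
counts = [0, 6, 1, 30]
exposures = [10.0, 12.0, 8.0, 20.0]
factors = [1.0, 1.5, 0.8, 2.0]
pool_mean, results = eb_rates(counts, exposures, factors)
for (weight, rate), n in zip(results, counts):
    print(f"count={n:2d}  pool weight={weight:.3f}  EB rate={rate:.4f}")
```

Each homogenised estimate equals `weight * pool_mean + (1 - weight) * raw_rate`, so events with little exposure are pulled more strongly toward the pool.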

    Evaluating traffic safety network screening: an initial framework utilizing the hierarchical Bayesian philosophy

    Highway crashes result in over 40,000 deaths per year (500,000 worldwide). Their impact on the national economy is estimated at more than 230 billion dollars. Highway safety is the top priority of the United States Department of Transportation (US DOT). Funds dedicated to the problem are expected to increase substantially. Highway safety is a multidisciplinary issue. An important tool is the safety improvement candidate location (SICL) list. SICL lists identify high crash locations for potential mitigation. SICL lists are developed using crash data. Crash frequency, rate, or loss is used to rank the worst locations. Classical statistical techniques are applied. In some cases, simple frequency analyses are used to draw attention to problem locations. Simple ranked lists suffer from methodological and practical limitations. Chief among these is the inability to identify sites with promise, that is, sites where mitigation has the best chance of success. Agencies representing engineering and enforcement generally examine top sites prior to resource dedication. This is resource intensive, and efforts of different safety interests are often not well coordinated. For over 20 years, the empirical Bayesian (EB) approach has been proposed to address these limitations. EB identifies sites where mitigation might be most effective, increases estimate confidence, and provides information on relative site safety. EB is being widely implemented at the national level. State and local agencies continue SICL development based on long-standing procedures. EB allows decision makers to more reliably estimate the crash reduction potential at specific sites. However, EB requires development of safety performance functions for road type classes. The technique also requires a priori development of accident modification factors. These requirements add significant expense. Powerful computers and advanced statistical sampling techniques allow hierarchical Bayesian statistics to be applied to highway safety. The hierarchical Bayesian approach eliminates the need for a priori functions and factors. This approach can readily incorporate additional information. It can also explicitly identify important relationships between causal factors and safety performance. The approach uses data to define results, based on an analyst-specified level of uncertainty. This dissertation discusses SICL list development and evaluates the potential of Bayesian statistics to improve their utility.

    Comparison study on AIS data of ship traffic behavior

    AIS (Automatic Identification System) data provides valuable input parameters in ship traffic simulation models for maritime risk analysis and the prevention of shipping accidents. This article reports on detailed comparisons of AIS data analysis between a Dutch case and a Chinese case. The analysis focuses on restricted waterways to support inland waterway simulations, comparing a narrow waterway in the Netherlands (the Port of Rotterdam) with a wide one in China (a wide waterway of the Yangtze River close to the SuTong Bridge). It is shown that straightforward statistical distributions can be used to characterise lateral position, speed, heading and interval times for different types and sizes of ships. However, the distributions for different characteristics of ship behaviours differ significantly.

    The regression analysis of group truncated data

    This thesis considers the regression modelling of grouped binary data that is subject to truncation, and explores some general issues relating to truncation. The likelihood for simple binary and ordinal models is developed and the statistical behaviour of these models is explored. The models are found to be well behaved. The efficiency of the truncated model is compared with that of conditional logistic regression, a competing technique. It is found that the truncated model is always more efficient but requires additional assumptions about the data generation process to be applicable. The estimation of the full sample size, N, before truncation occurs is considered in quite general regression models. The case where the covariate distribution is discrete is considered first. This is extended to allow continuous covariates, and the additional difficulties involved are explored. The issue of setting confidence intervals for N is discussed. A simulation study is used to explore the method's behaviour. Next, the Bayesian analysis of truncated regression models is considered. The use of the empirical distribution of the observed covariates to facilitate the analysis is explored. The posterior distribution of the model's parameters under this approach is derived and a Gibbs sampling algorithm is implemented to explore the posterior. The convergence properties of the algorithm are considered, and the technique's behaviour is assessed in a small simulation study. The effect of over-dispersion on the analysis of group truncated binary data is considered. The available methods of introducing over-dispersion in clustered binary data are discussed and it is argued that only random effects models provide a viable approach. Parameter estimation in these models is derived via a marginal likelihood. In addition, a score test is constructed to test for the presence of random effects in group truncated binary data. The method's performance is demonstrated using a simulation study. Finally, the use of the bootstrap to estimate the sampling distribution of parameter estimates from truncated data is considered in an appendix. The inherent limitations of using resampling methodologies to investigate truncated data are demonstrated. It is shown that the nonparametric advantages of the bootstrap are not realised with truncated data due to the lack of observations on the truncated class.

    There is Nothing Magical About Bayesian Statistics: An Introduction to Epistemic Probabilities in Data Analysis for Psychology Starters

    This paper is a reader-friendly introduction to Bayesian inference applied to psychological science. We begin by explaining the difference between the frequentist and epistemic interpretations of probability that underpin frequentist and Bayesian statistics, respectively. We use a concrete example, a student wondering whether s/he carries the virus statisticus malignum, to explain how the two approaches differ from one another. We illustrate Bayesian inference with intuitive examples before introducing the mathematical framework. Different schools of thought and recommendations are discussed to illustrate how to use priors in Bayes Factor testing. We discuss how psychology could benefit from a greater reliance on Bayesian methods. Finally, we illustrate how to compute Bayes Factor analyses with real data and provide the R code.
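
The kind of reasoning behind the virus example can be sketched with Bayes' rule for a diagnostic test. The prevalence, sensitivity, and false positive rate below are hypothetical values chosen for illustration; they are not taken from the paper, and this Python sketch is not the R code the paper provides.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Probability of carrying the virus after one positive test (Bayes' rule)."""
    true_pos = prior * sensitivity                   # P(positive and infected)
    false_pos = (1.0 - prior) * false_positive_rate  # P(positive and not infected)
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false positive rate
posterior = posterior_given_positive(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(infected | positive test) = {posterior:.3f}")
```

With these assumed inputs the posterior is well below the test's sensitivity, because the low prior probability dominates: most positive results come from the much larger uninfected group.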

    Eliciting hyperparameters of prior distributions for the parameters of paired comparison models


    Bayesian Hierarchical Factor Regression Models to Infer Cause of Death From Verbal Autopsy Data

    In low-resource settings where vital registration of death is not routine it is often of critical interest to determine and study the cause of death (COD) for individuals and the cause-specific mortality fraction (CSMF) for populations. Post-mortem autopsies, considered the gold standard for COD assignment, are often difficult or impossible to implement due to deaths occurring outside the hospital, expense, and/or cultural norms. For this reason, Verbal Autopsies (VAs) are commonly conducted, consisting of a questionnaire administered to next of kin recording demographic information, known medical conditions, symptoms, and other factors for the decedent. This article proposes a novel class of hierarchical factor regression models that avoid restrictive assumptions of standard methods, allow both the mean and covariance to vary with COD category, and can include covariate information on the decedent, region, or events surrounding death. Taking a Bayesian approach to inference, this work develops an MCMC algorithm and validates the FActor Regression for Verbal Autopsy (FARVA) model in simulation experiments. An application of FARVA to real VA data shows improved goodness-of-fit and better predictive performance in inferring COD and CSMF over competing methods. Code and a user manual are made available at https://github.com/kelrenmor/farva

    Bayesian Approach on Quantifying the Safety Effects of Pedestrian Countdown Signals to Drivers

    Pedestrian countdown signals (PCSs) are viable traffic control devices that assist pedestrians in crossing intersections safely. Although PCSs are meant for pedestrians, they also have an impact on drivers’ behavior at intersections. This study evaluates the safety effectiveness of PCSs for drivers in the cities of Jacksonville and Gainesville, Florida. The study employs two Bayesian approaches, before-and-after empirical Bayes (EB) and full Bayes (FB) with a comparison group, to quantify the safety impacts of PCSs on drivers. Specifically, crash modification factors (CMFs), estimated using these two methods, were used to evaluate the safety effects of PCSs on drivers. Apart from establishing CMFs, crash modification functions (CMFunctions) were also developed to observe the relationship between CMFs and traffic volume. The CMFs were established for distinct categories of crashes based on crash type (rear-end and angle collisions) and severity level (total, fatal and injury (FI), and property damage only (PDO) collisions). The CMF findings using the EB approach indicated that installing PCSs results in a significant improvement in driver safety, at the 95% confidence level, with an 8.8% reduction in total crashes, an 8.0% reduction in rear-end crashes, and a 7.1% reduction in PDO crashes. In addition, FI crashes were observed to be reduced by 4.8%, whereas a 4.6% reduction in angle crashes was observed. In the case of the FB approach, PCSs were observed to be effective and significant, at a 95% Bayesian credible interval (BCI), for total (Mean = 0.894, 95% BCI (0.828, 0.911)), PDO (Mean = 0.908, 95% BCI (0.838, 0.953)), and rear-end (Mean = 0.920, 95% BCI (0.842, 0.942)) crashes. The results for two crash categories, FI (Mean = 0.957, 95% BCI (0.886, 1.020)) and angle (Mean = 0.969, 95% BCI (0.931, 1.022)) crashes, are less than one but are not significant at the 95% BCI. Also discussed in this study are the CMFunctions, showing the relationship between the developed CMFs and total entering traffic volume, obtained by combining the total traffic on the major and the minor approaches. In addition, the CMFunctions developed using the FB approach indicated the relationship between the estimated CMFs and the post-treatment year. The CMFunctions developed in this study clearly show that the treatment effectiveness varies considerably with post-treatment time and traffic volume. Moreover, using the FB methodology, the results suggest the treatment effectiveness increased over time in the post-treatment years for the crash categories with two important indicators of effectiveness, i.e., total and PDO, and rear-end crashes. Nevertheless, the treatment effectiveness on rear-end crashes is observed to decline with post-treatment time, although the base value is still less than one for all three years. In summary, the results suggest the usefulness of PCSs for drivers.
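
As a reading aid for the FB figures quoted in this abstract, the sketch below converts each posterior mean CMF to an implied percent change in crashes, `(1 - CMF) * 100`, and flags a category as significant when the whole 95% BCI lies below one. The values are copied from the abstract; the variable names and the helper function are illustrative assumptions.

```python
# FB results quoted in the abstract: category -> (mean CMF, BCI lower, BCI upper)
fb_cmfs = {
    "total": (0.894, 0.828, 0.911),
    "PDO": (0.908, 0.838, 0.953),
    "rear-end": (0.920, 0.842, 0.942),
    "FI": (0.957, 0.886, 1.020),
    "angle": (0.969, 0.931, 1.022),
}

def summarise(cmf_mean, lower, upper):
    """Return (percent crash reduction, whether the interval excludes 1)."""
    reduction = (1.0 - cmf_mean) * 100.0
    # A CMF of 1 means no effect, so an interval containing 1 is not significant
    significant = upper < 1.0 or lower > 1.0
    return reduction, significant

summary = {name: summarise(*values) for name, values in fb_cmfs.items()}
for name, (reduction, significant) in summary.items():
    label = "significant" if significant else "not significant"
    print(f"{name:9s} reduction = {reduction:4.1f}%  ({label} at the 95% BCI)")
```

Under this rule the total, PDO, and rear-end categories are flagged as significant and the FI and angle categories are not, which matches the abstract's description.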