13 research outputs found
Modified confidence intervals for the Mahalanobis distance
Reiser (2001) proposes a method of forming a confidence interval for a Mahalanobis distance that yields intervals with exactly the nominal coverage, but the interval is sometimes (0, 0). We consider the case where the Mahalanobis distance quantifies the difference between an individual and a population mean, and suggest a modification that avoids implausible intervals.
BIMAM: a tool for imputing variables missing across datasets using a Bayesian imputation and analysis model
Motivation: Combination of multiple datasets is routine in modern epidemiology. However, studies may have measured different sets of variables; this is often inefficiently dealt with by excluding studies or dropping variables. Multilevel multiple imputation methods to impute these "systematically" missing data (as opposed to "sporadically" missing data within a study) are available, but problems may arise when many random effects are needed to allow for heterogeneity across studies. We show that the Bayesian IMputation and Analysis Model (BIMAM) implemented in our tool works well in this situation.
General features: BIMAM performs imputation and analysis simultaneously. It imputes both binary and continuous systematically and sporadically missing data, and analyses binary and continuous outcomes. BIMAM is a user-friendly, freely available tool that does not require knowledge of Bayesian methods. BIMAM is an R Shiny application. It is downloadable to a local machine and it automatically installs the required freely available packages (R packages, including R2MultiBUGS and MultiBUGS).
Availability: BIMAM is available at www.alecstudy.org/bimam
On point estimation of the abnormality of a Mahalanobis index
Mahalanobis distance may be used as a measure of the disparity between an individual's profile of scores and the average profile of a population of controls. The degree to which the individual's profile is unusual can then be equated to the proportion of the population who would have a larger Mahalanobis distance than the individual. Several estimators of this proportion are examined. These include plug-in maximum likelihood estimators, medians, the posterior mean from a Bayesian probability matching prior, an estimator derived from a Taylor expansion, and two forms of polynomial approximation, one based on Bernstein polynomials and one on a quadrature method. Simulations show that some estimators, including the commonly used plug-in maximum likelihood estimators, can have substantial bias for small or moderate sample sizes. The polynomial approximations yield estimators that have low bias, with the quadrature method marginally to be preferred over Bernstein polynomials. However, the polynomial estimators sometimes yield infeasible estimates that are outside the 0-1 range. While none of the estimators are perfectly unbiased, the median estimators match their definition; in simulations their estimates of the proportion have a median error close to zero. The standard median estimator can give unrealistically small estimates (including 0) and an adjustment is proposed that ensures estimates are always credible. This latter estimator has much to recommend it when unbiasedness is not of paramount importance, while the quadrature method is recommended when bias is the dominant issue.
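As a rough illustration of the plug-in idea mentioned in this abstract, the sketch below computes a squared Mahalanobis distance for two variables and converts it to an estimated proportion. It assumes multivariate normality and treats the supplied mean and covariance as if they were the true population values, so the squared distance is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(-D²/2). The function names are hypothetical, and this is not one of the bias-corrected estimators the paper recommends.

```python
import math


def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from `mean` under a 2x2 `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form using the closed-form inverse of a 2x2 matrix.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def plug_in_proportion_2d(x, mean, cov):
    """Plug-in estimate of the proportion of the population with a larger distance.

    Assumption: mean and cov are treated as known, so D^2 follows a chi-square
    distribution with 2 degrees of freedom and P(chi2_2 > t) = exp(-t / 2).
    """
    return math.exp(-0.5 * mahalanobis_sq_2d(x, mean, cov))


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(plug_in_proportion_2d((1.0, 1.0), (0.0, 0.0), identity))
```

With estimated rather than known parameters this simple plug-in is the kind of estimator the abstract reports as potentially biased in small samples.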
Eliciting Dirichlet and Gaussian copula prior distributions for multinomial models
In this paper, we propose novel methods of quantifying expert opinion about prior distributions for multinomial models. Two different multivariate priors are elicited using median and quartile assessments of the multinomial probabilities. First, we start by eliciting a univariate beta distribution for the probability of each category. Then we elicit the hyperparameters of the Dirichlet distribution, as a tractable conjugate prior, from those of the univariate betas through various forms of reconciliation using least-squares techniques. However, a multivariate copula function will give a more flexible correlation structure between multinomial parameters if it is used as their multivariate prior distribution. So, second, we use beta marginal distributions to construct a Gaussian copula as a multivariate normal distribution function that binds these marginals and expresses the dependence structure between them. The proposed method elicits a positive-definite correlation matrix of this Gaussian copula. The two proposed methods are designed to be used through interactive graphical software written in Java
Prevalence and Population Attributable Risk for Chronic Airflow Obstruction in a Large Multinational Study
Rationale: The Global Burden of Disease programme identified smoking, and ambient and household air pollution as the main drivers of death and disability from Chronic Obstructive Pulmonary Disease (COPD). Objective: To estimate the attributable risk of chronic airflow obstruction (CAO), a quantifiable characteristic of COPD, due to several risk factors. Methods: The Burden of Obstructive Lung Disease study is a cross-sectional study of adults, aged ≥40, in a globally distributed sample of 41 urban and rural sites. Based on data from 28,459 participants, we estimated the prevalence of CAO, defined as a post-bronchodilator one-second forced expiratory volume to forced vital capacity ratio [...] Measurements and Main Results: Mean prevalence of CAO was 11.2% in men and 8.6% in women. Mean PAR for smoking was 5.1% in men and 2.2% in women. The next most influential risk factors were poor education levels, working in a dusty job for ≥10 years, low body mass index (BMI), and a history of tuberculosis. The risk of CAO attributable to the different risk factors varied across sites. Conclusions: While smoking remains the most important risk factor for CAO, in some areas poor education, low BMI and passive smoking are of greater importance. Dusty occupations and tuberculosis are important risk factors at some sites.
Eliciting prior distributions for extra parameters in some generalized linear models
To elicit an informative prior distribution for a normal linear model or a gamma generalized linear model (GLM), expert opinion must be quantified about both the regression coefficients and the extra parameters of these models. The latter task has attracted comparatively little attention. In this article, we introduce two elicitation methods that aim to complete the prior structure of the normal and gamma GLMs. First, we develop a method of assessing a conjugate prior distribution for the error variance in normal linear models. The method quantifies an expert's opinions through assessments of a median and conditional medians. Second, we propose a novel method for eliciting a lognormal prior distribution for the scale parameter of gamma GLMs. Given the mean value of a gamma distributed response variable, the method is based on conditional quartile assessments. It can also be used to quantify an expert's opinion about the prior distribution for the shape parameter of any gamma random variable, if the mean of the distribution has been elicited or is assumed to be known. In the context of GLMs, the mean value is determined by the regression coefficients. Interactive graphics is the medium through which assessments for the two proposed methods are elicited. Examples illustrating use of the methods are given. Computer programs that implement both methods are available
Eliciting Dirichlet and Connor-Mosimann prior distributions for multinomial models
This paper addresses the task of eliciting an informative prior distribution for multinomial models. We first introduce a method of eliciting univariate beta distributions for the probability of each category, conditional on the probabilities of other categories. Two different forms of multivariate prior are derived from the elicited beta distributions. First, we determine the hyperparameters of a Dirichlet distribution by reconciling the assessed parameters of the univariate beta conditional distributions. Although the Dirichlet distribution is the standard conjugate prior distribution for multinomial models, it is not flexible enough to represent a broad range of prior information. Second, we use the beta distributions to determine the parameters of a Connor-Mosimann distribution, which is a generalization of a Dirichlet distribution and is also a conjugate prior for multinomial models. It has a larger number of parameters than the standard Dirichlet distribution and hence a more flexible structure. The elicitation methods are designed to be used with the aid of interactive graphical user-friendly software.
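To make the link between conditional beta distributions and the Connor-Mosimann (generalized Dirichlet) distribution concrete, the following minimal sketch draws one probability vector by the usual stick-breaking construction: each category takes a beta-distributed share of whatever probability is left over from earlier categories. The function name and the parameter values are hypothetical, and the sketch shows only sampling, not the paper's elicitation procedure.

```python
import random


def sample_connor_mosimann(a, b, rng=random):
    """Draw one probability vector with k = len(a) + 1 categories.

    a[i], b[i] are the beta parameters for category i's share of the
    probability not already assigned to categories 0..i-1.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    probs = []
    remaining = 1.0
    for a_i, b_i in zip(a, b):
        share = rng.betavariate(a_i, b_i)  # conditional share of what is left
        probs.append(remaining * share)
        remaining *= 1.0 - share
    probs.append(remaining)  # the last category takes the remainder
    return probs


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_connor_mosimann([2.0, 3.0], [5.0, 4.0], rng))
```

The standard Dirichlet is recovered as the special case in which each b[i] equals a[i+1] + b[i+1]; relaxing that constraint is what gives the extra parameters the abstract refers to.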
Two-Term Edgeworth Expansions for the Classes of U- and V-statistics
Much effort has been devoted to deriving Edgeworth expansions for various classes of statistics that are asymptotically normally distributed, with derivations tailored to the individual structure of each class. Expansions with smaller error rates are needed for more accurate statistical inference. Two such Edgeworth expansions are derived analytically in this paper. One is a two-term expansion for the standardized U-statistic of order m, m ≥ 3, with an error rate o(n^{-1}). The other is an expansion with the same error rate for the distribution of the standardized V-statistic of the same order. In deriving the Edgeworth expansion, we made use of the close connection between the V- and U-statistics, which allows us first to derive the needed expansion for the related U-statistic and then extend it to the V-statistic, taking into consideration the estimation of all difference terms between the two statistics.
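For orientation only, the textbook two-term Edgeworth expansion for the simplest case, the standardized mean of n independent and identically distributed observations, has the form below. It is not the U- or V-statistic expansion derived in the paper. The notation is an assumption introduced here: \mu and \sigma are the mean and standard deviation, \gamma_1 the skewness, \gamma_2 the excess kurtosis, and \Phi and \phi the standard normal distribution and density functions; the expansion holds under a finite fourth moment and Cramér's condition.

```latex
P\!\left(\frac{\sqrt{n}\,(\bar{X}_n-\mu)}{\sigma}\le x\right)
= \Phi(x) - \phi(x)\left[\frac{\gamma_1\,(x^2-1)}{6\sqrt{n}}
+ \frac{1}{n}\left(\frac{\gamma_2\,(x^3-3x)}{24}
+ \frac{\gamma_1^2\,(x^5-10x^3+15x)}{72}\right)\right] + o(n^{-1})
```

The first bracketed term corrects the normal approximation for skewness at rate n^{-1/2}, and the second corrects for kurtosis and squared skewness at rate n^{-1}, which is what leaves a remainder of order o(n^{-1}).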
Prior distribution elicitation for generalized linear and piecewise-linear models
An elicitation method is proposed for quantifying subjective opinion about the regression coefficients of a generalized linear model. Opinion about the relationship between a continuous predictor variable and the dependent variable is modelled by a piecewise-linear function, giving a flexible model that can represent a wide variety of opinion. To quantify his or her opinions, the expert uses an interactive computer program, performing assessment tasks that involve drawing graphs and bar-charts to specify medians and other quantiles. Opinion about the regression coefficients is represented by a multivariate normal distribution whose parameters are determined from the assessments. It is practical to use the procedure with models containing a large number of parameters. This is illustrated through practical examples and the benefit from using prior knowledge is examined through cross-validation.
Locally correct confidence intervals for a binomial proportion: A new criteria for an interval estimator
Well-recommended methods of forming "confidence intervals" for a binomial proportion give interval estimates that do not actually meet the definition of a confidence interval, in that their coverages are sometimes lower than the nominal confidence level. The methods are favoured because their intervals have a shorter average length than the Clopper-Pearson (gold-standard) method, whose intervals really are confidence intervals. As the definition of a confidence interval is not being adhered to, another criterion for forming interval estimates for a binomial proportion is needed. In this paper we suggest a new criterion for forming one-sided intervals and equal-tail two-sided intervals. Methods which meet the criterion are said to yield locally correct confidence intervals. We propose a method that yields such intervals and prove that its intervals have a shorter average length than those of any other method that meets the criterion. Compared with the Clopper-Pearson method, the proposed method gives intervals with an appreciably smaller average length. For confidence levels of practical interest, the mid-p method also satisfies the new criterion and has its own optimality property. It gives locally correct confidence intervals that are only slightly wider than those of the new method.
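The two reference methods named in this abstract can be sketched with the standard library alone. The code below computes equal-tail Clopper-Pearson and mid-p intervals by bisection on the binomial tail probabilities; the mid-p version counts only half of the probability of the observed value, which is why its intervals are shorter. Function names are hypothetical, and the paper's own locally correct method is not implemented here.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def upper_tail(x, n, p, mid=False):
    """P(X >= x), or P(X > x) + 0.5 * P(X = x) when mid is True."""
    tail = sum(binom_pmf(k, n, p) for k in range(x, n + 1))
    return tail - 0.5 * binom_pmf(x, n, p) if mid else tail


def lower_tail(x, n, p, mid=False):
    """P(X <= x), or P(X < x) + 0.5 * P(X = x) when mid is True."""
    tail = sum(binom_pmf(k, n, p) for k in range(0, x + 1))
    return tail - 0.5 * binom_pmf(x, n, p) if mid else tail


def _solve(f, target, increasing):
    """Bisection on [0, 1] for f(p) = target, with f monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        m = 0.5 * (lo + hi)
        if (f(m) < target) == increasing:
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)


def binomial_interval(x, n, conf=0.95, mid=False):
    """Equal-tail interval: Clopper-Pearson (mid=False) or mid-p (mid=True)."""
    alpha = 1.0 - conf
    lower = 0.0 if x == 0 else _solve(
        lambda p: upper_tail(x, n, p, mid), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _solve(
        lambda p: lower_tail(x, n, p, mid), alpha / 2, increasing=False)
    return lower, upper


if __name__ == "__main__":
    print(binomial_interval(3, 10))            # Clopper-Pearson
    print(binomial_interval(3, 10, mid=True))  # mid-p
```

For any given data the mid-p interval lies inside the Clopper-Pearson interval, which illustrates the length advantage discussed in the abstract at the cost of coverage that can dip below the nominal level.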