10 research outputs found

    Bayesian fractional polynomials

    Get PDF
    This paper sets out to implement the Bayesian paradigm for fractional polynomial models under the assumption of normally distributed error terms. Fractional polynomials widen the class of ordinary polynomials and offer an additive and transportable modelling approach. The methodology is based on a Bayesian linear model with a quasi-default hyper-g prior and combines variable selection with parametric modelling of additive effects. A Markov chain Monte Carlo algorithm for the exploration of the model space is presented. This theoretically well-founded stochastic search constitutes a substantial improvement over ad hoc stepwise procedures for the fitting of fractional polynomial models. The method is applied to a data set on the relationship between ozone levels and meteorological parameters, previously analysed in the literature.
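As a concrete illustration of the transformations involved, here is a minimal sketch of degree-1 and degree-2 fractional-polynomial bases over the conventional power set; the function names are illustrative, not taken from the paper's implementation:

```python
import numpy as np

# Conventional fractional-polynomial power set (Royston & Altman);
# Bayesian approaches search over these same transformations.
FP_POWERS = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

def fp_term(x, p):
    """Single fractional-polynomial transform; p == 0 denotes log(x)."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if p == 0 else x ** p

def fp2_basis(x, p1, p2):
    """Degree-2 FP basis; a repeated power (p1 == p2) adds a log factor."""
    t1 = fp_term(x, p1)
    t2 = fp_term(x, p2) * np.log(x) if p1 == p2 else fp_term(x, p2)
    return np.column_stack([t1, t2])

x = np.array([0.5, 1.0, 2.0, 4.0])
B = fp2_basis(x, 1, 1)   # columns: x and x * log(x)
```

Variable selection then amounts to choosing which powers (if any) enter the model for each covariate.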

    Hyper-g Priors for Generalized Linear Models

    Full text link
    We develop an extension of the classical Zellner's g-prior to generalized linear models. The prior on the hyperparameter g is handled in a flexible way, so that any continuous proper hyperprior f(g) can be used, giving rise to a large class of hyper-g priors. Connections with the literature are described in detail. A fast and accurate integrated Laplace approximation of the marginal likelihood makes inference in large model spaces feasible. For posterior parameter estimation we propose an efficient and tuning-free Metropolis-Hastings sampler. The methodology is illustrated with variable selection and automatic covariate transformation in the Pima Indians diabetes data set. Comment: 30 pages, 12 figures, poster contribution at ISBA 201
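To make the role of g concrete, the following sketch computes the Bayes factor against the null model under the classical g-prior for the normal linear model, integrated numerically over a hyper-g prior f(g) = ((a-2)/2)(1+g)^(-a/2). This is the normal-linear special case only, not the paper's GLM extension with its integrated Laplace approximation:

```python
import numpy as np

def log_bf_g(g, n, p, r2):
    """Log Bayes factor vs. the null model for fixed g
    (Zellner g-prior, normal linear model with intercept and p covariates)."""
    return 0.5 * (n - p - 1) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1 - r2))

def hyper_g_bf(n, p, r2, a=3.0, grid=20000):
    """Bayes factor under the hyper-g prior f(g) = (a-2)/2 * (1+g)^(-a/2),
    integrated via the substitution u = g / (1 + g), u in (0, 1)."""
    u = np.linspace(1e-9, 1 - 1e-9, grid)
    g = u / (1 - u)
    density = (a - 2) / 2 * (1 + g) ** (-a / 2)
    jac = 1 / (1 - u) ** 2                     # dg/du
    vals = np.exp(log_bf_g(g, n, p, r2)) * density * jac
    du = u[1] - u[0]
    return float(np.sum((vals[:-1] + vals[1:]) / 2) * du)  # trapezoid rule

bf = hyper_g_bf(n=50, p=2, r2=0.9)   # strong fit: large Bayes factor
```

For r2 = 0 the integral has a closed form (1/(1 + p/(a-2)) style expressions), which makes the numerical version easy to sanity-check.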

    Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models

    Full text link
    Structured additive regression provides a general framework for complex Gaussian and non-Gaussian regression models, with predictors comprising arbitrary combinations of nonlinear functions and surfaces, spatial effects, varying coefficients, random effects and further regression terms. The large flexibility of structured additive regression makes function selection a challenging and important task, aiming at (1) selecting the relevant covariates, (2) choosing an appropriate and parsimonious representation of the impact of covariates on the predictor and (3) determining the required interactions. We propose a spike-and-slab prior structure for function selection that allows including or excluding single coefficients as well as blocks of coefficients representing specific model terms. A novel multiplicative parameter expansion is required to obtain good mixing and convergence properties in a Markov chain Monte Carlo simulation approach and is shown to induce desirable shrinkage properties. In simulation studies and with (real) benchmark classification data, we investigate sensitivity to hyperparameter settings and compare performance to competitors. The flexibility and applicability of our approach are demonstrated in an additive piecewise exponential model with time-varying effects for right-censored survival times of intensive care patients with sepsis. Geoadditive and additive mixed logit model applications are discussed in an extensive appendix.
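The basic mechanics of a spike-and-slab prior can be sketched on a single coefficient: given an estimate b with sampling variance s², the posterior probability of the slab component follows from the two marginal densities. This toy calculation omits the blockwise selection and the multiplicative parameter expansion that the paper actually relies on:

```python
import math

def normal_pdf(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def inclusion_prob(b, s2, tau2, pi=0.5):
    """Posterior probability that the coefficient comes from the slab N(0, tau2)
    rather than the point-mass spike at zero, given b ~ N(beta, s2)."""
    slab = pi * normal_pdf(b, s2 + tau2)       # marginal under the slab
    spike = (1 - pi) * normal_pdf(b, s2)       # marginal under the spike
    return slab / (slab + spike)
```

An estimate near zero favours the spike (exclusion), while a large estimate relative to its standard error favours the slab (inclusion).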

    Bayesian Fractional Polynomial Approach to Quantile Regression and Variable Selection with Application in the Analysis of Blood Pressure among US Adults

    Full text link
    Hypertension is a highly prevalent chronic medical condition and a strong risk factor for cardiovascular disease (CVD), as it accounts for more than 45% of CVD. The relation between blood pressure (BP) and its risk factors cannot be explored clearly by standard linear models. Although fractional polynomials (FPs) offer a concise and accurate way to examine smooth relationships between a response and its predictors, modelling the conditional mean captures only a partial view of the response distribution, and the distributions of many responses, such as BP measures, are typically skewed. Modelling 'average' BP may be linked to CVD, but extremely high BP could yield deeper and more precise insight into CVD, so existing mean-based FP approaches for modelling the relationship between risk factors and BP cannot answer these key questions. Conditional quantile functions with FPs characterise the relationship between the response and its predictors comprehensively, covering the median as well as the extremely high BP measures that are often required in practical data analysis. To the best of our knowledge, this approach is new in the literature. Therefore, in this paper, we employ Bayesian variable selection with a quantile-dependent prior for the FP model, yielding a parametric nonlinear quantile regression model with variable selection. The objective is to examine the nonlinear relationship between BP measures and their risk factors across median and upper quantile levels using data extracted from the 2007-2008 National Health and Nutrition Examination Survey (NHANES). The variable selection in the model analysis identified the nonlinear terms of the continuous variables (body mass index, age) and the categorical variables (ethnicity, gender and marital status) as important predictors across all quantile levels.
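A rough, hypothetical sketch of what quantile regression over fractional-polynomial transforms involves: minimise the check (pinball) loss for each candidate power and keep the best. This toy search uses iteratively reweighted least squares for the quantile fit, not the paper's Bayesian quantile-dependent prior, and all names are illustrative:

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (pinball) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def fit_quantile_fp(x, y, tau, powers=(-2, -1, -0.5, 0, 0.5, 1, 2, 3)):
    """Pick the single FP power whose fit minimises the check loss at
    quantile tau -- a toy version of FP model selection for quantiles."""
    best = None
    for p in powers:
        t = np.log(x) if p == 0 else x ** p
        X = np.column_stack([np.ones_like(t), t])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]      # starting values
        for _ in range(50):                              # crude IRLS quantile fit
            r = y - X @ beta
            w = np.where(r >= 0, tau, 1 - tau) / np.maximum(np.abs(r), 1e-6)
            beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        loss = check_loss(y - X @ beta, tau).sum()
        if best is None or loss < best[0]:
            best = (loss, p, beta)
    return best  # (loss, power, coefficients)

x = np.linspace(1.0, 10.0, 100)
loss, power, beta = fit_quantile_fp(x, 2.0 * np.sqrt(x), tau=0.5)  # recovers power 0.5
```

Running the same search at tau = 0.9 or 0.95 targets the upper tail of the response, which is the point of the quantile formulation for BP.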

    Model Averaging and its Use in Economics

    Get PDF
    The method of model averaging has become an important tool to deal with model uncertainty, for example when many competing theories exist, as is common in economics. Model averaging is a natural and formal response to model uncertainty in a Bayesian framework, and most of the paper deals with Bayesian model averaging. The important role of the prior assumptions in these Bayesian procedures is highlighted. In addition, frequentist model averaging methods are also discussed. Numerical methods to implement these methods are explained, and I point the reader to some freely available computational resources. The main focus is on uncertainty regarding the choice of covariates in normal linear regression models, but the paper also covers other, more challenging, settings, with particular emphasis on sampling models commonly used in economics. Applications of model averaging in economics are reviewed and discussed in a wide range of areas, including growth economics, production modelling, finance and the forecasting of macroeconomic quantities. Comment: forthcoming; accepted version
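The core of Bayesian model averaging can be illustrated with the standard BIC approximation to posterior model probabilities, assuming equal prior probabilities across models; this is a generic sketch, not tied to any particular application in the paper:

```python
import math

def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values:
    p(M_k | y) is proportional to exp(-BIC_k / 2) under equal model priors."""
    m = min(bics)                               # subtract the minimum for stability
    w = [math.exp(-(b - m) / 2) for b in bics]
    s = sum(w)
    return [x / s for x in w]

# e.g. three candidate regressions with (hypothetical) BICs
weights = bma_weights([100.0, 102.0, 110.0])
```

A model-averaged estimate of any quantity is then the weight-sum of the model-specific estimates, so models two BIC points apart already differ in weight by a factor of about e.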

    Depth and Depth-Based Classification with R Package ddalpha

    Get PDF
    Following the seminal idea of Tukey (1975), data depth is a function that measures how close an arbitrary point of the space lies to an implicitly defined center of a data cloud. Having undergone theoretical and computational developments, it is now employed in numerous applications, with classification being the most popular one. The R package ddalpha is software designed to combine the user's experience with recent achievements in the area of data depth and depth-based classification. ddalpha provides exact and approximate computation of the most reasonable and widely applied notions of data depth. These can be further used in the depth-based multivariate and functional classifiers implemented in the package, with the DDα-procedure as the main focus. The package can be extended with user-defined custom depth methods and separators. The implemented functions for depth visualization and the built-in benchmark procedures may also provide insights into the geometry of the data and the quality of pattern recognition.
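As an illustration of the depth idea, here is Mahalanobis depth, one of the classical notions a package of this kind computes; the sketch is plain NumPy, not the ddalpha API:

```python
import numpy as np

def mahalanobis_depth(x, data):
    """Mahalanobis depth of point x with respect to a data cloud:
    D(x) = 1 / (1 + (x - mean)' Cov^{-1} (x - mean)).
    Depth is maximal (1.0) at the mean and decreases outward."""
    mu = data.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
    d = x - mu
    return 1.0 / (1.0 + d @ cov_inv @ d)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 2))
center_depth = mahalanobis_depth(cloud.mean(axis=0), cloud)  # exactly 1.0
far_depth = mahalanobis_depth(np.array([5.0, 5.0]), cloud)   # near 0
```

Depth-based classifiers such as the DD-plot family assign a point to the class in which it is deepest, which is what makes fast depth computation the package's central concern.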

    Model Averaging and its Use in Economics

    Get PDF
    The method of model averaging has become an important tool to deal with model uncertainty, in particular in empirical settings with large numbers of potential models and relatively limited numbers of observations, as are common in economics. Model averaging is a natural response to model uncertainty in a Bayesian framework, so most of the paper deals with Bayesian model averaging. In addition, frequentist model averaging methods are also discussed. Numerical methods to implement these methods are explained, and I point the reader to some freely available computational resources. The main focus is on the problem of variable selection in linear regression models, but the paper also discusses other, more challenging, settings. Some of the applied literature is reviewed with particular emphasis on applications in economics. The role of the prior assumptions in Bayesian procedures is highlighted, and some recommendations for applied users are provided.

    Use of the Bayesian family of methods to correct for effects of exposure measurement error in polynomial regression models

    Get PDF
    Measurement error in a continuous exposure, if ignored, may cause bias in the estimation of the relationship between exposure and outcome. This presents a significant challenge for understanding exposure-outcome associations in many areas of research, including economic, social, medical and epidemiological research. The presence of classical, i.e. random, measurement error in a continuous exposure has been shown to lead to underestimation of a simple linear relationship. When the functional form of the exposure within a regression model is not linear, i.e. when transformations of the exposure are included, measurement error obscures the true shape of the relationship by making the association appear more linear. Bias in this case will be unknown in direction and vary by exposure level. The most commonly used method for measurement error correction is regression calibration, but this requires an approximation for logistic and survival regression models and does not extend easily to more complex error models. This work investigates three methods for measurement error correction from the Bayesian family of methods: Bayesian analysis using Markov chain Monte Carlo (MCMC), integrated nested Laplace approximations (INLA), and multiple imputation (MI). These have been proposed for measurement error correction but have not been extensively compared, extended for use in several important scenarios, or applied to flexible parametric models. The focus on Bayesian methods was motivated by their flexibility to accommodate complex measurement error models and non-linear exposure-outcome associations. Polynomial regression models are widely used and are often the most interpretable models. In order for measurement error correction methods to be widely implemented, they should be able to accommodate known polynomial transformations as well as model selection procedures when the functional form of the error-prone exposure is unknown. 
Therefore, in this thesis, correction methods are integrated with the fractional polynomial method, a flexible polynomial model-building procedure for positive continuous variables. In this thesis, I perform a large simulation study comparing proposed methods for measurement error correction from the Bayesian family (i.e. MCMC, INLA, and MI) to the most common method of measurement error correction. Extensions of INLA and MI are presented in order to accommodate both a validation study setting, wherein the error-free exposure is measured in a subgroup, as well as a replicate study setting, wherein there are multiple measures of the error-prone exposure. In order to accommodate unknown polynomial transformations of the error-prone variable, two approaches not used before in this context are proposed and explored in simulation studies alongside more standard methods. The first approach uses Bayesian posterior means in lieu of maximum likelihood estimates within regression calibration. The second approach adapts methods of Bayesian variable selection to the selection of the best polynomial transformation of the error-prone exposure while accommodating measurement error. Successful methods are applied to a motivating example, fitting the non-linear association between alcohol intake and all-cause mortality. By combining measurement error correction adaptable to complex error models with polynomial regression inclusive of model selection, this work fills a niche that will facilitate wider use of measurement error correction techniques.
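The regression calibration baseline mentioned above can be sketched for the replicate-study setting with classical measurement error: estimate the error variance from replicate differences, shrink the observed exposure toward its mean, and regress the outcome on the calibrated value. This is a generic textbook sketch with illustrative names, not the thesis's implementation:

```python
import numpy as np

def regression_calibration(w1, w2, y):
    """Classical-error regression calibration with two replicates per subject:
    estimate the attenuation factor and regress y on E[X | mean of replicates]."""
    wbar = (w1 + w2) / 2
    var_u = np.var(w1 - w2, ddof=1) / 2            # measurement-error variance
    var_wbar = np.var(wbar, ddof=1)                # = var(X) + var_u / 2
    lam = max(var_wbar - var_u / 2, 0) / var_wbar  # reliability of the replicate mean
    x_hat = wbar.mean() + lam * (wbar - wbar.mean())
    X = np.column_stack([np.ones_like(x_hat), x_hat])
    return np.linalg.lstsq(X, y, rcond=None)[0]    # intercept, calibrated slope

rng = np.random.default_rng(1)
x = rng.normal(size=4000)                          # true exposure (unobserved)
w1 = x + rng.normal(size=4000)                     # two error-prone replicates
w2 = x + rng.normal(size=4000)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=4000)
beta = regression_calibration(w1, w2, y)           # slope near 2, not the attenuated ~1.3
```

A naive regression of y on the replicate mean would recover only about two thirds of the true slope here, which is the attenuation the thesis's Bayesian methods also aim to remove in more complex, non-linear settings.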

    Proceedings of the 36th International Workshop Statistical Modelling July 18-22, 2022 - Trieste, Italy

    Get PDF
    The 36th International Workshop on Statistical Modelling (IWSM) is the first one held in person after a two-year hiatus due to the COVID-19 pandemic. This edition was quite lively, with 60 oral presentations and 53 posters covering a vast variety of topics. As usual, the extended abstracts of the papers are collected in the IWSM proceedings; unlike previous workshops, this year the proceedings are not printed on paper but are available only online. The workshop proudly maintains its almost unique feature of scheduling a single plenary session for the whole week. This choice has always contributed to the stimulating atmosphere of the conference, which, combined with its informal character, encourages the exchange of ideas and cross-fertilization among different areas. As a distinguished tradition of the workshop, student participation has been strongly encouraged. This IWSM edition is particularly successful in this respect, as testified by the large number of students included in the program.
