Bayesian model selection in logistic regression for the detection of adverse drug reactions

Authors: Marbac Matthieu, Sedki Mohammed, Tubert-Bitter Pascale
Publication date: 08/04/2015

Motivation: Spontaneous adverse event reports have a high potential for detecting adverse drug reactions. However, due to their dimension, exploring such databases requires statistical methods. In this context, disproportionality measures are used. However, by projecting the data onto contingency tables, these methods become sensitive to the problem of co-prescriptions and masking effects. Recently, logistic regressions have been used with a Lasso type penalty to perform the detection of associations between drugs and adverse events. However, the choice of the penalty value is open to criticism while it strongly influences the results. Results: In this paper, we propose to use a logistic regression whose sparsity is viewed as a model selection challenge. Since the model space is huge, a Metropolis-Hastings algorithm carries out the model selection by maximizing the BIC criterion. Thus, we avoid the calibration of penalty or threshold. During our application on the French pharmacovigilance database, the proposed method is compared to well established approaches on a reference data set, and obtains better rates of positive and negative controls. However, many signals are not detected by the proposed method. So, we conclude that this method should be used in parallel to existing measures in pharmacovigilance.

Comment: 7 pages, 3 figures, submitted to Biometrical Journa
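
The search described in this abstract can be illustrated with a minimal sketch: a Metropolis-Hastings walk over subsets of covariates, scored by BIC for a logistic regression. Everything specific here is an assumption made for illustration and not the authors' implementation: the helper names (`fit_logistic_loglik`, `bic`, `mh_model_search`), the single-flip proposal, the synthetic data, and the sign convention BIC = -2 log L + k log n, under which smaller values are better.

```python
import numpy as np

def fit_logistic_loglik(X, y, n_iter=25):
    """Maximized log-likelihood of a logistic regression (Newton-Raphson fit)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        # Tiny ridge term keeps the Hessian invertible (an assumption of this sketch).
        hess = X.T @ (X * w[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    eta = X @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def bic(X, y, mask):
    """BIC = -2 log L + k log n for the model using the columns selected by mask."""
    n = len(y)
    design = np.column_stack([np.ones(n), X[:, mask]])  # intercept always included
    return -2.0 * fit_logistic_loglik(design, y) + design.shape[1] * np.log(n)

def mh_model_search(X, y, n_steps=300, seed=0):
    """Random walk over inclusion vectors; returns the best model visited."""
    rng = np.random.default_rng(seed)
    current = np.zeros(X.shape[1], dtype=bool)
    current_bic = bic(X, y, current)
    best, best_bic = current.copy(), current_bic
    for _ in range(n_steps):
        proposal = current.copy()
        j = rng.integers(X.shape[1])
        proposal[j] = ~proposal[j]  # flip one covariate in or out
        proposal_bic = bic(X, y, proposal)
        # Accept with probability min(1, exp(-(BIC_new - BIC_old) / 2)).
        if np.log(rng.random()) < -(proposal_bic - current_bic) / 2.0:
            current, current_bic = proposal, proposal_bic
            if current_bic < best_bic:
                best, best_bic = current.copy(), current_bic
    return best, best_bic

# Synthetic data: only covariates 0 and 2 influence the response.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((500, 6))
logits = 2.0 * X[:, 0] - 2.0 * X[:, 2]
y = (data_rng.random(500) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

best, best_bic = mh_model_search(X, y)
print("selected covariates:", np.flatnonzero(best))
```

No penalty or threshold is tuned in this sketch; the only choices are the proposal and the number of steps, which matches the motivation stated in the abstract.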

arXiv.org e-Print Archive

Crossref

HAL-Pasteur

HAL UVSQ

Asymptotic inference for semiparametric association models

Author: Osius Gerhard
Publication venue: Institute of Mathematical Statistics
Publication date: 04/03/2009

Association models for a pair of random elements X and Y (e.g., vectors) are considered which specify the odds ratio function up to an unknown parameter θ. These models are shown to be semiparametric in the sense that they do not restrict the marginal distributions of X and Y. Inference for the odds ratio parameter θ may be obtained from sampling either Y conditionally on X or vice versa. Generalizing results from Prentice and Pyke, Weinberg and Wacholder and Scott and Wild, we show that asymptotic inference for θ under sampling conditional on Y is the same as if sampling had been conditional on X. Common regression models, for example, generalized linear models with canonical link or multivariate linear, respectively, logistic models, are association models where the regression parameter β is closely related to the odds ratio parameter θ. Hence inference for β may be drawn from samples conditional on Y using an association model.

Comment: Published in at http://dx.doi.org/10.1214/07-AOS572 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
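
The symmetry behind this result can be checked on the simplest case, a 2×2 table: the odds ratio is the same whether it is computed from the joint distribution, from the distribution of Y given X, or from the distribution of X given Y. The following is a small numeric illustration only; the joint probabilities and the function name `odds_ratio` are made up for the example and are not taken from the paper.

```python
import numpy as np

def odds_ratio(table):
    """Cross-product ratio of a 2x2 table of probabilities or counts."""
    return (table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0])

# Hypothetical joint distribution of binary X (rows) and binary Y (columns).
joint = np.array([[0.3, 0.2],
                  [0.1, 0.4]])

# Sampling Y conditionally on X: each row is normalized to sum to 1.
y_given_x = joint / joint.sum(axis=1, keepdims=True)

# Sampling X conditionally on Y: each column is normalized to sum to 1.
x_given_y = joint / joint.sum(axis=0, keepdims=True)

or_joint = odds_ratio(joint)
or_y_given_x = odds_ratio(y_given_x)
or_x_given_y = odds_ratio(x_given_y)
print(or_joint, or_y_given_x, or_x_given_y)
```

Row and column normalizations cancel in the cross-product ratio, which is why the marginal distributions of X and Y do not affect it.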

arXiv.org e-Print Archive

Crossref

On the Bayes-optimality of F-measure maximizers

Authors: Cheng Weiwei, Dembczynski Krzysztof, Hullermeier Eyke, Jachnik Arkadiusz, Waegeman Willem
Publication date: 01/01/2014

The F-measure, which was originally introduced in information retrieval, is nowadays routinely used as a performance metric for problems such as binary classification, multi-label classification, and structured output prediction. Optimizing this measure is a statistically and computationally challenging problem, since no closed-form solution exists. Adopting a decision-theoretic perspective, this article provides a formal and experimental analysis of different approaches for maximizing the F-measure. We start with a Bayes-risk analysis of related loss functions, such as Hamming loss and subset zero-one loss, showing that optimizing such losses as a surrogate of the F-measure leads to a high worst-case regret. Subsequently, we perform a similar type of analysis for F-measure maximizing algorithms, showing that such algorithms are approximate, while relying on additional assumptions regarding the statistical distribution of the binary response variables. Furthermore, we present a new algorithm which is not only computationally efficient but also Bayes-optimal, regardless of the underlying distribution. To this end, the algorithm requires only a quadratic (with respect to the number of binary responses) number of parameters of the joint distribution. We illustrate the practical performance of all analyzed methods by means of experiments with multi-label classification problems.
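
The gap between Hamming-loss optimization and F-measure maximization mentioned in this abstract can be seen on a toy example. The sketch below is not the algorithm proposed in the article: it finds the prediction with the highest expected F-measure by brute-force enumeration, which is only feasible for a handful of labels. The three independent labels with marginal probability 0.4, the convention F = 1 when both vectors are all zeros, and all function names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical marginal probabilities of three independent binary labels.
marginals = [0.4, 0.4, 0.4]

def joint_prob(y):
    """Probability of label vector y under the independence assumption."""
    prob = 1.0
    for y_i, p_i in zip(y, marginals):
        prob *= p_i if y_i else 1.0 - p_i
    return prob

def f_measure(y, h):
    """F1 between true labels y and prediction h (defined as 1 if both are empty)."""
    denom = sum(y) + sum(h)
    if denom == 0:
        return 1.0
    return 2.0 * sum(a * b for a, b in zip(y, h)) / denom

def expected_f(h):
    """Expected F-measure of prediction h over all possible label vectors."""
    return sum(joint_prob(y) * f_measure(y, h)
               for y in product((0, 1), repeat=len(h)))

# Hamming-loss optimal prediction: threshold each marginal at 0.5.
hamming_pred = tuple(int(p > 0.5) for p in marginals)

# Expected-F optimal prediction by exhaustive search over all 2^3 candidates.
best_pred = max(product((0, 1), repeat=len(marginals)), key=expected_f)

f_hamming = expected_f(hamming_pred)
f_best = expected_f(best_pred)
print(hamming_pred, f_hamming)
print(best_pred, f_best)
```

Here the Hamming-optimal prediction is the all-zeros vector, which scores well only when no label is relevant, while predicting some positives gives a clearly higher expected F-measure.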

arXiv.org e-Print Archive

Ghent University Academic Bibliography

Archivsystem Ask23