31 research outputs found

    Bayesian functional linear regression with sparse step functions

    The functional linear regression model is a common tool for determining the relationship between a scalar outcome and a functional predictor seen as a function of time. This paper focuses on the Bayesian estimation of the support of the coefficient function. To this end, we propose a parsimonious and adaptive decomposition of the coefficient function as a step function, together with a model including a prior distribution, which we name Bayesian functional Linear regression with Sparse Step functions (Bliss). The aim of the method is to recover the periods of time that influence the outcome the most. A Bayes estimator of the support is built with a specific loss function, as well as two Bayes estimators of the coefficient function: the first is smooth and the second is a step function. The performance of the proposed methodology is analysed on various synthetic datasets and is illustrated on a black Périgord truffle dataset to study the influence of rainfall on the production
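To make the step-function decomposition concrete, here is a minimal Python sketch of a scalar-on-function linear predictor whose coefficient function is a sum of steps. It is an illustration only, not the Bliss implementation: the function names, the `(start, end, height)` encoding of a step, and the absence of any interval-length normalisation are assumptions, and the paper's own parameterisation and prior may differ.

```python
def step_beta(t, intervals):
    """Step-function coefficient: sum the heights of all intervals containing t."""
    return sum(height for (start, end, height) in intervals if start <= t <= end)


def support(intervals):
    """Intervals where the coefficient is non-zero (overlaps are not merged)."""
    return [(start, end) for (start, end, height) in intervals if height != 0.0]


def linear_predictor(x_values, grid, intervals, intercept=0.0):
    """Midpoint-rule approximation of intercept + integral of beta(t) * x(t) dt."""
    dt = grid[1] - grid[0]  # assumes a uniform grid
    return intercept + sum(
        step_beta(t, intervals) * x * dt for t, x in zip(grid, x_values)
    )


# Hypothetical example: 100 midpoints on [0, 1] and a constant curve x(t) = 1.
grid = [(j + 0.5) / 100 for j in range(100)]
x_values = [1.0] * len(grid)
intervals = [(0.2, 0.4, 2.0), (0.7, 0.8, -1.0)]  # made-up (start, end, height) steps
prediction = linear_predictor(x_values, grid, intervals)
```

With a constant curve, the prediction reduces to the signed area under the steps, which gives a quick sanity check. Only the time points inside the support contribute to the outcome, which is why estimating the support is the quantity of interest.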

    Antimalarial drug use in general populations of tropical Africa

    Background: The burden of Plasmodium falciparum malaria has worsened because of the emergence of chloroquine resistance. Antimalarial drug use and drug pressure are critical factors contributing to the selection and spread of resistance. The present study explores the geographical, socio-economic and behavioural factors associated with the use of antimalarial drugs in Africa.
    Methods: The presence of chloroquine (CQ), pyrimethamine (PYR) and other antimalarial drugs was evaluated by immuno-capture and high-performance liquid chromatography in the urine samples of 3,052 children (2–9 y), randomly drawn in 2003 from the general populations at 30 sites in Senegal (10), Burkina-Faso (10) and Cameroon (10). Questionnaires were administered to the parents of sampled children and to a random sample of households in each site. The presence of CQ in urine was analysed as the dependent variable according to individual and site characteristics, using a random-effect logistic regression model to take into account the interdependency of observations made within the same site.
    Results: Depending on the site, the prevalence rates of CQ and PYR ranged from 9% to 91% and from 0% to 21%, respectively. In multivariate analysis, the presence of CQ in urine was significantly associated with a history of fever during the three days preceding urine sampling (OR = 1.22, p = 0.043), socio-economic level of the population of the sites (OR = 2.74, p = 0.029), age (2–5 y = reference level; 6–9 y OR = 0.76, p = 0.002), prevalence of anti-circumsporozoite protein (CSP) antibodies (low prevalence: reference level; intermediate level OR = 2.47, p = 0.023), proportion of inhabitants who lived in another site one year before (OR = 2.53, p = 0.003), and duration to reach the nearest tarmacked road (duration less than one hour = reference level; duration equal to or more than one hour OR = 0.49, p = 0.019).
    Conclusion: Antimalarial drug pressure varied considerably from one site to another. It was significantly higher in areas with intermediate malaria transmission level and in the most accessible sites. Thus, P. falciparum strains arriving in cross-road sites or in areas with intermediate malaria transmission are exposed to higher drug pressure, which could favour the selection and the spread of drug resistance.

    Sélection bayésienne de variables et méthodes de type Parallel Tempering avec et sans vraisemblance

    This thesis is divided into two main parts. In the first part, we propose a Bayesian variable selection method for probit mixed models. The objective is to select a few relevant variables among tens of thousands while taking into account the design of a study, and in particular the fact that several datasets are merged together. The probit mixed model used is considered as part of a larger hierarchical Bayesian model, and the dataset is introduced as a random effect. The proposed method extends the work of Lee et al. (2003). The first step is to specify the model and prior distributions. In particular, we use the g-prior of Zellner (1986) for the fixed regression coefficients. In a second step, we use a Metropolis-within-Gibbs algorithm combined with the grouping (or blocking) technique of Liu (1994) to overcome some sampling difficulties. This choice has both theoretical and computational advantages. The method developed is applied to merged microarray datasets of patients with breast cancer. However, this method has a limitation: the covariance matrix involved in the g-prior must be invertible. There are two standard cases in which it is singular: if the number of selected variables exceeds the number of observations, or if some variables are linear combinations of others. In such situations we propose to modify the g-prior by introducing a ridge parameter, together with a simple way to choose the associated hyper-parameters. The prior obtained is a compromise between the classical g-prior, with its automatic scaling advantage, and a prior assuming independence of the regression coefficients, and it can be linked to the work of Gupta and Ibrahim (2007). In the second part, we develop two new population-based MCMC methods. For complex models with many parameters, but whose likelihood can be computed, the Equi-Energy Sampler (EES) of Kou et al. (2006) seems to be more efficient than the Parallel Tempering (PT) algorithm introduced by Geyer (1991). However, it is difficult to use in combination with a Gibbs sampler, and it requires substantial storage. We propose an algorithm combining PT with the principle of exchange moves between chains with similar energy levels, in the spirit of the EES. This adaptation, which we call Parallel Tempering with Equi-Energy Moves (PTEEM), keeps the original idea that gives the EES its strength while ensuring good theoretical properties and easy use in combination with a Gibbs sampler. Finally, in some complex models whose likelihood is analytically or computationally intractable, inference can be difficult. Several likelihood-free methods (or Approximate Bayesian Computation methods) have been developed. By analogy with the Parallel Tempering algorithm, we propose a new algorithm, the Likelihood Free-Parallel Tempering (called ABC-Parallel Tempering in the French abstract), based on MCMC theory and on a population of chains that can exchange states
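For readers unfamiliar with Parallel Tempering, the following minimal Python sketch shows the plain algorithm on a toy target: several random-walk Metropolis chains run on tempered versions of the target, and adjacent chains occasionally exchange states. This is the classical PT scheme, not PTEEM or the likelihood-free variant proposed in the thesis. The bimodal target, the inverse-temperature ladder `betas`, the step size, and all function names are assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of an equal mixture of N(-4, 1) and N(4, 1)."""
    a = -0.5 * (x - 4.0) ** 2
    b = -0.5 * (x + 4.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m)) + math.log(0.5)


def parallel_tempering(log_target, betas, n_iter, step, rng):
    """Return the samples of the chain with inverse temperature betas[0]."""
    states = [0.0 for _ in betas]
    logps = [log_target(x) for x in states]
    cold_samples = []
    for _ in range(n_iter):
        # Local move: random-walk Metropolis on the tempered target pi(x) ** beta.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step)
            lp = log_target(proposal)
            if math.log(1.0 - rng.random()) < beta * (lp - logps[k]):
                states[k], logps[k] = proposal, lp
        # Exchange move: propose swapping the states of a random adjacent pair.
        k = rng.randrange(len(betas) - 1)
        log_ratio = (betas[k] - betas[k + 1]) * (logps[k + 1] - logps[k])
        if math.log(1.0 - rng.random()) < log_ratio:
            states[k], states[k + 1] = states[k + 1], states[k]
            logps[k], logps[k + 1] = logps[k + 1], logps[k]
        cold_samples.append(states[0])
    return cold_samples


rng = random.Random(1)
betas = [1.0, 0.5, 0.25, 0.1]  # betas[0] = 1 is the chain targeting pi itself
samples = parallel_tempering(log_target, betas, n_iter=20000, step=1.0, rng=rng)
share_right_mode = sum(x > 0 for x in samples) / len(samples)
```

The flattened chains cross the low-density region between the modes easily, and the exchange moves pass those crossings down to the cold chain, so `share_right_mode` should settle near one half. A single random-walk chain at `beta = 1` would typically stay in one mode for long stretches.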

    An overview on approximate bayesian computation

    National audience. Approximate Bayesian computation techniques, also called likelihood-free methods, are among the most satisfactory approaches to intractable likelihood problems. This overview presents recent results obtained since their introduction about ten years ago in population genetics. Summary: Approximate Bayesian methods are one of the major tools for statistical inference in finite dimension when the likelihood of the parametric model under consideration is not available. We present some recent results that have significantly increased the efficiency of these techniques since their introduction in population genetics more than ten years ago.
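The basic idea behind likelihood-free inference can be sketched with the simplest ABC rejection sampler: draw a parameter from the prior, simulate data, and keep the draw when a summary of the simulated data is close to the observed summary. The Python sketch below is a toy illustration under stated assumptions (a Gaussian model with known unit variance, a uniform prior, the sample mean as summary statistic, and a hand-picked tolerance). It does not reproduce any of the refined algorithms the overview surveys.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lies within tolerance of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        s_sim = summary(simulator(theta, rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
# Hypothetical data: 50 observations from N(2, 1); the mean is treated as unknown.
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

posterior_draws = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),  # assumed prior on the mean
    simulator=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    summary=statistics.fmean,  # summary statistic: the sample mean
    tolerance=0.1,
    n_draws=5000,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior_draws)
```

The likelihood is never evaluated: only the ability to simulate from the model is used. Shrinking `tolerance` improves the approximation at the cost of a lower acceptance rate, a trade-off that more efficient ABC variants aim to ease.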

    Comments on "Bayesian variable selection for disease classification using gene expression data"

    International audience

    Bayesian variable selection for probit mixed models

    An important issue in building a regression model is to select the most pertinent covariables. Several approaches have been proposed to handle this problem. However, it is often useful in a regression framework to take into account some random effects. In genetics, it is appealing to merge datasets because it results in more observations and diversifies the data, allowing a more reliable selection of gene fragments (otherwise known as "probesets"). But it is then necessary to introduce the dataset as a random effect. In this article we propose a method to select relevant variables among tens of thousands in a probit mixed regression model, which extends a method developed by Lee et al. (2003). This model is considered as part of a larger hierarchical Bayesian model, and latent variables are used to identify subsets of selected variables. We combine the collapsing technique of Liu (1994) with a Metropolis-within-Gibbs algorithm (Robert and Casella, 2004). The proposed algorithm is quite efficient and feasible, even for very large datasets with around 20000 variables. We illustrate our method with an application to breast cancer, to select probesets characterizing the Estrogen Receptor (ER) hormonal status of patients who come from three different merged datasets. Key words: Bayesian selection, selection of covariables, random effects, probit mixed regression model, collapsing technique, Metropolis-within-Gibbs algorithm

    Parallel Tempering sans vraisemblance

    International audience

    Parallel Tempering sans vraisemblance

    National audience