23 research outputs found

    Generalized linear mixed models for binary outcome data with a low proportion of occurrences

    No full text
    Many studies in epidemiology and other fields such as econometrics and social sciences give rise to correlated outcome data (e.g., longitudinal studies, meta-analyses, and multi-centre studies). Parameter estimation of generalized linear mixed models (GLMMs), which are frequently used to perform inference on correlated binary outcomes, is complicated by intractable integrals in the marginal likelihood. Penalized quasi-likelihood (PQL) and maximum likelihood estimation in conjunction with numerical integration via adaptive Gauss-Hermite quadrature (AGHQ) are estimation methods that are commonly used in practice. However, the assessment of the performance of these estimation methods in settings found in practice is incomplete, particularly for binary outcome data with a low proportion of occurrences. To begin with, I considered graphical representations of the distributions of cluster-specific log odds of outcome ensuing from random intercepts logistic models (RILMs), converted to the probability scale with the inverse logit transformation. RILMs are special cases of GLMMs. These representations help to clarify the implications of RILM parameter values for the distributions of cluster-specific probabilities of outcome. The correspondence of these distributions with beta distributions, also used for random effects models for binary outcomes, was graphically assessed and a generally good agreement was found. Afterwards, I evaluated via a simulation study the performance of the PQL and AGHQ methods in several realistic settings of binary outcome data with a low proportion of occurrences. Different features determining the number of occurrences were considered (number of clusters, cluster size, and probabilities of outcome). The AGHQ method produced nearly unbiased fixed effects estimates, even in challenging settings with low proportions of occurrences or a small sample size, but mean square errors tended to be larger than with PQL for small datasets. Both methods produced biased variance component estimates when the number of clusters was moderate, especially with rarer occurrences. Finally, through further analysis of the simulation results, I assessed whether several indicators quantifying different aspects of the rarity of the events in a dataset, all measurable in practice, could explain patterns of bias in the parameter estimates. The selected rarity indicators quantify the overall number of events and their distribution across the clusters.
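
    The mapping from cluster-specific log odds to the probability scale described in this abstract can be sketched as follows. This is a minimal illustration, not the author's code: it assumes a random intercepts logistic model of the form logit(p_i) = beta0 + b_i with b_i drawn from a normal distribution with mean 0 and standard deviation sigma, and the parameter values (beta0 = -3, sigma = 1) are hypothetical choices meant to mimic a low proportion of occurrences.

```python
import math
import random
import statistics


def inverse_logit(log_odds: float) -> float:
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def cluster_probabilities(beta0: float, sigma: float, n_clusters: int, seed: int = 0) -> list:
    """Draw cluster-specific probabilities under an assumed RILM.

    Each cluster gets a random intercept b_i ~ Normal(0, sigma), and its
    probability of outcome is inverse_logit(beta0 + b_i).
    """
    rng = random.Random(seed)
    return [inverse_logit(beta0 + rng.gauss(0.0, sigma)) for _ in range(n_clusters)]


# Hypothetical parameter values for a rare outcome.
probs = cluster_probabilities(beta0=-3.0, sigma=1.0, n_clusters=10_000)
median_p = statistics.median(probs)
mean_p = statistics.fmean(probs)
print(f"median = {median_p:.4f}, mean = {mean_p:.4f}")
```

    Under these assumptions the median cluster-specific probability sits near inverse_logit(beta0), while the mean is pulled upward by the right skew that the transformation induces when beta0 is strongly negative.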

    Determination of the time-dependent association between ciprofloxacin consumption and ciprofloxacin resistance using a weighted cumulative exposure model compared with standard models.

    No full text
    Objectives: To obtain comprehensive insight into the association of ciprofloxacin use at different times in the past with the current risk of detecting resistance. Methods: This retrospective nested case-control study of ciprofloxacin users used Dutch data from the PHARMO Database Network and one laboratory for the period 2003-14. Cases and controls were selected as patients with an antibiotic susceptibility test (AST) indicating ciprofloxacin resistance or susceptibility, respectively. We performed univariable and multivariable conditional logistic regression analyses, defining time-dependent exposure using standard definitions (current ciprofloxacin use, used 0-30, 31-90, 91-180 and 181-360 days ago) and a flexible weighted cumulative effect (WCE) model with four alternative time windows of past doses (0-30, 0-90, 0-180 and 0-360 days). Results: The study population consisted of 230 cases and 909 controls. Under the standard exposure definitions, the association of ciprofloxacin use with resistance decreased with time [current use: adjusted OR 6.8 (95% CI 3.6-12.4); used 181-360 days ago: 1.3 (0.8-1.9)]. Under the 90 day WCE model (best-fitting model), more recent doses were more strongly associated with resistance than past doses, as was longer or repeated treatment. The 180 day WCE model, which fitted the data equally well, suggested that doses taken 91-180 days ago were also significantly associated with resistance. Conclusions: The estimates for the association between ciprofloxacin use at different times and resistance show that ciprofloxacin prescribers should consider ciprofloxacin use 0-180 days ago to ensure that patients receive suitable treatment. The OR of ciprofloxacin resistance could be reduced by eliminating repeated ciprofloxacin prescription within 180 days and by treating for no longer than necessary. Development and application of statistical models for medical scientific researc
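
    The idea of weighting past doses by how long ago they were taken can be sketched as below. This is only an illustration under stated assumptions: in a WCE analysis the weight function is estimated flexibly from the data, whereas here a hypothetical linearly decaying weight is plugged in, and the dose history (three 500 mg doses) is invented for the example.

```python
from typing import Callable, Dict


def weighted_cumulative_exposure(
    doses: Dict[int, float],
    weight: Callable[[int], float],
    window_days: int,
) -> float:
    """Sum past doses, each multiplied by a weight that depends on its lag.

    `doses` maps the number of days before the index date to the dose taken
    on that day. Doses outside the window [0, window_days] contribute nothing.
    """
    return sum(
        weight(lag) * dose
        for lag, dose in doses.items()
        if 0 <= lag <= window_days
    )


def linear_decay(window_days: int) -> Callable[[int], float]:
    """Hypothetical weight function: 1 at lag 0, falling to 0 at the window edge."""
    def w(lag: int) -> float:
        return 1.0 - lag / window_days
    return w


# Invented dose history: days before the index date -> dose in mg.
doses = {5: 500.0, 60: 500.0, 200: 500.0}
wce_90 = weighted_cumulative_exposure(doses, linear_decay(90), window_days=90)
wce_360 = weighted_cumulative_exposure(doses, linear_decay(360), window_days=360)
print(wce_90, wce_360)
```

    With a 90 day window the dose taken 200 days ago is ignored, while a 360 day window includes it with a small weight, which mirrors how the choice of time window changes which past doses can contribute to the exposure measure.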

    S5 Table -

    No full text
    S5A Table: Heatmap of gProfiler enrichment of all the HITS identified in the screen. S5B Table: Heatmap of gProfiler enrichment of the upregulated HITS identified in the screen. S5C Table: Heatmap of gProfiler enrichment of the downregulated HITS identified in the screen. (XLSX)

    Impact of IBD gene candidate ORFs on the THP-1 transcriptome.

    No full text
    (A) Selected example illustrating the impact observed on the transcriptome of THP-1 cells following the expression of IRF5. Each dot represents a single detectable gene in the THP-1 transcriptome. The x-axis shows the log2-transformed median expression across all conditions tested (baseline). The y-axis represents the effect of transduction and expression of a given ORF, as the log2-transformed fold-induction compared to baseline. Skyblue dots represent genes with an expression value within the expected variation (|Z|≤2), orange dots represent genes suggestively outside the range (|Z|>2) and red dots represent genes outside the expected range of variation (|Z|>4). Gray dots are genes with an expression value below our detection threshold. An additive effect on the log2 scale corresponds to a multiplicative effect on the original scale. The fold-change equivalent to a given log2-effect x is then FC = 2^x. As an example, an effect of 1 corresponds to FC = 2. (B) Correlation of the effects of independent sets of replicated expression of IRF5 on the THP-1 transcriptome. The x-axis (inner color of dots) and y-axis (border color of dots) show the effect of two independent sets of replicated ORFs on the transcriptome, as the log2-transformed fold-induction compared to baseline. Variation between sets of replicates includes the effect of independent infection dates, RNA extraction, expression arrays and batches. (C) Impact of the transduction and expression of all 42 IBD gene candidate ORFs on the transcriptome of THP-1 cells. ORFs are ordered by their total number of HITS, with the number of up- and down-regulated HITS illustrated by black and gray, respectively (S2 Table & S1 Appendix). Starred ORFs are previously reported IBD candidate causal genes.
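
    The conversion from a log2 effect to a fold change and the Z-score color coding described in panel (A) can be sketched as follows. The function names are hypothetical and chosen for illustration; the thresholds (2 and 4) and the color names are taken from the caption.

```python
def fold_change(log2_effect: float) -> float:
    """Convert an additive log2 effect into a multiplicative fold change (FC = 2^x)."""
    return 2.0 ** log2_effect


def dot_color(z: float, suggestive: float = 2.0, outside: float = 4.0) -> str:
    """Color a gene by how far its Z-score lies from the expected variation."""
    magnitude = abs(z)
    if magnitude > outside:
        return "red"      # outside the expected range of variation
    if magnitude > suggestive:
        return "orange"   # suggestively outside the range
    return "skyblue"      # within the expected variation


print(fold_change(1.0), fold_change(-1.0))
print(dot_color(1.5), dot_color(-3.0), dot_color(4.5))
```

    A negative log2 effect gives a fold change below 1, so an effect of -1 corresponds to a halving of expression relative to baseline.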

    Impact of PG receptor agonists on the expression of S100A8/A9 in response to LPS.

    No full text
    Relative mRNA expression levels of the S100A8/A9 genes were evaluated after incubating THP-1 cells for 24 hours with or without 0.2 µg/ml of LPS in the presence or absence of 1 × 10⁻⁵ M of Beraprost or CAY10684, agonists of PTGIR and PTGER4, respectively. The graph on the right represents the same data with a different y-axis scale. Each bar is the mean of 3 samples from 3 different experiments ± SEM. Asterisks denote significance levels (unpaired t-test).