8 research outputs found

    Missing data management in epidemiology : Application of multiple imputation to data from surveillance systems and surveys

    The management of missing values is a common and widespread problem in epidemiology. The most common technique restricts the analysis to subjects with complete information on the variables of interest, which can substantially reduce statistical power and precision and may also bias the estimates. This thesis investigates the application of multiple imputation to cross-sectional data from epidemiological surveys and infectious disease surveillance systems. Multiple imputation was applied to diverse study designs: a risk analysis of HIV transmission through blood transfusion, a case-control study on risk factors for Campylobacter infection, and a capture-recapture study estimating the number of new HIV diagnoses among children. We then performed a multiple imputation analysis on data from a surveillance system for chronic hepatitis C (HCV) to assess risk factors for severe liver disease among HCV-infected patients who reported drug use. Within this HCV study, we proposed guidelines for applying a sensitivity analysis to test the hypotheses underlying multiple imputation. Finally, we describe how we developed and applied an ongoing multiple imputation process for the French national HIV surveillance database, how it evolved over time, and how we evaluated and attempted to validate the imputation procedures. Based on these practical applications, we worked out a strategy for handling missing data in surveillance databases, including thorough examination of the incomplete database, construction of the imputation model, and procedures to validate imputation models and examine the hypotheses underlying multiple imputation.
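
For readers unfamiliar with how results from several imputed datasets are combined, the following is a minimal sketch of Rubin's rules, the standard pooling step in multiple imputation. The abstract does not say how the thesis pooled its estimates, so this is illustrative only; the function name `pool_rubin` and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates and their variances with Rubin's rules.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratios and variances from five imputed datasets.
estimates = [0.50, 0.60, 0.55, 0.45, 0.65]
variances = [0.040, 0.050, 0.045, 0.040, 0.050]
pooled, total_var = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, total variance = {total_var:.4f}")
```

The total variance exceeds the average within-imputation variance because it also carries the uncertainty due to the missing values, which complete-case analysis does not account for.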

    Traitement des données manquantes en épidémiologie : application de l'imputation multiple à des données de surveillance et d'enquêtes

    The management of missing values is a common and widespread problem in epidemiology. The most common technique restricts the analysis to subjects with complete information on the variables of interest, which can substantially reduce statistical power and precision and may also bias the estimates. This thesis investigates the application of multiple imputation to cross-sectional data from epidemiological surveys and infectious disease surveillance systems. Multiple imputation was applied to diverse study designs: a risk analysis of HIV transmission through blood transfusion, a case-control study on risk factors for Campylobacter infection, and a capture-recapture study estimating the number of new HIV diagnoses among children. We then performed a multiple imputation analysis on data from a surveillance system for chronic hepatitis C (HCV) to assess risk factors for severe liver disease among HCV-infected patients who reported drug use. Within this HCV study, we proposed guidelines for applying a sensitivity analysis to test the hypotheses underlying multiple imputation. Finally, we describe how we developed and applied an ongoing multiple imputation process for the French national HIV surveillance database, how it evolved over time, and how we evaluated and attempted to validate the imputation procedures. Based on these practical applications, we worked out a strategy for handling missing data in surveillance databases, including thorough examination of the incomplete database, construction of the imputation model, and procedures to validate imputation models and examine the hypotheses underlying multiple imputation.

    Syndromic Surveillance of Acute Liver Failure in Emergency Departments (France, 2010-2012)

    Our objective was to explore the relevance of emergency department (ED) data, collected daily through the French syndromic surveillance system (414 EDs, 65% of attendances), for describing the characteristics of patients with acute liver failure (ALF). Data corresponding to ICD-10 codes related to hepatitis diagnoses, including the ALF code (K720), were extracted and analyzed. During 2010-2012, 246 730 attendances with hepatitis were recorded, of which 2 475 (1%) were linked to ALF. Most patients with ALF were male (60%), and their median age was 55 years. This study shows the relevance of French syndromic surveillance data for assessing the burden of ALF.

    A three-source capture-recapture estimate of the number of new HIV diagnoses in children in France from 2003–2006 with multiple imputation of a variable of heterogeneous catchability

    BACKGROUND: Nearly all HIV infections in children worldwide are acquired through mother-to-child transmission (MTCT) during pregnancy, labour, delivery or breastfeeding. The objective of our study was to estimate the number and rate of new HIV diagnoses in children less than 13 years of age in mainland France from 2003–2006. METHODS: We performed a capture-recapture analysis based on three sources of information: the mandatory HIV case reporting (DOVIH), the French Perinatal Cohort (ANRS-EPF) and a laboratory-based surveillance of HIV (LaboVIH). The missing values of a variable of heterogeneous catchability were estimated through multiple imputation. Log-linear modelling provided estimates of the number of new HIV infections in children, taking into account dependencies between sources and variables of heterogeneous catchability. RESULTS: The three sources observed 216 new HIV diagnoses after record linkage. The number of new HIV diagnoses in children was estimated at 387 (95% CI [271–503]) from 2003–2006, among whom 60% were born abroad. The estimated rate of new HIV diagnoses in children in mainland France was 9.1 per million in 2006 and was 38 times higher in children born abroad than in those born in France. The estimated completeness of the three sources combined was 55.8% (95% CI [42.9–79.7]) and varied according to the source; the completeness of DOVIH (28.4%) and ANRS-EPF (26.1%) was lower than that of LaboVIH (33.3%). CONCLUSION: Our study provided, for the first time, an estimated annual rate of new HIV diagnoses in children under 13 years old in mainland France. More systematic HIV screening of pregnant women, repeated during pregnancy among women likely to engage in risky behaviour, is needed to optimise the prevention of MTCT. HIV screening for children who migrate to France from countries with high HIV prevalence could be recommended to facilitate early diagnosis and treatment.
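
The completeness figures reported in the results can be checked with simple arithmetic. The sketch below assumes that completeness is the number of observed diagnoses divided by the estimated total, with the interval obtained by dividing by the bounds of the estimate's confidence interval; the abstract does not state this definition, but the reported values are consistent with it. The function name `completeness` is hypothetical.

```python
def completeness(observed, estimated_total):
    """Percentage of the estimated total that was actually observed."""
    return 100.0 * observed / estimated_total


observed = 216                 # diagnoses seen by the three sources combined
estimate = 387                 # capture-recapture point estimate
ci_low, ci_high = 271, 503     # 95% CI of the estimate

point = completeness(observed, estimate)
# A larger estimated total means lower completeness, so the bounds swap.
lower = completeness(observed, ci_high)
upper = completeness(observed, ci_low)

print(f"completeness = {point:.1f}% (95% CI [{lower:.1f}-{upper:.1f}])")
```

Rounded to one decimal, these values match the 55.8% (95% CI [42.9–79.7]) given in the abstract.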

    Practical considerations for sensitivity analysis after multiple imputation applied to epidemiological studies with incomplete data

    Background: Multiple imputation as usually implemented assumes that data are Missing At Random (MAR), meaning that the underlying missing data mechanism, given the observed data, is independent of the unobserved data. To explore the sensitivity of the inferences to departures from the MAR assumption, we applied the method proposed by Carpenter et al. (2007). This approach aims to approximate inferences under a Missing Not At Random (MNAR) mechanism by reweighting the estimates obtained after multiple imputation, where the weights depend on the assumed degree of departure from the MAR assumption. Methods: The method is illustrated with epidemiological data from a surveillance system of hepatitis C virus (HCV) infection in France during the 2001–2007 period. The subpopulation studied included 4343 HCV-infected patients who reported drug use. Risk factors for severe liver disease were assessed. After performing complete-case and multiple imputation analyses, we applied the sensitivity analysis to 3 risk factors for severe liver disease: past excessive alcohol consumption, HIV co-infection and infection with HCV genotype 3. Results: In these data, the association between severe liver disease and HIV co-infection was underestimated if, given the observed data, HIV status was more likely to be observed when it was positive. Inferences for the two other risk factors were robust to plausible local departures from the MAR assumption. Conclusions: We have demonstrated the practical utility of, and advocate, a pragmatic and widely applicable approach to exploring plausible departures from the MAR assumption after multiple imputation. We have developed guidelines for applying this approach to epidemiological studies.
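
The reweighting idea described in the background can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that each imputation receives a weight proportional to exp(-delta × sum of the imputed values of the partially observed variable), where delta is the assumed departure from MAR and delta = 0 recovers the ordinary MAR average. The weight form, its sign convention, the function name `mnar_reweighted_estimate` and all numbers are assumptions made for illustration.

```python
import math


def mnar_reweighted_estimate(estimates, imputed_sums, delta):
    """Weighted average of per-imputation estimates under an assumed MNAR departure.

    estimates    -- point estimate from each imputed dataset
    imputed_sums -- sum of the imputed values of the incomplete variable per dataset
    delta        -- assumed departure from MAR (0 means plain MAR average)
    """
    if len(estimates) != len(imputed_sums) or not estimates:
        raise ValueError("need one imputed sum per estimate")
    log_w = [-delta * s for s in imputed_sums]
    shift = max(log_w)                          # avoid overflow in exp
    raw = [math.exp(lw - shift) for lw in log_w]
    total = sum(raw)
    weights = [r / total for r in raw]          # normalised to sum to 1
    pooled = sum(w * q for w, q in zip(weights, estimates))
    return pooled, weights


# Hypothetical values: five imputations, with the number of imputed positives in each.
estimates = [0.50, 0.60, 0.55, 0.45, 0.65]
imputed_sums = [30, 36, 33, 28, 38]

mar_estimate, _ = mnar_reweighted_estimate(estimates, imputed_sums, delta=0.0)
mnar_estimate, weights = mnar_reweighted_estimate(estimates, imputed_sums, delta=0.1)
print(f"MAR: {mar_estimate:.3f}  MNAR (delta=0.1): {mnar_estimate:.3f}")
```

Repeating the calculation over a range of plausible delta values shows how far the pooled estimate moves as the MAR assumption is relaxed. In practice a large number of imputations is advisable, because a few imputations can end up carrying most of the weight.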