6 research outputs found

    Quality of record linkage in a highly automated cancer registry that relies on encrypted identity data

    Objectives: In the absence of unique ID numbers, cancer and other registries in Germany and elsewhere rely on identity data to link records pertaining to the same patient. These data are often encrypted to ensure privacy. Some record linkage errors unavoidably occur. These errors were quantified for the cancer registry of North Rhine-Westphalia, which uses encrypted identity data. Methods: A sample of records, including record linkage information, was drawn from the registry. In parallel, plain text data for these records were retrieved to generate a gold standard. Record linkage error frequencies in the cancer registry were determined by comparing the results of the routine linkage with the gold standard. Error rates were projected to larger registries. Results: In the sample studied, the homonym error rate was 0.015% and the synonym error rate was 0.2%. The F-measure was 0.9921. Projection to larger databases indicated that, for a realistic development, the homonym error rate will be around 1% and the synonym error rate around 2%. Conclusion: Observed error rates are low. This shows that effective methods to standardize and improve the quality of the input data have been implemented. This is crucial to keep error rates low when the registry's database grows. The planned inclusion of unique health insurance numbers is likely to further improve record linkage quality. Cancer registration entirely based on electronic notification of records can process large amounts of data with high quality of record linkage.
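
The abstract above reports an F-measure alongside homonym and synonym error rates but does not spell out how they are computed. The following is a minimal sketch under the common convention that a homonym error is a false link (records of different patients merged) and a synonym error is a missed link (records of the same patient left unlinked). The function name and the counts are hypothetical and are not taken from the study, whose exact definitions may differ.

```python
def linkage_quality(true_links, false_links, missed_links):
    """Return (precision, recall, F-measure) for a record linkage run.

    true_links   -- record pairs correctly linked (same patient)
    false_links  -- pairs linked although they belong to different
                    patients (assumed here to be homonym errors)
    missed_links -- pairs of the same patient that were not linked
                    (assumed here to be synonym errors)
    """
    precision = true_links / (true_links + false_links)
    recall = true_links / (true_links + missed_links)
    # The F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts for illustration only.
p, r, f = linkage_quality(true_links=9900, false_links=10, missed_links=90)
print(f"precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

Because the F-measure combines both error types, it stays high only when false links and missed links are both rare relative to the correct links.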

    Pediatr Blood Cancer

    Background: In this study we aimed to evaluate incidence rates and family risk of the most common childhood cancers, tumours in the central nervous system (CNS) and leukaemia, among individuals from Norway and individuals with Scandinavian ancestry living in Utah. Methods: We used the Utah Population Database and the Norwegian National Population Register linked to cancer registries to identify cancers in children born between 1966 and 2015 and their first-degree relatives. We calculated incidence rates and hazard ratios. Results: The overall incidence of CNS tumours increased with consecutive birth cohorts similarly in Utah and Norway (both p < 0.001). Incidence rates of leukaemia were more stable and similar in both Utah and Norway, with 4.6/100,000 person-years among children (<15 years) born in the last cohort. A family history of CNS tumours was significantly associated with risk of childhood CNS tumours in Utah (HR = 3.05, 95% CI 1.80–5.16) and Norway (HR = 2.87, 95% CI 2.20–3.74). In Norway, children with a first-degree relative diagnosed with leukaemia had a high risk of leukaemia (HR = 2.39, 95% CI 1.61–3.55). Conclusion: Despite geographical distance and assumed large lifestyle differences, two genetically linked paediatric populations show similar incidences of CNS tumours and leukaemia in the period 1966–2015. CNS tumours and leukaemia aggregated in families in both countries.

    An Introduction to Data Linkage

    This guide is designed to give readers a practical introduction to data linkage and is aimed at researchers who would like to gain an understanding of data linkage techniques, either for the creation or analysis of linked data. It covers data preparation, deterministic and probabilistic linkage methods, and analysis of linked data, with examples relevant to health and other administrative data sources. This guide is relevant for academic researchers in the social and health sciences or those who work for government, survey agencies, official statistics, charities or the private sector.
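
To illustrate the difference between the deterministic and probabilistic methods named in the abstract above, here is a minimal sketch. It assumes exact agreement on every field for the deterministic rule and Fellegi-Sunter style agreement weights for the probabilistic score. The field names, the m- and u-probabilities, and the sample records are all hypothetical; the guide itself may present these methods differently.

```python
import math

# Hypothetical per-field probabilities:
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different persons)
FIELDS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "postcode": (0.90, 0.05),
}


def deterministic_match(a, b):
    """Link two records only if every field agrees exactly."""
    return all(a[field] == b[field] for field in FIELDS)


def match_weight(a, b):
    """Sum of log2 likelihood ratios over all fields.

    Agreement on a field adds log2(m / u); disagreement adds
    log2((1 - m) / (1 - u)), which is negative when m > u.
    """
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


rec_a = {"surname": "meyer", "birth_date": "1970-01-31", "postcode": "48149"}
rec_b = {"surname": "meyer", "birth_date": "1970-01-31", "postcode": "48151"}

print(deterministic_match(rec_a, rec_b))  # the postcode differs, so no exact link
print(match_weight(rec_a, rec_b))         # compared against a chosen threshold
```

In the probabilistic variant, a pair is typically accepted when its weight exceeds a threshold chosen by the analyst, so a single disagreeing field need not rule out a link.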

    Doctor of Philosophy

    Successful molecular diagnosis using an exome sequence hinges on accurate association of damaging variants with the patient's phenotype. Unfortunately, many clinical scenarios (e.g., single affected individuals or small nuclear families) have little power to confidently identify damaging alleles using sequence data alone. Today's diagnostic tools are simply underpowered for accurate diagnosis in these situations, limiting successful diagnoses. In response, clinical genetics relies on candidate-gene and variant lists to limit the search space. Despite their practical utility, these lists suffer from inherent and significant limitations. The impact of false negatives on diagnostic accuracy is considerable because candidate-gene and variant lists are assembled ad hoc, choosing alleles based upon expert knowledge. Alleles not in the list are not considered, ending hope for novel discoveries. Rational alternatives to ad hoc assemblages of candidate lists are thus badly needed. In response, I created Phevor, the Phenotype Driven Variant Ontological Re-ranking tool. Phevor works by combining knowledge resident in biomedical ontologies, like the human phenotype and gene ontologies, with the outputs of variant-interpretation tools such as SIFT, GERP+, Annovar and VAAST. Phevor can then accurately prioritize candidates identified by third-party variant-interpretation tools in light of knowledge found in the ontologies, effectively bypassing the need for candidate-gene and variant lists. Phevor differs from tools such as Phenomizer and Exomiser, as it does not postulate a set of fixed associations between genes and phenotypes. Rather, Phevor dynamically integrates knowledge resident in multiple bio-ontologies into the prioritization process. This enables Phevor to improve diagnostic accuracy for established diseases and previously undescribed or atypical phenotypes. Phevor was benchmarked by inserting known disease alleles into otherwise healthy exomes. Using the phenotype of the known disease and the variant-interpretation tool VAAST (Variant Annotation, Analysis and Search Tool), Phevor can rank 100% of the known alleles in the top 10 and 80% as the top candidate. Phevor is currently part of the pipeline used to diagnose cases as part of the Utah Genome Project. Successful diagnoses of several phenotypes have proven Phevor to be a reliable diagnostic tool that can improve the analysis of any disease-gene search.