
    Sociodemographic differences in linkage error: An examination of four large-scale datasets

    © 2018 The Author(s). Background: Record linkage is an important tool for epidemiologists and health planners. Record linkage studies will generally contain some level of residual record linkage error, where individual records are either incorrectly marked as belonging to the same individual, or incorrectly marked as belonging to separate individuals. A key question is whether errors in linkage quality are distributed evenly throughout the population, or whether certain subgroups will exhibit higher rates of error. Previous investigations of this issue have typically compared linked and un-linked records, which can conflate bias caused by record linkage error with bias caused by missing records (data capture errors). Methods: Four large administrative datasets were individually de-duplicated, with results compared to an available 'gold-standard' benchmark, allowing us to avoid methodological issues with comparing linked and un-linked records. Results were compared by gender, age, geographic remoteness (major cities, regional or remote) and socioeconomic status. Results: Results varied between datasets, and by sociodemographic characteristic. The most consistent findings were worse linkage quality for younger individuals (seen in all four datasets) and worse linkage quality for those living in remote areas (seen in three of four datasets). Linkage quality within sociodemographic categories varied between datasets, with the direction of the association with linkage error reversed in some datasets due to quirks of the specific data collection mechanisms and data sharing practices. Conclusions: These results suggest caution should be taken both when linking younger individuals and those in remote areas, and when analysing linked data from these subgroups. Further research is required to determine the ramifications of worse linkage quality in these subpopulations on research outcomes.
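
    The comparison described in the Methods (de-duplicated output checked against a gold-standard benchmark, broken down by subgroup) can be sketched as a pairwise error count. This is only an illustrative sketch, not the authors' procedure: the record layout (`linked_id`, `true_id`, `group`), the function name and the choice to attribute a pair to each subgroup of its two records are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_error_by_group(records):
    """Pair-level linkage error per subgroup (hypothetical record layout).

    Each record is a dict with:
      'linked_id' - cluster assigned by the de-duplication run
      'true_id'   - cluster according to the gold-standard benchmark
      'group'     - sociodemographic subgroup label
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    # All pairs are compared for clarity; this is quadratic and only
    # suitable for small illustrative inputs.
    for a, b in combinations(records, 2):
        same_pred = a["linked_id"] == b["linked_id"]
        same_true = a["true_id"] == b["true_id"]
        if not (same_pred or same_true):
            continue  # correctly kept apart; not counted here
        if same_pred and same_true:
            key = "tp"  # correct link
        elif same_pred:
            key = "fp"  # false match: linked but different people
        else:
            key = "fn"  # missed match: same person left unlinked
        # Assumption: a pair counts once for each distinct subgroup involved.
        for group in {a["group"], b["group"]}:
            counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        linked = c["tp"] + c["fp"]
        true_pairs = c["tp"] + c["fn"]
        rates[group] = {
            "false_match_rate": c["fp"] / linked if linked else None,
            "missed_match_rate": c["fn"] / true_pairs if true_pairs else None,
        }
    return rates
```

    Comparing the two rates across subgroups (for example, age bands or remoteness categories) is one way to see whether error is concentrated in particular groups.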

    Record linked retrospective cohort study of 4.6 million people exploring ethnic variations in disease: myocardial infarction in South Asians

    Background: Law and policy in several countries require health services to demonstrate that they are promoting racial/ethnic equality. However, suitable and accurate data are usually not available. We demonstrated, using acute myocardial infarction, that linkage techniques can be ethical and potentially useful for this purpose. Methods: The linkage was based on probability matching. Encryption of a unique national health identifier (the Community Health Index (CHI)) ensured that information about health status and census-based ethnicity could not be ascribed to an identified individual. We linked information on individual ethnic group from the 2001 Census to Scottish hospital discharge and mortality data. Results: Overall, 94% of the 4.9 million census records were matched to a CHI record with an estimated false positive rate of less than 0.1%, with 84.9–87.6% of South Asians being successfully linked. Between April 2001 and December 2003 there were 126 first episodes of acute myocardial infarction (AMI) among South Asians and 30,978 among non-South Asians. The incidence rate ratio was 1.45 (95% CI 1.17, 1.78) for South Asian compared to non-South Asian men and 1.80 (95% CI 1.31, 2.48) for South Asian women. After adjustment for age, sex and any previous admission for diabetes the hazard ratio for death following AMI was 0.59 (95% CI 0.43, 0.81), reflecting better survival among South Asians. Conclusion: The technique met ethical, professional and legal concerns about the linkage of census and health data and is transferable internationally wherever the census (or population register) contains ethnic group or race data. The outcome is a retrospective cohort study. Our results point to increased incidence rather than increased case fatality in explaining the high CHD mortality rate. The findings open up new methods for researchers and health planners.
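
    The abstract says only that the linkage "was based on probability matching" and gives no details. As a generic illustration of how such matching is commonly scored, the sketch below uses Fellegi-Sunter style agreement weights. It is not the study's actual method: the field names and the m- and u-probabilities are hypothetical values chosen for the example.

```python
import math

# Hypothetical per-field probabilities (assumptions, not from the study):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "date_of_birth": (0.98, 0.003),
    "postcode": (0.90, 0.05),
}


def match_weight(agreements):
    """Sum log2 likelihood-ratio weights over the compared fields.

    `agreements` maps a field name to True (values agree) or False.
    Agreement adds log2(m/u); disagreement adds log2((1-m)/(1-u)).
    """
    total = 0.0
    for field, agrees in agreements.items():
        m, u = FIELD_PROBS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total
```

    Under this scheme a pair is accepted as a link when its total weight exceeds a chosen threshold; raising the threshold trades missed matches for fewer false positives.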

    On the plausibility of socioeconomic mortality estimates derived from linked data: a demographic approach.

    BACKGROUND Reliable estimates of mortality according to socioeconomic status play a crucial role in informing the policy debate about social inequality, social cohesion, and exclusion as well as about the reform of pension systems. Linked mortality data have become a gold standard for monitoring socioeconomic differentials in survival. Several approaches have been proposed to assess the quality of the linkage, in order to avoid the misclassification of deaths according to socioeconomic status. However, the plausibility of mortality estimates has never been scrutinized from a demographic perspective, and the potential problems with the quality of the data on the at-risk populations have been overlooked. METHODS Using indirect demographic estimation (i.e., the synthetic extinct generation method), we analyze the plausibility of old-age mortality estimates according to educational attainment in four European data contexts with different quality issues: deterministic and probabilistic linkage of deaths, as well as differences in the methodology of the collection of educational data. We evaluate whether the at-risk population according to educational attainment is misclassified and/or misestimated, correct these biases, and estimate the education-specific linkage rates of deaths. RESULTS The results confirm a good linkage of death records within different educational strata, even when probabilistic matching is used. The main biases in mortality estimates concern the classification and estimation of the person-years of exposure according to educational attainment. Changes in the census questions about educational attainment led to inconsistent information over time, which misclassified the at-risk population. Sample censuses also misestimated the at-risk populations according to educational attainment. 
    CONCLUSION The synthetic extinct generation method can be recommended for quality assessments of linked data because it is capable not only of quantifying linkage precision, but also of tracking problems in the population data. Rather than focusing only on the quality of the linkage, more attention should be directed towards the quality of the self-reported socioeconomic status at censuses, as well as towards the accurate estimation of the at-risk populations.

    Modeling rare gene variation to gain insight into the oldest biomarker in autism: construction of the serotonin transporter Gly56Ala knock-in mouse

    Alterations in peripheral and central indices of serotonin (5-hydroxytryptamine, 5-HT) production, storage and signaling have long been associated with autism. The 5-HT transporter gene (HTT, SERT, SLC6A4) has received considerable attention as a potential risk locus for autism-spectrum disorders, as well as disorders with overlapping symptoms, including obsessive-compulsive disorder (OCD). Here, we review our efforts to characterize rare, nonsynonymous polymorphisms in SERT derived from multiplex pedigrees carrying diagnoses of autism and OCD and present the initial stages of our effort to model one of these variants, Gly56Ala, in vivo. We generated a targeting vector to produce the Gly56Ala substitution in the Slc6a4 locus by homologous recombination. Following removal of a neomycin resistance selection cassette, animals exhibiting germline transmission of the Ala56 variant were bred to establish a breeding colony on a 129S6 background, suitable for initial evaluation of biochemical, physiological and behavioral alterations relative to SERT Gly56 (wildtype) animals. SERT Ala56 mice were obtained and exhibit a normal pattern of transmission. The initial growth and gross morphology of these animals are comparable to those of wildtype littermate controls. The SERT Ala56 variant can be propagated in 129S6 mice without apparent disruption of fertility and growth. We discuss both the opportunities and challenges that await the physiological/behavioral analysis of Gly56Ala transgenic mice, with particular reference to modeling autism-associated traits.