65 research outputs found

    Assessing record linkage between health care and Vital Statistics databases using deterministic methods

    BACKGROUND: We assessed the linkage and correct linkage rate using deterministic record linkage among three commonly used Canadian databases, namely, the population registry, hospital discharge data and Vital Statistics registry. METHODS: Three combinations of four personal identifiers (surname, first name, sex and date of birth) were used to determine the optimal combination. The correct linkage rate was assessed using a unique personal health number available in all three databases. RESULTS: Among the three combinations, the combination of surname, sex, and date of birth had the highest linkage rate of 88.0% and 93.1%, and the second highest correct linkage rate of 96.9% and 98.9% between the population registry and Vital Statistics registry, and between the hospital discharge data and Vital Statistics registry in 2001, respectively. Adding the first name to the combination of the three identifiers above increased correct linkage by less than 1%, but at the cost of lowering the linkage rate almost by 10%. CONCLUSION: Our findings suggest that the combination of surname, sex and date of birth appears to be optimal using deterministic linkage. The linkage and correct linkage rates appear to vary by age and the type of database, but not by sex
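As a rough illustration of the method this abstract describes, the sketch below links two toy files on an exact match of surname, sex and date of birth, then uses a shared personal health number to check which links are correct. The field names (`surname`, `sex`, `dob`, `phn`), the toy records, the rule of accepting only unambiguous matches, and the choice of denominators for the two rates are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict

def link_key(rec):
    # Exact-match key: surname (case and whitespace normalised), sex, date of birth.
    return (rec["surname"].strip().lower(), rec["sex"], rec["dob"])

def deterministic_link(file_a, file_b):
    """Return (a, b) pairs whose keys agree exactly and match one record in file_b."""
    index = defaultdict(list)
    for b in file_b:
        index[link_key(b)].append(b)
    pairs = []
    for a in file_a:
        candidates = index.get(link_key(a), [])
        if len(candidates) == 1:  # assumption: accept only unambiguous matches
            pairs.append((a, candidates[0]))
    return pairs

def linkage_rates(file_a, file_b):
    """Linkage rate = linked / records in file_a; correct rate = same PHN / linked."""
    pairs = deterministic_link(file_a, file_b)
    linked = len(pairs)
    correct = sum(1 for a, b in pairs if a["phn"] == b["phn"])
    linkage_rate = linked / len(file_a)
    correct_rate = correct / linked if linked else 0.0
    return linkage_rate, correct_rate

# Hypothetical toy records, not data from the study.
deaths = [
    {"surname": "Smith", "sex": "F", "dob": "1950-01-02", "phn": "1"},
    {"surname": "Jones", "sex": "M", "dob": "1940-05-06", "phn": "2"},
    {"surname": "Lee",   "sex": "F", "dob": "1960-07-08", "phn": "3"},
    {"surname": "Brown", "sex": "M", "dob": "1970-09-10", "phn": "4"},
]
registry = [
    {"surname": "smith ", "sex": "F", "dob": "1950-01-02", "phn": "1"},  # correct link
    {"surname": "Jones",  "sex": "M", "dob": "1940-05-06", "phn": "9"},  # false link
    {"surname": "Lee",    "sex": "F", "dob": "1960-07-09", "phn": "3"},  # DOB differs
]

rates = linkage_rates(deaths, registry)
print(rates)
```

Adding first name to `link_key` would make the key stricter, which is the trade-off the abstract reports: slightly more correct links among those made, but fewer links overall.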

    Mortality following Campylobacter infection: a registry-based linkage study

    BACKGROUND: Campylobacteriosis is one of the most commonly identified causes of bacterial diarrheal disease and a common cause of gastroenteritis in travellers from developed nations. Despite the widespread occurrence, there is little information on Campylobacter mortality. METHODS: Mortality among a cohort of Campylobacter cases was compared with that of the general population 0–1, 1–3, 3–12 and more than 12 months after the onset of the illness. The cases were sub-grouped according to whether they had been infected domestically or abroad. RESULTS: The standardized mortality ratio for cases infected domestically was 2.9 (95% CI: 1.9–4.0) within the first month following the illness. The risk then gradually diminished and approached 1.0 once a year or more had passed since the illness. This initial excess risk was not attributable to any particular age group (such as the oldest). In contrast, for those infected abroad, the standardized mortality ratio for the first month after diagnosis was 0.3 (95% CI: 0.04–0.8), lower than would be expected in the general population. CONCLUSION: Infection with Campylobacter is associated with an increased short-term risk of death among those who were infected domestically. In contrast, for those infected abroad a lower than expected risk of death was evident. We suggest that the explanation behind this is a "healthy traveler effect" among imported cases, and the effects of a frailer than average population among domestic cases.

    Quality and complexity measures for data linkage and deduplication

    Summary. Deduplicating one data set or linking several data sets are increasingly important tasks in the data preparation steps of many data mining projects. The aim of such linkages is to match all records relating to the same entity. Research interest in this area has increased in recent years, with techniques originating from statistics, machine learning, information retrieval, and database research being combined and applied to improve the linkage quality, as well as to increase performance and efficiency when linking or deduplicating very large data sets. Different measures have been used to characterise the quality and complexity of data linkage algorithms, and several new metrics have been proposed. An overview of the issues involved in measuring data linkage and deduplication quality and complexity is presented in this chapter. It is shown that measures in the space of record pair comparisons can produce deceptive quality results. Various measures are discussed and recommendations are given on how to assess data linkage and deduplication quality and complexity. Key words: data or record linkage, data integration and matching, deduplication, data mining pre-processing, quality and complexity measures
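One way a measure defined over the space of record pair comparisons can mislead is sketched below. This is an illustrative reading of the abstract's claim, not the chapter's own example, and the counts are hypothetical. When two files of 1,000 records are compared, almost all of the 1,000,000 candidate pairs are true non-matches, so any measure that counts true negatives looks excellent even when the linkage itself is mediocre.

```python
def pair_measures(tp, fp, fn, tn):
    """Standard confusion-matrix measures computed over record pairs."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical scenario: two files of 1,000 records each, so 1,000,000 pairs,
# of which 1,000 are true matches (assuming one-to-one matching).
n_a = n_b = 1000
tp, fp, fn = 600, 400, 400        # assumed classifier outcome
tn = n_a * n_b - tp - fp - fn     # every remaining pair is a true non-match

measures = pair_measures(tp, fp, fn, tn)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

In this toy setting accuracy exceeds 0.999 while precision and recall are both 0.6, which is why measures that ignore true negatives are often preferred when judging linkage quality.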

    Population-based incidence and 5-year survival for hospital-admitted traumatic brain and spinal cord injury, Western Australia, 2003-2008

    This study aimed at analysing first-time hospitalisations for traumatic brain injury (TBI) and spinal cord injury (SCI) in Western Australia (WA), in terms of socio-demographic profile, cause of injury, relative risks and survival, using tabular and regression analyses of linked hospital discharge and mortality census files and comparing results with published standardised mortality rates (SMRs) for TBI. Participants were all 9,114 first hospital admissions for TBI or SCI from 7/2003 to 6/2008, linked to mortality census data through 12/2008, and the main outcome measures were number of cases by cause, SMRs in hospital and post-discharge by year through year 5. Road crashes accounted for 34 % of hospitalised TBI and 52 % of hospitalised SCI. 8,460 live TBI discharges experienced 580 deaths during 24,494 person-years of follow-up. The life-table expectation of deaths in the cohort was 164. Post-discharge SMRs were 7.66 in year 1, 3.86 in year 2 and averaged 2.31 in years 3 through 5. 317 live SCI discharges experienced 18 deaths during 929 years of follow-up. Post-discharge SMRs were 7.36 in year 1 and a fluctuating average of 2.13 in years 2 through 5. Use of data from model systems does not appear to yield biased SMRs. Similarly no systematic variation was observed between all-age studies and the more numerous studies that focused on those aged 14 to 16 and older. Based on two studies, SMRs for TBI, however, may be higher in year 2 post-discharge in Australia than elsewhere. That possibility and its cause warrant exploration. Expanding public TBI/SCI compensation in WA from road crash to all causes might triple TBI compensation and double SCI compensation
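A standardised mortality ratio is the number of observed deaths divided by the number expected from population life tables. The short sketch below applies that definition to the TBI counts quoted in the abstract (580 deaths, 164 expected, 24,494 person-years). The resulting overall figures are derived here for illustration and are not values reported in the abstract, and the sketch assumes that the life-table expectation of 164 refers to the TBI discharge cohort.

```python
def smr(observed_deaths, expected_deaths):
    """Standardised mortality ratio: observed deaths over expected deaths."""
    return observed_deaths / expected_deaths

def crude_rate_per_1000(deaths, person_years):
    """Crude death rate per 1,000 person-years of follow-up."""
    return 1000 * deaths / person_years

# Counts quoted in the abstract for live TBI discharges.
tbi_observed = 580
tbi_expected = 164        # assumed to be the expectation for this same cohort
tbi_person_years = 24494

overall_smr = smr(tbi_observed, tbi_expected)
overall_rate = crude_rate_per_1000(tbi_observed, tbi_person_years)
print(f"overall SMR: {overall_smr:.2f}")
print(f"crude rate per 1,000 person-years: {overall_rate:.2f}")
```

An overall ratio of about 3.5 sits between the reported year 1 value (7.66) and the year 3 to 5 average (2.31), as would be expected for a figure pooled over the whole follow-up period.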

    A proposed architecture and method of operation for improving the protection of privacy and confidentiality in disease registers

    BACKGROUND: Disease registers aim to collect information about all instances of a disease or condition in a defined population of individuals. Traditionally, methods of operating disease registers have required that notifications of cases be identified by unique identifiers such as social security number or national identification number, or by ensembles of non-unique identifying data items, such as name, sex and date of birth. However, growing concern over the privacy and confidentiality aspects of disease registers may hinder their future operation. Technical solutions to these legitimate concerns are needed. DISCUSSION: An alternative method of operation is proposed which involves splitting the personal identifiers from the medical details at the source of notification, and separately encrypting each part using asymmetrical (public key) cryptographic methods. The identifying information is sent to a single Population Register, and the medical details to the relevant disease register. The Population Register uses probabilistic record linkage to assign a unique personal identification (UPI) number to each person notified to it, although not necessarily everyone in the entire population. This UPI is shared only with a single trusted third party whose sole function is to translate between this UPI and separate series of personal identification numbers which are specific to each disease register. SUMMARY: The system proposed would significantly improve the protection of privacy and confidentiality, while still allowing the efficient linkage of records between disease registers, under the control and supervision of the trusted third party and independent ethics committees. The proposed architecture could accommodate genetic databases and tissue banks as well as a wide range of other health and social data collections. It is important that proposals such as this are subject to widespread scrutiny by information security experts, researchers and interested members of the general public alike.

    Event-based record linkage in health and aged care services data: a methodological innovation

    Background: The interface between acute hospital care and residential aged care has long been recognised as an important issue in aged care services research in Australia. However, existing national data provide very poor information on the movements of clients between the two sectors. Nevertheless, there are national data sets which separately contain data on individuals' hospital episodes and stays in residential aged care, so that linking the two data sets, if feasible, would provide a valuable resource for examining relationships between the two sectors. As neither name nor common person identifiers are available on the data sets, other information needs to be used to link events relating to inter-sector movement. Methods: Event-based matching using limited demographic data in conjunction with event dates to match events in two data sets provides a possible method for linking related events. The authors develop a statistical model for examining the likely prevalence of false matches, and consequently the number of true matches, among achieved matches when using anonymous event-based record linkage to identify transition events. Results: Theoretical analysis shows that for event-based matching the prevalence of false matches among achieved matches (a) declines as the events of interest become rarer, (b) declines as the number of matches increases, and (c) increases with the size of the population within which matching is taking place. The method also facilitates the examination of the trade-off between false matches and missed matches when relaxing or tightening linkage criteria. Conclusion: Event-based record linkage is a method for linking related transition events using event dates and basic demographic variables (other than name or person identifier). The likely extent of false links among achieved links depends on the two event rates, the match rate and population size. Knowing these, it is possible to gauge whether, for a particular study, event-based linkage could provide a useful tool for examining movements. Analysis shows that there is a range of circumstances in which event-based record linkage could be applied to two event-level databases to generate a linked database useful for transition analysis.
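The matching step described in this abstract can be sketched as follows: a hospital discharge is paired with an aged care admission when basic demographic variables agree and the two event dates are close enough. The field names, the choice of sex, date of birth and region as matching variables, and the `max_gap_days` tolerance are hypothetical choices for illustration. The authors' statistical model for the prevalence of false matches is not reproduced here.

```python
from datetime import date

def event_matches(hospital_events, aged_care_events, max_gap_days=0):
    """Pair discharges with admissions that share demographics and nearby dates."""
    matches = []
    for h in hospital_events:
        for r in aged_care_events:
            same_profile = (
                h["sex"] == r["sex"]
                and h["dob"] == r["dob"]
                and h["region"] == r["region"]
            )
            gap = (r["admission_date"] - h["discharge_date"]).days
            if same_profile and 0 <= gap <= max_gap_days:
                matches.append((h["event_id"], r["event_id"]))
    return matches

# Hypothetical toy events; no names or person identifiers are used.
hospital = [
    {"event_id": "H1", "sex": "F", "dob": date(1920, 3, 4), "region": "A",
     "discharge_date": date(2001, 5, 10)},
    {"event_id": "H2", "sex": "M", "dob": date(1918, 7, 9), "region": "B",
     "discharge_date": date(2001, 6, 1)},
]
aged_care = [
    {"event_id": "R1", "sex": "F", "dob": date(1920, 3, 4), "region": "A",
     "admission_date": date(2001, 5, 10)},
    {"event_id": "R2", "sex": "M", "dob": date(1918, 7, 9), "region": "B",
     "admission_date": date(2001, 6, 3)},
]

strict = event_matches(hospital, aged_care, max_gap_days=0)
relaxed = event_matches(hospital, aged_care, max_gap_days=2)
print(strict, relaxed)
```

Widening `max_gap_days` recovers the second transition that the strict rule misses, at the price of admitting more coincidental matches in a real population; this is the trade-off between missed and false matches that the abstract mentions.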

    A randomised controlled trial to compare opt-in and opt-out parental consent for childhood vaccine safety surveillance using data linkage: study protocol

    Background: The Vaccine Assessment using Linked Data (VALiD) trial compared opt-in and opt-out parental consent for a population-based childhood vaccine safety surveillance program using data linkage. A subsequent telephone interview of all households enrolled in the trial elicited parental intent regarding the return or non-return of reply forms for opt-in and opt-out consent. This paper describes the rationale for the trial and provides an overview of the design and methods. Methods/Design: Single-centre, single-blind, randomised controlled trial (RCT) stratified by firstborn status. Mothers who gave birth at one tertiary South Australian hospital were randomised at six weeks post-partum to receive an opt-in or opt-out reply form, along with information explaining data linkage. The primary outcome at 10 weeks post-partum was parental participation in each arm, as indicated by the respective return or non-return of a reply form (or via telephone or email response). A subsequent telephone interview at 10 weeks post-partum elicited parental intent regarding the return or non-return of the reply form, and attitudes and knowledge about data linkage, vaccine safety, consent preferences and vaccination practices. Enrolment began in July 2009 and 1,129 households were recruited in a three-month period. Analysis has not yet been undertaken. The participation rate and selection bias for each method of consent will be compared when the data are analysed. Discussion: The VALiD RCT represents the first trial of opt-in versus opt-out consent for a data linkage study that assesses consent preferences and intent compared with actual opting in or opting out behaviour, and socioeconomic factors. The limitations to generalisability are discussed.
    Authors: Jesia G Berry, Philip Ryan, Annette J Braunack-Mayer, Katherine M Duszynski, Vicki Xafis, Michael S Gold, the Vaccine Assessment Using Linked Data (VALiD) Working Group

    The funding and use of high-cost medicines in Australia: the example of anti-rheumatic biological medicines

    BACKGROUND: Subsidised access to high-cost medicines in Australia is restricted under national programs (the Pharmaceutical Benefits Scheme, PBS, and the Repatriation Pharmaceutical Benefits Scheme, RPBS) with a view to achieving cost-effective use. The aim of this study was to examine the use and associated government cost of biological agents for treating rheumatoid arthritis over the first two years of subsidy, and to compare these data to the predicted outcomes. METHODS: National prescription and expenditure data for the biologicals, etanercept, infliximab, adalimumab, and anakinra were collected and analysed for the period August 2003 to July 2005. Dispensing data on biologicals sorted by the metropolitan, rural and remote zones and by prescriber major specialty were also examined. RESULTS: A total of 27,970 prescriptions for biologicals was reimbursed. The government expenditure was A$53.1 million, representing only 19% of that expected. Almost all prescriptions were reimbursed by the PBS (98%, A$52 million) and the remainder by the RPBS. Approximately 62% of the prescriptions were for concessional patients (A$32.9 million). There was considerable variability in the use of biologicals across Australian states and territories, usage roughly correlating with the per capita adjusted number of rheumatologists. The total number of prescriptions continued to increase over the study period. Etanercept was the most highly prescribed agent (74% by number of prescriptions), although its use was beginning to plateau. Use of adalimumab increased steadily. Use of infliximab and anakinra was considerably lower. The resultant health outcomes for individual patients are unknown. Prescribers from capital cities and other metropolitan centres provided a majority of prescriptions of biologicals (89%). CONCLUSION: The overall uptake of biologicals for treating rheumatoid arthritis over the first two years of PBS subsidy was considerably lower than expected. Long-term safety concerns and the expanded clinical uses of these drugs emphasise the need for evaluation. It is essential that there is comprehensive, ongoing analysis of utilisation data, associated expenditure and, importantly, patient outcomes in order to enhance accountability, efficiency and equity of policies that allocate substantial resources to subsidising national access to high-cost medicines.