
    Are routinely collected NHS administrative records suitable for endpoint identification in clinical trials? Evidence from the West of Scotland coronary prevention study

    Background: Routinely collected electronic patient records are already widely used in epidemiological research. In this work we investigated the potential for using them to identify endpoints in clinical trials. Methods: The events recorded in the West of Scotland Coronary Prevention Study (WOSCOPS), a large clinical trial of pravastatin in middle-aged hypercholesterolaemic men in the 1990s, were compared with those in the record-linked deaths and hospitalisations records routinely collected in Scotland. Results: We matched 99% of fatal study events by date. We showed excellent matching (97%) of the causes of fatal endpoint events and good matching (>80% for first events) of the causes of nonfatal endpoint events with a slightly lower rate of mismatching of record linkage than study events (19% of first study myocardial infarctions (MI) and 4% of first record linkage MIs not matched as MI). We also investigated the matching of non-endpoint events and showed a good level of matching, with >78% of first stroke/TIA events being matched as stroke/TIA. The primary reasons for mismatches were record linkage data recording readmissions for procedures or previous events, differences between the diagnoses in the routinely collected data and the conclusions of the clinical trial expert adjudication committee, events occurring outside Scotland and therefore being missed by record linkage data, miscoding of cardiac events in hospitalisations data as ‘unspecified chest pain’, some general miscoding in the record linkage data and some record linkage errors. Conclusions: We conclude that routinely collected data could be used for recording cardiovascular endpoints in clinical trials and would give very similar results to rigorously collected clinical trial data, in countries with unified health systems such as Scotland. The endpoint types would need to be carefully thought through and an expert endpoint adjudication committee should be involved.

    Privacy and Confidentiality in an e-Commerce World: Data Mining, Data Warehousing, Matching and Disclosure Limitation

    The growing expanse of e-commerce and the widespread availability of online databases raise many fears regarding loss of privacy and many statistical challenges. Even with encryption and other nominal forms of protection for individual databases, we still need to protect against the violation of privacy through linkages across multiple databases. These issues parallel those that have arisen and received some attention in the context of homeland security. Following the events of September 11, 2001, there has been heightened attention in the United States and elsewhere to the use of multiple government and private databases for the identification of possible perpetrators of future attacks, as well as an unprecedented expansion of federal government data mining activities, many involving databases containing personal information. We present an overview of some proposals that have surfaced for the search of multiple databases which supposedly do not compromise possible pledges of confidentiality to the individuals whose data are included. We also explore their link to the related literature on privacy-preserving data mining. In particular, we focus on the matching problem across databases and the concept of "selective revelation" and their confidentiality implications. Comment: Published at http://dx.doi.org/10.1214/088342306000000240 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    VetCompass Australia: A National Big Data Collection System for Veterinary Science

    VetCompass Australia is veterinary medical records-based research coordinated with the global VetCompass endeavor to maximize its quality and effectiveness for Australian companion animals (cats, dogs, and horses). Bringing together all seven Australian veterinary schools, it is the first nationwide surveillance system collating clinical records on companion-animal diseases and treatments. The VetCompass data service collects and aggregates real-time clinical records for researchers to interrogate, delivering sustainable and cost-effective access to data from hundreds of veterinary practitioners nationwide. Analysis of these clinical records will reveal geographical and temporal trends in the prevalence of inherited and acquired diseases, identify frequently prescribed treatments, revolutionize clinical auditing, help the veterinary profession to rank research priorities, and assure evidence-based companion-animal curricula in veterinary schools. VetCompass Australia will progress in three phases: (1) roll-out of the VetCompass platform to harvest Australian veterinary clinical record data; (2) development and enrichment of the coding (data-presentation) platform; and (3) creation of a world-first, real-time surveillance interface with natural language processing (NLP) technology. The first of these three phases is described in the current article. Advances in the collection and sharing of records from numerous practices will enable veterinary professionals to deliver a vastly improved level of care for companion animals that will improve their quality of life.

    Assessing the disclosure protection provided by misclassification for survey microdata

    Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect confidentiality. There is a need for ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are applied in numerical experiments based upon census data from the United Kingdom which are subject to two perturbation techniques: data swapping and the post randomisation method. Some simplifying approximations to the measure of risk are found to work well in capturing the impacts of these techniques. These approximations provide simple extensions of existing risk assessment methods based upon Poisson log-linear models. A numerical experiment is also undertaken to assess the impact of multivariate misclassification with an increasing number of identifying variables. The methods developed in this paper may also be used to obtain more realistic assessments of risk which take account of the kinds of measurement and other non-sampling errors commonly arising in surveys
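The post randomisation method (PRAM) named in the abstract above can be illustrated with a minimal sketch. It assumes the usual formulation in which each categorical value is independently replaced according to a known transition matrix; the function name `pram`, the dictionary layout of the matrix, and the example categories are all hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Sequence


def pram(values: Sequence[str],
         transition: Dict[str, Dict[str, float]],
         rng: random.Random) -> List[str]:
    """Perturb categorical values with a PRAM-style transition matrix.

    transition[k][l] is the assumed probability that an original
    category k is released as category l; each row should sum to 1.
    """
    perturbed = []
    for value in values:
        row = transition[value]
        categories = list(row)
        weights = [row[c] for c in categories]
        # Draw the released category for this record independently.
        perturbed.append(rng.choices(categories, weights=weights, k=1)[0])
    return perturbed


# Hypothetical example: 90% of values are kept, 10% are misclassified.
matrix = {
    "urban": {"urban": 0.9, "rural": 0.1},
    "rural": {"urban": 0.1, "rural": 0.9},
}
released = pram(["urban", "rural", "urban", "urban"], matrix, random.Random(42))
print(released)
```

Because the transition probabilities are known to the agency, an analyst can in principle correct aggregate estimates for the misclassification, while an intruder cannot be sure that any released value is the true one.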

    Optimal assignment problem on record linkage

    We present an application of the Hungarian Method, an optimal assignment graph theory algorithm, to record linkage in order to improve the disclosure risk assessment. Note that the Hungarian Method has O(n^3) complexity; three different methods are presented to reduce its computational cost.
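As a rough illustration of the idea in the abstract above, the sketch below links original records to masked records by solving an assignment problem with a textbook O(n^3) Hungarian algorithm (the potentials formulation). The absolute-difference cost, the toy data, and the names `hungarian`, `original`, `masked` and `origin` are assumptions made for illustration; the paper's own distance measure and its three cost-reduction methods are not reproduced here.

```python
from typing import List, Sequence, Tuple


def hungarian(cost: Sequence[Sequence[float]]) -> Tuple[float, List[int]]:
    """Solve a square assignment problem in O(n^3).

    Returns (minimum total cost, assignment), where assignment[i] is the
    column matched to row i.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)    # row potentials (1-indexed)
    v = [0.0] * (n + 1)    # column potentials (1-indexed)
    match = [0] * (n + 1)  # match[j] = row matched to column j, 0 if none
    way = [0] * (n + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    reduced = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if reduced < minv[j]:
                        minv[j] = reduced
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[match[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


# Toy data: masked[j] is a perturbed copy of original[origin[j]].
original = [10, 20, 30]
masked = [21, 12, 29]
origin = [1, 0, 2]

cost = [[abs(o - m) for m in masked] for o in original]
total, links = hungarian(cost)
correct = sum(1 for i, j in enumerate(links) if origin[j] == i)
print(total, links, correct / len(original))  # 4 [1, 0, 2] 1.0
```

The share of correct links is one simple disclosure risk measure. Unlike matching each record to its nearest neighbour independently, the assignment formulation forces a one-to-one linkage, so no masked record is claimed by two originals.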