27 research outputs found

    A Profile of the SAIL Databank on the UK Secure Research Platform

    Background: The Secure Anonymised Information Linkage (SAIL) Databank is a national data safe haven of de‑identified datasets principally about the population of Wales, made available in anonymised form to researchers across the world. It was established to enable the vast arrays of data collected about individuals in the course of health and other public service delivery to be made available to answer important questions that could not otherwise be addressed without prohibitive effort. The SAIL Databank is the bedrock of other funded centres relying on the data for research. Approach: SAIL is a data repository surrounded by a suite of physical, technical and procedural control measures embodying a proportionate privacy-by-design governance model, informed by public engagement, to safeguard the data and facilitate data utility. SAIL operates on the UK Secure Research Platform (SeRP), which is a customisable technology and analysis platform. Researchers access anonymised data via this secure research environment, from which results can be released following scrutiny for disclosure risk. SAIL data are being used in multiple research areas to evaluate the impact of health and social exposures and policy interventions. Discussion: Lessons learned and their applications include: managing evolving legislative and regulatory requirements; employing multiple, tiered security mechanisms; working hard to increase analytical capacity efficiency; and developing a multi-faceted programme of public engagement. Further work includes: incorporating new data types; enabling alternative means of data access; and developing further efficiencies across our operations. Conclusion: SAIL represents an ongoing programme of work to develop and maintain an extensive, whole population data resource for research. Its privacy-by-design model and UK SeRP technology have received international acclaim, and we continually endeavour to demonstrate trustworthiness to support data provider assurance and public acceptability in data use. We strive for further improvement and continue a mutual learning process with our contemporaries in this rapidly developing field.

    Markup: A Web-Based Annotation Tool Powered by Active Learning

    Across various domains, such as health and social care, law, news, and social media, increasing quantities of unstructured text are being produced. These potential data sources often contain rich information that could be used for domain-specific and research purposes. However, the unstructured nature of free-text data poses a significant challenge for its utilisation, because substantial manual intervention from domain experts is needed to label embedded information. Annotation tools can assist with this process by providing functionality that enables the accurate capture and transformation of unstructured texts into structured annotations, which can be used individually or as part of larger Natural Language Processing (NLP) pipelines. We present Markup (https://www.getmarkup.com/), an open-source, web-based annotation tool that is undergoing continued development for use across all domains. Markup incorporates NLP and Active Learning (AL) technologies to enable rapid and accurate annotation using custom user configurations, predictive annotation suggestions, and automated mapping suggestions to both domain-specific ontologies, such as the Unified Medical Language System (UMLS), and custom, user-defined ontologies. We demonstrate a real-world use case of how Markup has been used in a healthcare setting to annotate structured information from unstructured clinic letters, where captured annotations were used to build and test NLP applications.

    A population level study into health vulnerabilities of mothers and fathers involved in public law care proceedings in Wales, UK between 2011 and 2019

    Introduction: Under section 31 of the Children Act 1989, public law care proceedings can be issued if there is concern a child is subject to, or at risk of, significant harm, which can lead to removal of a child from parents. Appropriate and effective health and social support are required to potentially prevent some of the need for these proceedings. More comprehensive evidence of the health needs and vulnerabilities of parents will enable an enhanced response from family courts and other integrated services. Objective: To examine health vulnerabilities of parents involved in care proceedings in the two-year period prior to involvement. Methods: Family court data provided by Cafcass Cymru were linked to population-based health records held within the Secure Anonymised Information Linkage Databank. Linked data were available for 8,821 parents of children involved in care proceedings between 2011 and 2019. Findings were benchmarked with reference to a comparison group of parents matched on sex, age, and deprivation (n = 32,006), not subject to care proceedings. Demographic characteristics, overall health service use, and health profiles of parents were examined. Descriptive statistics and statistical tests of independence were used. Results: Nearly half of cohort parents (47.6%) resided in the most deprived quintile. They had higher levels of healthcare use compared to the comparison group across multiple healthcare settings, with the most pronounced differences for emergency department attendances (59.3% vs 37.0%). Health conditions with the largest variation between groups were related to mental health (43.6% vs 16.0%), substance use (19.4% vs 1.6%) and injuries (41.5% vs 23.6%). Conclusion: This study highlights the heightened socioeconomic and health vulnerabilities of parents who experience care proceedings concerning a child. Better understanding of the needs and vulnerabilities of this population may provide opportunities to improve a range of support and preventative interventions that respond to crises in the community.

    Genetic influences on epilepsy outcomes: a whole‐exome sequencing and healthcare records data linkage study

    Objective: This study was undertaken to develop a novel pathway linking genetic data with routinely collected data for people with epilepsy, and to analyze the influence of rare, deleterious genetic variants on epilepsy outcomes. Methods: We linked whole-exome sequencing (WES) data with routinely collected primary and secondary care data and natural language processing (NLP)-derived seizure frequency information for people with epilepsy within the Secure Anonymised Information Linkage Databank. The study participants were adults who had consented to participate in the Swansea Neurology Biobank, Wales, between 2016 and 2018. DNA sequencing was carried out as part of the Epi25 collaboration. For each individual, we calculated the total number and cumulative burden of rare and predicted deleterious genetic variants and the total of rare and deleterious variants in epilepsy and drug metabolism genes. We compared these measures with the following outcomes: (1) no unscheduled hospital admissions versus unscheduled admissions for epilepsy, (2) antiseizure medication (ASM) monotherapy versus polytherapy, and (3) at least 1 year of seizure freedom versus <1 year of seizure freedom. Results: We linked genetic data for 107 individuals with epilepsy (52% female) to electronic health records. Twenty-six percent had unscheduled hospital admissions, and 70% were prescribed ASM polytherapy. Seizure frequency information was linked for 100 individuals, and 10 were seizure-free. There was no significant difference between the outcome groups in terms of the exome-wide and gene-based burden of rare and deleterious genetic variants. Significance: We successfully uploaded, annotated, and linked genetic sequence data and NLP-derived seizure frequency data to anonymized health care records in this proof-of-concept study. We did not detect a genetic influence on real-world epilepsy outcomes, but our study was limited by a small sample size. 
Future studies will require larger WES datasets to establish the contribution of genetic variants to epilepsy outcomes.

    A case study in distributed team science in research using electronic health records

    Introduction: Investigating the safety of the new non-vitamin K Target Specific Oral Anticoagulants (TSOAC) in people who have had an intracranial haemorrhage required large numbers and data from multiple countries in a European study. To support this scientific research project, we report our approach and success in rapidly replicating datasets and analyses across Wales and Scotland. Objective: To develop an approach to rapidly replicate analyses and data which is reproducible and scalable, as an option towards development of an infrastructure that allows for and supports cross-country research within the UK/EU using Electronic Health Records (EHRs). Methods: Advantages and disadvantages of five potential approaches we considered are summarised. The Welsh study cohort was generated by linking various datasets held in the Secure Anonymised Information Linkage (SAIL) Databank in Swansea using data linkage techniques. The Scottish study cohort was generated by linking relevant datasets held in multiple data warehouses and brought to the Scottish National Data Safe Haven. Analysts based in Swansea and Edinburgh gained simultaneous access to both data safe havens, which allowed for real-time viewing and creation of analytical code. A detailed comparison between Welsh and Scottish data was conducted on the relevant datasets in this project. A set of high-level results was combined between study cohorts in Wales and Scotland. Results: The study cohort included pseudonymised information on 2,676 individuals in Wales and 4,153 in Scotland, 6,829 in total. A common R code script has been produced to harmonise individual data and outputs, which can be applied to a wide range of scientific projects under cross-centre working requirements. Conclusion: The approach we adopted is the simplest, yet a very efficient and cost-effective, method to ensure consistency in analysis and coherence with the governance systems of both Welsh and Scottish safe havens. It can also be considered a first step towards developing infrastructure to support research using EHRs across the UK and EU.

    The SOLAS air-sea gas exchange experiment (SAGE) 2004

    Author Posting. © The Author(s), 2010. This is the author's version of the work. It is posted here by permission of Elsevier B.V. for personal use, not for redistribution. The definitive version was published in Deep Sea Research Part II: Topical Studies in Oceanography 58 (2011): 753-763, doi:10.1016/j.dsr2.2010.10.015. The SOLAS air-sea gas exchange experiment (SAGE) was a multiple-objective study investigating gas-transfer processes and the influence of iron fertilisation on biologically driven gas exchange in high-nitrate low-silicic acid low-chlorophyll (HNLSiLC) Sub-Antarctic waters characteristic of the expansive Subpolar Zone of the southern oceans. This paper provides a general introduction and summary of the main experimental findings. The release site was selected from a pre-voyage desktop study of environmental parameters to be in the south-west Bounty Trough (46.5°S 172.5°E) to the south-east of New Zealand and the experiment conducted between mid-March and mid-April 2004. In common with other mesoscale iron addition experiments (FeAX’s), SAGE was designed as a Lagrangian study quantifying key biological and physical drivers influencing the air-sea gas exchange processes of CO2, DMS and other biogenic gases associated with an iron-induced phytoplankton bloom. A dual tracer SF6/3He release enabled quantification of both the lateral evolution of a labelled volume (patch) of ocean and the air-sea tracer exchange at the 10’s of km’s scale, in conjunction with the iron fertilisation. Estimates from the dual-tracer experiment found a quadratic dependency of the gas exchange coefficient on windspeed that is widely applicable and describes air-sea gas exchange in strong wind regimes. 
Within the patch, local and micrometeorological gas exchange process studies (100 m scale) and physical variables such as near-surface turbulence, temperature microstructure at the interface, wave properties, and wind speed were quantified to further assist the development of gas exchange models for high-wind environments. There was a significant increase in the photosynthetic competence (Fv/Fm) of resident phytoplankton within the first day following iron addition, but in contrast to other FeAX’s, rates of net primary production and column-integrated chlorophyll a concentrations had only doubled relative to the unfertilised surrounding waters by the end of the experiment. After 15 days and four iron additions totalling 1.1 tonne Fe2+, this was a very modest response compared to the other mesoscale iron enrichment experiments. An investigation of the factors limiting bloom development considered co-limitation by light and other nutrients, the phytoplankton seed-stock and grazing regulation. Whilst incident light levels and the initial Si:N ratio were the lowest recorded in all FeAX’s to date, there was only a small seed-stock of diatoms (less than 1% of biomass) and the main response to iron addition was by the picophytoplankton. A high rate of dilution of the fertilised patch relative to phytoplankton growth rate, the greater than expected depth of the surface mixed layer and microzooplankton grazing were all considered as factors that prevented significant biomass accumulation. In line with the limited response, the enhanced biological draw-down of pCO2 was small and masked by a general increase in pCO2 due to mixing with higher pCO2 waters. The DMS precursor DMSP was kept in check through grazing activity and in contrast to most FeAX’s dissolved dimethylsulfide (DMS) concentration declined through the experiment. SAGE is an important low-end member in the range of responses to iron addition in FeAX’s. 
In the context of iron fertilisation as a geoengineering tool for atmospheric CO2 removal, SAGE has clearly demonstrated that a significant proportion of the low iron ocean may not produce a phytoplankton bloom in response to iron addition. SAGE was jointly funded through the New Zealand Foundation for Research, Science and Technology (FRST) programs (C01X0204) "Drivers and Mitigation of Global Change" and (C01X0223) "Ocean Ecosystems: Their Contribution to NZ Marine Productivity." Funding was also provided for specific collaborations by the US National Science Foundation from grants OCE-0326814 (Ward), OCE-0327779 (Ho), and OCE 0327188 OCE-0326814 (Minnett) and the UK Natural Environment Research Council NER/B/S/2003/00282 (Archer). The New Zealand International Science and Technology (ISAT) linkages fund provided additional funding (Archer and Ziolkowski), and the many collaborator institutions also provided valuable support.