5 research outputs found

    DataSHIELD: taking the analysis to the data, not the data to the analysis

    Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But the pooling of information from individuals in a central database that may be queried by researchers raises important ethico-legal questions and can be controversial. In the UK this has been highlighted by recent debate and controversy relating to the UK's proposed 'care.data' initiative, and these issues reflect important societal and professional concerns about privacy, confidentiality and intellectual property. DataSHIELD provides a novel technological solution that can circumvent some of the most basic challenges in facilitating the access of researchers and other healthcare professionals to individual-level data. Commands are sent from a central analysis computer (AC) to several data computers (DCs) storing the data to be co-analysed. The data sets are analysed simultaneously but in parallel. The separate parallelized analyses are linked by non-disclosive summary statistics and commands transmitted back and forth between the DCs and the AC. This paper describes the technical implementation of DataSHIELD using a modified R statistical environment linked to an Opal database deployed behind the computer firewall of each DC. Analysis is controlled through a standard R environment at the AC. Based on this Opal/R implementation, DataSHIELD is currently used by the Healthy Obese Project and the Environmental Core Project (BioSHaRE-EU) for the federated analysis of 10 data sets across eight European countries, and this illustrates the opportunities and challenges presented by the DataSHIELD approach. 
DataSHIELD facilitates important research in settings where: (i) a co-analysis of individual-level data from several studies is scientifically necessary but governance restrictions prohibit the release or sharing of some of the required data, and/or render data access unacceptably slow; (ii) a research group (e.g. in a developing nation) is particularly vulnerable to loss of intellectual property: the researchers want to fully share the information held in their data with national and international collaborators, but do not wish to hand over the physical data themselves; and (iii) a data set is to be included in an individual-level co-analysis but the physical size of the data precludes direct transfer to a new site for analysis.
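The AC/DC exchange described in this abstract can be illustrated with a minimal Python sketch. It is not the DataSHIELD implementation (which, per the abstract, uses a modified R environment linked to Opal); the class `DataComputer`, the function `pooled_mean_and_variance`, and the threshold `MIN_CELL_COUNT` are hypothetical names chosen for illustration. The sketch assumes each data computer returns only a count, a sum, and a sum of squares, and that the analysis computer combines those summaries without ever seeing individual-level values.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical disclosure threshold: a site refuses to answer
# if it holds fewer records than this.
MIN_CELL_COUNT = 5


@dataclass
class DataComputer:
    """A site holding individual-level data that never leaves it."""
    name: str
    values: List[float]

    def summary(self) -> Tuple[int, float, float]:
        """Return only non-disclosive summaries: (n, sum, sum of squares)."""
        n = len(self.values)
        if n < MIN_CELL_COUNT:
            raise PermissionError(f"{self.name}: too few records to summarise")
        total = sum(self.values)
        total_sq = sum(v * v for v in self.values)
        return n, total, total_sq


def pooled_mean_and_variance(sites: List[DataComputer]) -> Tuple[float, float]:
    """Analysis-computer side: combine per-site summaries into pooled estimates."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for site in sites:
        n_i, sum_i, sum_sq_i = site.summary()
        n += n_i
        total += sum_i
        total_sq += sum_sq_i
    mean = total / n
    # Sample variance recovered from the pooled sums.
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    sites = [
        DataComputer("study_a", [1.0, 2.0, 3.0, 4.0, 5.0]),
        DataComputer("study_b", [6.0, 7.0, 8.0, 9.0, 10.0]),
    ]
    print(pooled_mean_and_variance(sites))
```

The pooled mean and variance equal what a central analysis of the stacked data would give, yet only three numbers per site cross the firewall. Fitting a regression model in this style would require iterating such exchanges, which this sketch does not attempt.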

    How to assess common somatic symptoms in large-scale studies: A systematic review of questionnaires

    Objective: Many questionnaires for assessment of common somatic symptoms or functional somatic symptoms are available and their use differs greatly among studies. The prevalence and incidence of symptoms are partially determined by the methods used to assess them. As a result, comparison across studies is difficult. This article describes a systematic review of self-report questionnaires for somatic symptoms for use in large-scale studies and recommends two questionnaires for use in such studies. Methods: A literature search was performed in the databases Medline, PsycINFO and EMBASE. Articles that reported the development, evaluation, or review of a self-report somatic symptom measure were included. Instrument evaluation was based on validity and reliability, and their fitness for purpose in large-scale studies, according to the PhenX criteria. Results: The literature search identified 40 questionnaires. The number of items within the questionnaires ranged from 5 to 78 items. In 70% of the questionnaires, headaches were included, followed by nausea/upset stomach (65%), shortness of breath/breathing trouble (58%), dizziness (55%), and (low) back pain/backaches (55%). Data on validity and reliability were reported and used for evaluation. Conclusion: Questionnaires varied regarding usability and burden to participants, and relevance to a variety of populations and regions. Based on our criteria, the Patient Health Questionnaire-15 and the Symptom Checklist-90 somatization scale seem the most fit for purpose for use in large-scale studies. These two questionnaires have well-established psychometric properties, contain relevant symptoms, are relatively short, and are available in multiple languages.

    Long-term exposure to road traffic noise, ambient air pollution, and cardiovascular risk factors in the HUNT and Lifelines cohorts

    Aims: Blood biochemistry may provide information on associations between road traffic noise, air pollution, and cardiovascular disease risk. We evaluated this in two large European cohorts (HUNT3, Lifelines). Methods and results: Road traffic noise exposure was modelled for 2009 using a simplified version of the Common Noise Assessment Methods in Europe (CNOSSOS-EU). Annual ambient air pollution (PM10, NO2) at residence was estimated for 2007 using a Land Use Regression model. The statistical platform DataSHIELD was used to pool data from 144 082 participants aged ≥20 years to enable individual-level analysis. Generalized linear models were fitted to assess cross-sectional associations between pollutants and high-sensitivity C-reactive protein (hsCRP), blood lipids and (Lifelines only) fasting blood glucose, for samples taken during recruitment in 2006-2013. Pooling both cohorts, an inter-quartile range (IQR) higher day-time noise (5.1 dB(A)) was associated with 1.1% (95% confidence interval (CI): 0.02-2.2%) higher hsCRP, 0.7% (95% CI: 0.3-1.1%) higher triglycerides, and 0.5% (95% CI: 0.3-0.7%) higher high-density lipoprotein (HDL); only the association with HDL was robust to adjustment for air pollution. An IQR higher PM10 (2.0 µg/m³) or NO2 (7.4 µg/m³) was associated with higher triglycerides (1.9%, 95% CI: 1.5-2.4% and 2.2%, 95% CI: 1.6-2.7%, respectively), independent of adjustment for noise. Additionally for NO2, a significant association with hsCRP (1.9%, 95% CI: 0.5-3.3%) was seen. In Lifelines, an IQR higher noise (4.2 dB(A)) and PM10 (2.4 µg/m³) were associated with 0.2% (95% CI: 0.1-0.3%) and 0.6% (95% CI: 0.4-0.7%) higher fasting glucose respectively, with both remaining robust to adjustment for air/noise pollution. Conclusion: Long-term exposures to road traffic noise and ambient air pollution were associated with blood biochemistry, providing a possible link between road traffic noise/air pollution and cardio-metabolic disease risk.
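Effect sizes of the form "x% higher per IQR" are commonly derived from a regression coefficient on a log-transformed outcome. The abstract does not state how the outcomes were modelled, so the short Python sketch below rests on that assumption; the function name `percent_change_per_iqr` and the coefficient value are hypothetical and are not taken from the study.

```python
import math


def percent_change_per_iqr(beta_per_unit: float, iqr: float) -> float:
    """Percent change in the outcome for an IQR increase in exposure.

    Assumes a model of the form log(outcome) = a + beta * exposure, so an
    increase of `iqr` units multiplies the outcome by exp(beta * iqr).
    """
    return (math.exp(beta_per_unit * iqr) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical coefficient per dB(A); the 5.1 dB(A) IQR is from the abstract.
    print(percent_change_per_iqr(beta_per_unit=0.001, iqr=5.1))
```

Applying the same transformation to the lower and upper confidence limits of the coefficient gives the interval on the percent scale.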