7 research outputs found

    DataSHIELD: taking the analysis to the data, not the data to the analysis

    BACKGROUND: Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But the pooling of information from individuals in a central database that may be queried by researchers raises important ethico-legal questions and can be controversial. In the UK this has been highlighted by recent debate and controversy relating to the UK's proposed 'care.data' initiative, and these issues reflect important societal and professional concerns about privacy, confidentiality and intellectual property. DataSHIELD provides a novel technological solution that can circumvent some of the most basic challenges in facilitating the access of researchers and other healthcare professionals to individual-level data. METHODS: Commands are sent from a central analysis computer (AC) to several data computers (DCs) storing the data to be co-analysed. The data sets are analysed simultaneously but in parallel. The separate parallelized analyses are linked by non-disclosive summary statistics and commands transmitted back and forth between the DCs and the AC. This paper describes the technical implementation of DataSHIELD using a modified R statistical environment linked to an Opal database deployed behind the computer firewall of each DC. Analysis is controlled through a standard R environment at the AC. RESULTS: Based on this Opal/R implementation, DataSHIELD is currently used by the Healthy Obese Project and the Environmental Core Project (BioSHaRE-EU) for the federated analysis of 10 data sets across eight European countries, and this illustrates the opportunities and challenges presented by the DataSHIELD approach. 
    CONCLUSIONS: DataSHIELD facilitates important research in settings where: (i) a co-analysis of individual-level data from several studies is scientifically necessary but governance restrictions prohibit the release or sharing of some of the required data, and/or render data access unacceptably slow; (ii) a research group (e.g. in a developing nation) is particularly vulnerable to loss of intellectual property: the researchers want to fully share the information held in their data with national and international collaborators, but do not wish to hand over the physical data themselves; and (iii) a data set is to be included in an individual-level co-analysis but the physical size of the data precludes direct transfer to a new site for analysis.
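    The AC/DC exchange described in METHODS can be illustrated with a minimal sketch. This is not DataSHIELD's actual R/Opal implementation: the class `DataComputer`, the function `pooled_mean_variance`, and the threshold `MIN_CELL_COUNT` are hypothetical names chosen for illustration. It only shows the general idea that each data computer releases aggregate statistics and the analysis computer combines them without seeing individual-level records.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical disclosure threshold; not a value taken from DataSHIELD.
MIN_CELL_COUNT = 5


@dataclass
class DataComputer:
    """Stands in for one DC; individual-level values stay inside this object."""
    _values: List[float]

    def summary(self) -> Tuple[int, float, float]:
        """Release only aggregates: count, sum, and sum of squares."""
        n = len(self._values)
        if n < MIN_CELL_COUNT:
            # Refuse to answer when a summary could identify individuals.
            raise PermissionError("too few records to release a summary")
        total = sum(self._values)
        total_sq = sum(v * v for v in self._values)
        return n, total, total_sq


def pooled_mean_variance(dcs: List[DataComputer]) -> Tuple[float, float]:
    """Runs on the AC: combine per-DC summaries into a pooled mean and sample variance."""
    summaries = [dc.summary() for dc in dcs]
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance
```

    The pooled result matches what a central analysis of the stacked data would give, which is the sense in which the analysis is taken to the data. Iterative model fitting would need repeated rounds of such exchanges, which this sketch does not attempt.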

    Cystatin C and Cardiovascular Disease: A Mendelian Randomization Study.

    BACKGROUND: Epidemiological studies show that high circulating cystatin C is associated with risk of cardiovascular disease (CVD), independent of creatinine-based renal function measurements. It is unclear whether this relationship is causal, arises from residual confounding, and/or is a consequence of reverse causation. OBJECTIVES: The aim of this study was to use Mendelian randomization to investigate whether cystatin C is causally related to CVD in the general population. METHODS: We incorporated participant data from 16 prospective cohorts (n = 76,481) with 37,126 measures of cystatin C and added genetic data from 43 studies (n = 252,216) with 63,292 CVD events. We used the common variant rs911119 in CST3 as an instrumental variable to investigate the causal role of cystatin C in CVD, including coronary heart disease, ischemic stroke, and heart failure. RESULTS: Cystatin C concentrations were associated with CVD risk after adjusting for age, sex, and traditional risk factors (relative risk: 1.82 per doubling of cystatin C; 95% confidence interval [CI]: 1.56 to 2.13; p = 2.12 × 10^-14). The minor allele of rs911119 was associated with decreased serum cystatin C (6.13% per allele; 95% CI: 5.75 to 6.50; p = 5.95 × 10^-211), explaining 2.8% of the observed variation in cystatin C. Mendelian randomization analysis did not provide evidence for a causal role of cystatin C, with a causal relative risk for CVD of 1.00 per doubling cystatin C (95% CI: 0.82 to 1.22; p = 0.994), which was statistically different from the observational estimate (p = 1.6 × 10^-5). A causal effect of cystatin C was not detected for any individual component of CVD. CONCLUSIONS: Mendelian randomization analyses did not support a causal role of cystatin C in the etiology of CVD. As such, therapeutics targeted at lowering circulating cystatin C are unlikely to be effective in preventing CVD.
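    The abstract does not say which estimator was used. For a single genetic instrument, one common choice is the Wald ratio: the variant's effect on the outcome divided by its effect on the exposure. The sketch below assumes that estimator, with a first-order standard error that ignores uncertainty in the variant-exposure effect. The function name `wald_ratio` and its parameters are hypothetical, and no values from the study are reproduced.

```python
from typing import Tuple


def wald_ratio(beta_zx: float, beta_zy: float, se_zy: float,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Single-instrument causal estimate with an approximate confidence interval.

    beta_zx: per-allele effect of the variant on the exposure
    beta_zy: per-allele effect of the variant on the outcome (e.g. log relative risk)
    se_zy:   standard error of beta_zy
    z:       normal quantile for the interval (1.96 gives roughly 95%)
    """
    if beta_zx == 0:
        raise ValueError("the instrument must be associated with the exposure")
    estimate = beta_zy / beta_zx
    # First-order approximation: treats beta_zx as if it were known exactly.
    se = se_zy / abs(beta_zx)
    return estimate, estimate - z * se, estimate + z * se
```

    An interval that comfortably spans the null, as reported in the abstract, is what a non-causal biomarker would be expected to produce under the usual instrumental-variable assumptions.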

    52 Genetic Loci Influencing Myocardial Mass.

    BACKGROUND: Myocardial mass is a key determinant of cardiac muscle function and hypertrophy. Myocardial depolarization leading to cardiac muscle contraction is reflected by the amplitude and duration of the QRS complex on the electrocardiogram (ECG). Abnormal QRS amplitude or duration reflect changes in myocardial mass and conduction, and are associated with increased risk of heart failure and death. OBJECTIVES: This meta-analysis sought to gain insights into the genetic determinants of myocardial mass. METHODS: We carried out a genome-wide association meta-analysis of 4 QRS traits in up to 73,518 individuals of European ancestry, followed by extensive biological and functional assessment. RESULTS: We identified 52 genomic loci, of which 32 are novel, that are reliably associated with 1 or more QRS phenotypes at p < 1 × 10^-8. These loci are enriched in regions of open chromatin, histone modifications, and transcription factor binding, suggesting that they represent regions of the genome that are actively transcribed in the human heart. Pathway analyses provided evidence that these loci play a role in cardiac hypertrophy. We further highlighted 67 candidate genes at the identified loci that are preferentially expressed in cardiac tissue and associated with cardiac abnormalities in Drosophila melanogaster and Mus musculus. We validated the regulatory function of a novel variant in the SCN5A/SCN10A locus in vitro and in vivo. CONCLUSIONS: Taken together, our findings provide new insights into genes and biological pathways controlling myocardial mass and may help identify novel therapeutic targets.

    Systematic Evaluation of Pleiotropy Identifies 6 Further Loci Associated With Coronary Artery Disease

    BACKGROUND: Genome-wide association studies have so far identified 56 loci associated with risk of coronary artery disease (CAD). Many CAD loci show pleiotropy; that is, they are also associated with other diseases or traits. OBJECTIVES: This study sought to systematically test if genetic variants identified for non-CAD diseases/traits also associate with CAD and to undertake a comprehensive analysis of the extent of pleiotropy of all CAD loci. METHODS: In discovery analyses involving 42,335 CAD cases and 78,240 control subjects we tested the association of 29,383 common (minor allele frequency >5%) single nucleotide polymorphisms available on the exome array, which included a substantial proportion of known or suspected single nucleotide polymorphisms associated with common diseases or traits as of 2011. Suggestive association signals were replicated in an additional 30,533 cases and 42,530 control subjects. To evaluate pleiotropy, we tested CAD loci for association with cardiovascular risk factors (lipid traits, blood pressure phenotypes, body mass index, diabetes, and smoking behavior), as well as with other diseases/traits through interrogation of currently available genome-wide association study catalogs. RESULTS: We identified 6 new loci associated with CAD at genome-wide significance: on 2q37 (KCNJ13-GIGYF2), 6p21 (C2), 11p15 (MRVI1-CTR9), 12q13 (LRP1), 12q24 (SCARB1), and 16q13 (CETP). Risk allele frequencies ranged from 0.15 to 0.86, and odds ratio per copy of the risk allele ranged from 1.04 to 1.09. Of 62 new and known CAD loci, 24 (38.7%) showed statistical association with a traditional cardiovascular risk factor, with some showing multiple associations, and 29 (47%) showed associations at p < 1 × 10^-4 with a range of other diseases/traits. CONCLUSIONS: We identified 6 loci associated with CAD at genome-wide significance. Several CAD loci show substantial pleiotropy, which may help us understand the mechanisms by which these loci affect CAD risk.

    Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function

    Reduced glomerular filtration rate defines chronic kidney disease and is associated with cardiovascular and all-cause mortality. We conducted a meta-analysis of genome-wide association studies for estimated glomerular filtration rate (eGFR), combining data across 133,413 individuals with replication in up to 42,166 individuals. We identify 24 new and confirm 29 previously identified loci. Of these 53 loci, 19 associate with eGFR among individuals with diabetes. Using bioinformatics, we show that identified genes at eGFR loci are enriched for expression in kidney tissues and in pathways relevant for kidney development and transmembrane transporter activity, kidney structure, and regulation of glucose metabolism. Chromatin state mapping and DNase I hypersensitivity analyses across adult tissues demonstrate preferential mapping of associated variants to regulatory regions in kidney but not extra-renal tissues. These findings suggest that genetic determinants of eGFR are mediated largely through direct effects within the kidney and highlight important cell types and biological pathways

    Exome-wide association study of plasma lipids in >300,000 individuals.

    We screened variants on an exome-focused genotyping array in >300,000 participants (replication in >280,000 participants) and identified 444 independent variants in 250 loci significantly associated with total cholesterol (TC), high-density-lipoprotein cholesterol (HDL-C), low-density-lipoprotein cholesterol (LDL-C), and/or triglycerides (TG). At two loci (JAK2 and A1CF), experimental analysis in mice showed lipid changes consistent with the human data. We also found that: (i) beta-thalassemia trait carriers displayed lower TC and were protected from coronary artery disease (CAD); (ii) excluding the CETP locus, there was not a predictable relationship between plasma HDL-C and risk for age-related macular degeneration; (iii) only some mechanisms of lowering LDL-C appeared to increase risk for type 2 diabetes (T2D); and (iv) TG-lowering alleles involved in hepatic production of TG-rich lipoproteins (TM6SF2 and PNPLA3) tracked with higher liver fat, higher risk for T2D, and lower risk for CAD, whereas TG-lowering alleles involved in peripheral lipolysis (LPL and ANGPTL4) had no effect on liver fat but decreased risks for both T2D and CAD

    Correction: The Influence of Age and Sex on Genetic Associations with Adult Body Size and Shape: A Large-Scale Genome-Wide Interaction Study.

    The arcOGEN Consortium should be listed as an author of this article. They contributed to the genome-wide association study results presented in this work. They should be listed in the author byline at position 292 and affiliated with The Arthritis Research UK Osteoarthritis Genetics Consortium. They should also be included in the footnote designating consortia which is underneath the author affiliation list in the PDF version of the article, and in the S2 Text. Please view the correct S2 Text below, containing correct consortia members. S2 Text. Consortia members and extended acknowledgments. https://doi.org/10.1371/journal.pgen.1006166.s001 (DOCX) [This corrects the article DOI: 10.1371/journal.pgen.1005378.]