40 research outputs found

    Estimation of Genetic Relationships Between Individuals Across Cohorts and Platforms: Application to Childhood Height

    Combining genotype data across cohorts increases power to estimate the heritability due to common single nucleotide polymorphisms (SNPs), based on analyzing a Genetic Relationship Matrix (GRM). However, the combination of SNP data across multiple cohorts may lead to stratification when, for example, different genotyping platforms are used. In the current study, we address issues of combining SNP data from different cohorts, the Netherlands Twin Register (NTR) and the Generation R (GENR) study. Both cohorts include children of Northern European Dutch background (N = 3102 + 2826, respectively) who were genotyped on different platforms. We explore imputation and phasing as a tool and compare three GRM-building strategies, when data from two cohorts are (1) just comb

    Bayesian evidence synthesis in case of multi-cohort datasets: An illustration by multi-informant differences in self-control

    The trend toward large-scale collaborative studies gives rise to the challenge of combining data from different sources efficiently. Here, we demonstrate how Bayesian evidence synthesis can be used to quantify and compare support for competing hypotheses and to aggregate this support over studies. We applied this method to study the ordering of multi-informant scores on the ASEBA Self Control Scale (ASCS), employing a multi-cohort design with data from four Dutch cohorts. Self-control reports were collected from mothers, fathers, teachers and children themselves. The available set of reporters differed between cohorts, so in each cohort varying components of the overarching hypotheses were evaluated. We found consistent support for the partial hypothesis that parents reported more self-control problems than teachers. Furthermore, the aggregated results indicate most support for the combined hypothesis that children report most problem behaviors, followed by their mothers and fathers, and that teachers report the fewest problems. However, there was considerable inconsistency across cohorts regarding the rank order of children's reports. This article illustrates Bayesian evidence synthesis as a method when some of the cohorts only have data to ev

    Harmonising and linking biomedical and clinical data across disparate data archives to enable integrative cross-biobank research

    A wealth of biospecimen samples are stored in modern globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is to be able to search and access phenotypic, clinical and other information about samples that are currently stored at biobanks in an integrated manner. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate global searching of available samples for research. The use cases include the ENGAGE (European Network for Genetic and Genomic Epidemiology) consortium comprising at least 39 cohorts, the SUMMIT (surrogate markers for micro- and macro-vascular hard endpoints for innovative diabetes tools) consortium and a pilot for data integration between a Swedish clinical health registry and a biobank. We used the Sample avAILability (SAIL) method for data linking: first, we created harmonised variables, and then we annotated and made searchable the information on the number of specimens available in individual biobanks for various phenotypic categories. By operating on this categorised availability data we sidestep many obstacles related to privacy that arise when handling real values. We show that harmonised and annotated records about data availability across disparate biomedical archives provide a key methodological advance in pre-analysis exchange of information between biobanks, that is, during the project planning phase

    Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution

    We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking

    Large-scale plasma metabolome analysis reveals alterations in HDL metabolism in migraine

    Objective: To identify a plasma metabolomic biomarker signature for migraine. Methods: Plasma samples from 8 Dutch cohorts (n = 10,153: 2,800 migraine patients and 7,353 controls) were profiled on a ¹H-NMR-based metabolomics platform, to quantify 146 individual metabolites (e.g., lipids, fatty acids, and lipoproteins) and 79 metabolite ratios. Metabolite measures associated with migraine were obtained after single-metabolite logistic regression combined with a random-effects meta-analysis performed in a nonstratified and sex-stratified manner. Next, a global test analysis was performed to identify sets of related metabolites associated with migraine. The Holm procedure was applied to control the family-wise error rate at 5% in single-metabolite and global test analyses. Results: Decreases in the level of apolipoprotein A1 (β −0.10; 95% confidence interval [CI] −0.16, −0.05; adjusted p = 0.029) and free cholesterol to total lipid ratio present in small high-density lipoprotein subspecies (HDL) (β −0.10; 95% CI −0.15, −0.05; adjusted p = 0.029) were associated with migraine status. In addition, only in male participants, a decreased level of omega-3 fatty acids (β −0.24; 95% CI −0.36, −0.12; adjusted p = 0.033) was associated with migraine. Global test analysis further supported that HDL traits (but not other lipoproteins) were associated with migr
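The abstract above names the Holm procedure for family-wise error control at 5%. As a rough illustration only, the generic step-down adjustment can be sketched as below. This is a textbook version, not the authors' analysis code; the function names `holm_adjust` and `holm_reject` and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def holm_reject(p_values, alpha=0.05):
    """Flag hypotheses rejected at family-wise error rate alpha (assumed 5% here)."""
    return [p <= alpha for p in holm_adjust(p_values)]


if __name__ == "__main__":
    # Illustrative raw p-values only; these are not values from the study.
    raw = [0.01, 0.04, 0.03]
    print(holm_adjust(raw))
    print(holm_reject(raw))
```

A hypothesis is rejected when its adjusted p-value is at or below the chosen level, which is how an "adjusted p" such as those reported in the abstract would be read.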

    The Molecular Genetic Architecture of Self-Employment

    Economic variables such as income, education, and occupation are known to affect mortality and morbidity, such as cardiovascular disease, and have also been shown to be partly heritable. However, very little is known about which genes influence economic variables, although these genes may have both a direct and an indirect effect on health. We report results from the first large-scale collaboration that studies the molecular genetic architecture of an economic variable (entrepreneurship) that was operationalized using self-employment, a widely-available proxy. Our results suggest that common SNPs when considered jointly explain about half of the narrow-sense heritability of self-employment estimated in twin data (σ_g²/σ_P² = 25%, h² = 55%). However, a meta-analysis of genome-wide association studies across sixteen studies comprising 50,627 participants did not identify genome-wide significant SNPs. 58 SNPs with p < 10⁻⁵ were tested in a replication sample (n = 3,271), but none replicated. Furthermore, a gene-based test shows that none of the genes that were previously suggested in the literature to influence entrepreneurship reveal significant associations. Finally, SNP-based genetic scores that use results from the meta-analysis capture less than 0.2% of the variance in self-employment in an independent sample (p ≥ 0.039). Our results are consistent with a highly polygenic molecular genetic architecture of self-employment, with many genetic variants of small effect. Although self-employment is a multi-faceted, heavily environmentally influenced, and biologically distal trait, our results are similar to those for other genetically complex and biologically more proximate outcomes, such as height, intelligence, personality, and several diseases
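The "about half" in the abstract above can be checked from the two figures it reports, assuming that "half" refers to the ratio of the SNP-based variance share to the twin-based narrow-sense heritability:

```latex
\frac{\sigma_g^2 / \sigma_P^2}{h^2} = \frac{0.25}{0.55} \approx 0.45
```

Under that reading, common SNPs jointly account for roughly 45% of the twin-based heritability estimate.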

    Genome-wide meta-analysis associates HLA-DQA1/DRB1 and LPA and lifestyle factors with human longevity

    Genomic analysis of longevity offers the potential to illuminate the biology of human aging. Here, using genome-wide association meta-analysis of 606,059 parents' survival, we discover two regions associated with longevity (HLA-DQA1/DRB1 and LPA). We also validate previous suggestions that APOE, CHRNA3/5, CDKN2A/B, SH2B3 and FOXO3A influence longevity. Next we show that giving up smoking, educational attainment, openness to new experience and high-density lipoprotein (HDL) cholesterol levels are most positively genetically correlated with lifespan while susceptibility to coronary artery disease (CAD), cigarettes smoked per day, lung cancer, insulin resistance and body fat are most negatively correlated. We suggest that the effect of education on lifespan is principally mediated through smoking while the effect of obesity appears to act via CAD. Using instrumental variables, we suggest that an increase of one body mass index unit reduces lifespan by 7 months while 1 year of education adds 11 months to expected lifespan

    Heritability estimates for 361 blood metabolites across 40 genome-wide association studies

    Metabolomics examines the small molecules involved in cellular metabolism. Approximately 50% of total phenotypic differences in metabolite levels is due to genetic variance, but heritability estimates differ across metabolite classes. We perform a review of all genome-wide association and (exome-) sequencing studies published between November 2008 and October 2018, and identify >800 class-specific metabolite loci associated with metabolite levels. In a twin-family cohort (N = 5117), these metabolite loci are leveraged to simultaneously estimate total heritability (h²_total), and the proportion of heritability captured by known metabolite loci (h²_Metabolite-hits) for 309 lipids and

    Meta-analysis of 49 549 individuals imputed with the 1000 Genomes Project reveals an exonic damaging variant in ANGPTL4 determining fasting TG levels

    Background: So far, more than 170 loci have been associated with circulating lipid levels through genome-wide association studies (GWAS). These associations are largely driven by common variants, their function is often not known, and many are likely to be markers for the causal variants. In this study we aimed to identify more new rare and low-frequency functional variants associated with circulating lipid levels. Methods: We used the 1000 Genomes Project as a reference panel for the imputations of GWAS data from ~60 000 individuals in the discovery stage and ~90 000 samples in the replication stage. Results: Our study resu

    Discovery and Fine-Mapping of Glycaemic and Obesity-Related Trait Loci Using High-Density Imputation

    Reference panels from the 1000 Genomes (1000G) Project Consortium provide near complete coverage of common and low-frequency genetic variation with minor allele frequency ≥0.5% across European ancestry populations. Within the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium, we have undertaken the fi