23 research outputs found

    Characterization of missing values in untargeted MS-based metabolomics data and evaluation of missing data handling strategies.

    BACKGROUND: Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation. METHODS: We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n = 1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci. RESULTS: Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable. CONCLUSION: Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.
    This work was supported by grants from the German Federal Ministry of Education and Research (BMBF), by BMBF Grant No. 01ZX1313C (project e:Athero-MED) and Grant No. 03IS2061B (project Gani_Med). Moreover, the research leading to these results has received funding from the European Union’s Seventh Framework Programme [FP7-Health-F5-2012] under grant agreement No. 305280 (MIMOmics) and from the European Research Council (starting grant “LatentCauses”). KS is supported by Biomedical Research Program funds at Weill Cornell Medical College in Qatar, a program funded by the Qatar Foundation. The KORA Augsburg studies were financed by the Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany and supported by grants from the German Federal Ministry of Education and Research (BMBF). Analyses in the EPIC-Norfolk study were supported by funding from the Medical Research Council (MC_PC_13048 and MC_UU_12015/1).
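
    The abstract above recommends KNN imputation on observations with variable pre-selection. The following is a minimal sketch of that general idea, not the authors' implementation. The function name knn_impute, the correlation threshold min_abs_corr, the use of None for missing values, and the root-mean-square distance are all assumptions made for illustration; the sketch also assumes the variables are already on comparable scales.

```python
import math


def _pearson(a, b):
    """Pearson correlation over the positions where both values are observed."""
    pairs = [(u, v) for u, v in zip(a, b) if u is not None and v is not None]
    if len(pairs) < 3:
        return 0.0
    mu = sum(u for u, _ in pairs) / len(pairs)
    mv = sum(v for _, v in pairs) / len(pairs)
    suv = sum((u - mu) * (v - mv) for u, v in pairs)
    suu = sum((u - mu) ** 2 for u, _ in pairs)
    svv = sum((v - mv) ** 2 for _, v in pairs)
    if suu == 0 or svv == 0:
        return 0.0
    return suv / math.sqrt(suu * svv)


def knn_impute(data, k=2, min_abs_corr=0.2):
    """Fill None entries using the k nearest observations (rows).

    data: list of rows (samples), each a list of floats or None.
    min_abs_corr: hypothetical pre-selection threshold; only variables whose
    absolute correlation with the target variable reaches it are used
    when measuring the distance between samples.
    """
    n_rows, n_cols = len(data), len(data[0])
    cols = [[row[j] for row in data] for j in range(n_cols)]
    imputed = [list(row) for row in data]
    for j in range(n_cols):
        # Variable pre-selection: keep variables correlated with variable j.
        selected = [
            c for c in range(n_cols)
            if c != j and abs(_pearson(cols[j], cols[c])) >= min_abs_corr
        ]
        for i in range(n_rows):
            if data[i][j] is not None:
                continue
            candidates = []
            for m in range(n_rows):
                if m == i or data[m][j] is None:
                    continue
                shared = [
                    c for c in selected
                    if data[i][c] is not None and data[m][c] is not None
                ]
                if not shared:
                    continue
                # Root-mean-square distance over the shared, selected variables.
                dist = math.sqrt(
                    sum((data[i][c] - data[m][c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, data[m][j]))
            if candidates:
                candidates.sort(key=lambda pair: pair[0])
                nearest = candidates[:k]
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed
```

    Entries for which no neighbor shares a selected variable are left as None, so a caller can see which values could not be imputed.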

    Metabolite ratios as potential biomarkers for type 2 diabetes: a DIRECT study

    Aims/hypothesis Circulating metabolites have been shown to reflect metabolic changes during the development of type 2 diabetes. In this study we examined the association of metabolite levels and pairwise metabolite ratios with insulin responses after glucose, glucagon-like peptide-1 (GLP-1) and arginine stimulation. We then investigated if the identified metabolite ratios were associated with measures of OGTT-derived beta cell function and with prevalent and incident type 2 diabetes. Methods We measured the levels of 188 metabolites in plasma samples from 130 healthy members of twin families (from the Netherlands Twin Register) at five time points during a modified 3 h hyperglycaemic clamp with glucose, GLP-1 and arginine stimulation. We validated our results in cohorts with OGTT data (n = 340) and epidemiological case–control studies of prevalent (n = 4925) and incident (n = 4277) diabetes. The data were analysed using regression models with adjustment for potential confounders. Results There were dynamic changes in metabolite levels in response to the different secretagogues. Furthermore, several fasting pairwise metabolite ratios were associated with one or multiple clamp-derived measures of insulin secretion (all p Conclusion/interpretation In this study we have shown that the Val_PC ae C32:2 metabolite ratio is associated with an increased risk of type 2 diabetes and measures of insulin secretion and resistance. The observed effects were stronger than that of the individual metabolites and independent of known risk factors.

    Exome-Derived Adiponectin-Associated Variants Implicate Obesity and Lipid Biology

    Circulating levels of adiponectin, an adipocyte-secreted protein associated with cardiovascular and metabolic risk, are highly heritable. To gain insights into the biology that regulates adiponectin levels, we performed an exome array meta-analysis of 265,780 genetic variants in 67,739 individuals of European, Hispanic, African American, and East Asian ancestry. We identified 20 loci associated with adiponectin, including 11 that had been reported previously (p .60) spanning as much as 900 kb. To identify potential genes and mechanisms through which the previously unreported association signals act to affect adiponectin levels, we assessed cross-trait associations, expression quantitative trait loci in subcutaneous adipose, and biological pathways of nearby genes. Eight of the nine loci were also associated (p < 1 × 10⁻⁴) with at least one obesity or lipid trait. Candidate genes include PRKAR2A, PTH1R, and HDAC9, which have been suggested to play roles in adipocyte differentiation or bone marrow adipose tissue. Taken together, these findings provide further insights into the processes that influence circulating adiponectin levels.

    Genetic Studies of Leptin Concentrations Implicate Leptin in the Regulation of Early Adiposity.

    Leptin influences food intake by informing the brain about the status of body fat stores. Rare LEP mutations associated with congenital leptin deficiency cause severe early-onset obesity that can be mitigated by administering leptin. However, the role of genetic regulation of leptin in polygenic obesity remains poorly understood. We performed an exome-based analysis in up to 57,232 individuals of diverse ancestries to identify genetic variants that influence adiposity-adjusted leptin concentrations. We identify five novel variants, including four missense variants, in LEP, ZNF800, KLHL31, and ACTL9, and one intergenic variant near KLF14. The missense variant Val94Met (rs17151919) in LEP was common in individuals of African ancestry only, and its association with lower leptin concentrations was specific to this ancestry (P = 2 × 10⁻¹⁶, n = 3,901). Using in vitro analyses, we show that the Met94 allele decreases leptin secretion. We also show that the Met94 allele is associated with higher BMI in young African-ancestry children but not in adults, suggesting that leptin regulates early adiposity.

    pulver: an R package for parallel ultra-rapid p-value computation for linear regression interaction terms

    Abstract Background Genome-wide association studies allow us to understand the genetics of complex diseases. Human metabolism provides information about the disease-causing mechanisms, so it is usual to investigate the associations between genetic variants and metabolite levels. However, only considering genetic variants and their effects on one trait ignores the possible interplay between different “omics” layers. Existing tools only consider single-nucleotide polymorphism (SNP)–SNP interactions, and no practical tool is available for large-scale investigations of the interactions between pairs of arbitrary quantitative variables. Results We developed an R package called pulver to compute p-values for the interaction term in a very large number of linear regression models. Comparisons based on simulated data showed that pulver is much faster than the existing tools. This is achieved by using the correlation coefficient to test the null-hypothesis, which avoids the costly computation of inversions. Additional tricks are a rearrangement of the order, when iterating through the different “omics” layers, and implementing this algorithm in the fast programming language C++. Furthermore, we applied our algorithm to data from the German KORA study to investigate a real-world problem involving the interplay among DNA methylation, genetic variants, and metabolite levels. Conclusions The pulver package is a convenient and rapid tool for screening huge numbers of linear regression models for significant interaction terms in arbitrary pairs of quantitative variables. pulver is written in R and C++, and can be downloaded freely from CRAN at https://cran.r-project.org/web/packages/pulver/
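
    The abstract above says the interaction term is tested through a correlation coefficient rather than through matrix inversion. One standard way to do this, sketched below, relies on the fact that the t-statistic for the coefficient of x·z in the model y ~ 1 + x + z + x·z equals r·sqrt(df / (1 − r²)), where r is the partial correlation between y and x·z given (1, x, z) and df = n − 4. This is an illustrative Python sketch of that identity, not pulver's actual code; the function names are hypothetical, and the residuals are obtained by Gram-Schmidt orthogonalization as one assumed way to avoid an explicit inverse.

```python
import math


def _dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def _residualize(v, basis):
    """Remove from v its projection onto each vector of an orthogonal basis."""
    out = list(v)
    for b in basis:
        coef = _dot(out, b) / _dot(b, b)
        out = [o - coef * bi for o, bi in zip(out, b)]
    return out


def interaction_t_stat(y, x, z):
    """Return (r, t) for the x*z term in y ~ 1 + x + z + x*z.

    r is the partial correlation of y and x*z given (1, x, z);
    t is the matching t-statistic with n - 4 degrees of freedom.
    """
    n = len(y)
    # Orthogonalize the nuisance columns 1, x, z (Gram-Schmidt).
    basis = []
    for col in ([1.0] * n, x, z):
        basis.append(_residualize(col, basis))
    xz = [a * b for a, b in zip(x, z)]
    ry = _residualize(y, basis)
    rxz = _residualize(xz, basis)
    r = _dot(ry, rxz) / math.sqrt(_dot(ry, ry) * _dot(rxz, rxz))
    df = n - 4
    t = r * math.sqrt(df / (1.0 - r * r))
    return r, t
```

    A p-value would follow from the t distribution with n − 4 degrees of freedom; that step is omitted here because the Python standard library has no t-distribution CDF. When many outcome vectors are screened against the same x and z, the orthogonalized basis and the residualized product can be computed once and reused, so each additional test costs little more than a few dot products.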