330 research outputs found

    Neural networks in interpretation of electronic core-level spectra

    We explore the applicability of artificial intelligence for molecular structure - core-level spectrum interpretation. We focus on the electronic Hamiltonian using the H2O molecule in the classical-nuclei approximation as our test system. For a systematic view we studied both predicting structures from spectra and, vice versa, spectra from structures, using polynomial approaches and neural networks. We find predicting spectra easier than predicting structures, where a tighter grid of the spectrum improves prediction. However, the accuracy of the structure prediction worsens when moving outwards from the center of mass of the training set in the structural parameter space.

    Assessing multivariate gene-metabolome associations with rare variants using Bayesian reduced rank regression

    Motivation: A typical genome-wide association study searches for associations between single nucleotide polymorphisms (SNPs) and a univariate phenotype. However, there is growing interest in investigating associations between genomics data and multivariate phenotypes, for example, in gene expression or metabolomics studies. A common approach is to perform a univariate test between each genotype–phenotype pair, and then to apply a stringent significance cutoff to account for the large number of tests performed. However, this approach has limited ability to uncover dependencies involving multiple variables. Another trend in current genetics is the investigation of the impact of rare variants on the phenotype, where the standard methods often fail owing to lack of power when the minor allele is present in only a limited number of individuals. Results: We propose a new statistical approach based on Bayesian reduced rank regression to assess the impact of multiple SNPs on a high-dimensional phenotype. Because of the method’s ability to combine information over multiple SNPs and phenotypes, it is particularly suitable for detecting associations involving rare variants. We demonstrate the potential of our method and compare it with alternatives using the Northern Finland Birth Cohort with 4702 individuals, for whom genome-wide SNP data along with lipoprotein profiles comprising 74 traits are available. We discovered two genes (XRCC4 and MTHFD2L) without previously reported associations, which replicated in a combined analysis of two additional cohorts: 2390 individuals from the Cardiovascular Risk in Young Finns study and 3659 individuals from the FINRISK study. Availability and implementation: R-code freely available for download at http://users.ics.aalto.fi/pemartti/gene_metabolome/. Contact: [email protected]; [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
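
One way to see why a reduced-rank model can pool information across SNPs and traits is to count regression coefficients with and without a low-rank factorization. The sketch below is only that counting argument, not the authors' Bayesian method; the SNP count (50) and the rank (2) are hypothetical, and only the 74 traits come from the abstract.

```python
def full_rank_params(n_snps, n_traits):
    """Coefficients in an unconstrained n_snps x n_traits regression matrix."""
    return n_snps * n_traits


def reduced_rank_params(n_snps, n_traits, rank):
    """Coefficients when the matrix is factored as (n_snps x rank) @ (rank x n_traits)."""
    return rank * (n_snps + n_traits)


N_TRAITS = 74   # lipoprotein traits, from the abstract
N_SNPS = 50     # hypothetical number of SNPs in one gene
RANK = 2        # hypothetical rank of the coefficient matrix

full = full_rank_params(N_SNPS, N_TRAITS)
reduced = reduced_rank_params(N_SNPS, N_TRAITS, RANK)
print(f"unconstrained coefficients: {full}")
print(f"rank-{RANK} coefficients: {reduced}")
```

With far fewer free coefficients, each one is informed by many SNP-trait pairs at once, which is the intuition behind combining information over multiple SNPs and phenotypes.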

    Characterization of the metabolic profile associated with serum 25-hydroxyvitamin D : a cross-sectional analysis in population-based data

    Background: Numerous observational studies have reported associations between vitamin D deficiency and cardiometabolic diseases, but these findings might be confounded by obesity. A characterization of the metabolic profile associated with serum 25-hydroxyvitamin D [25(OH)D] levels, in general and stratified by abdominal obesity, may help to untangle the relationship between vitamin D, obesity and cardiometabolic health. Methods: Serum metabolomics measurements were obtained from a nuclear magnetic resonance spectroscopy (NMR)- and a mass spectrometry (MS)-based platform. The discovery was conducted in 1726 participants of the population-based KORA-F4 study, in which the associations of the concentrations of 415 metabolites with 25(OH)D levels were assessed in linear models. The results were replicated in 6759 (NMR) and 609 (MS) participants, respectively, of the population-based FINRISK 1997 study. Results: Mean [standard deviation (SD)] 25(OH)D levels were 15.2 (7.5) ng/ml in KORA F4 and 13.8 (5.9) ng/ml in FINRISK 1997; 37 metabolites were associated with 25(OH)D in KORA F4 at P < 0.05/415. Of these, 30 associations were replicated in FINRISK 1997 at P < 0.05/37. Among these were constituents of (very) large very-low-density lipoprotein and small low-density lipoprotein subclasses and related measures like serum triglycerides as well as fatty acids and measures reflecting the degree of fatty acid saturation. The observed associations were independent of waist circumference and generally similar in abdominally obese and non-obese participants. Conclusions: Independently of abdominal obesity, higher 25(OH)D levels were associated with a metabolite profile characterized by lower concentrations of atherogenic lipids and a higher degree of fatty acid polyunsaturation. These results indicate that the relationship between vitamin D deficiency and cardiometabolic diseases is unlikely to merely reflect obesity-related pathomechanisms.
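
The two significance thresholds quoted in this abstract (0.05 divided by 415 in discovery and by 37 in replication) read as Bonferroni-style corrections. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold when alpha is split evenly over n_tests."""
    return alpha / n_tests


# Numbers of tests as stated in the abstract.
discovery_threshold = bonferroni_threshold(0.05, 415)    # KORA F4 discovery
replication_threshold = bonferroni_threshold(0.05, 37)   # FINRISK 1997 replication

print(f"discovery:   P < {discovery_threshold:.2e}")
print(f"replication: P < {replication_threshold:.2e}")
```

The replication threshold is less stringent because only the 37 metabolites that passed discovery are tested again.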

    Effects of hormonal contraception on systemic metabolism : cross-sectional and longitudinal evidence

    Background: Hormonal contraception is commonly used worldwide, but its systemic effects across lipoprotein subclasses, fatty acids, circulating metabolites and cytokines remain poorly understood. Methods: A comprehensive molecular profile (75 metabolic measures and 37 cytokines) was measured for up to 5841 women (age range 24-49 years) from three population-based cohorts. Women using combined oral contraceptive pills (COCPs) or progestin-only contraceptives (POCs) were compared with those who did not use hormonal contraception. Metabolomics profiles were reassessed for 869 women after 6 years to uncover the metabolic effects of starting, stopping and persistently using hormonal contraception. Results: The comprehensive molecular profiling allowed multiple new findings on the metabolic associations with the use of COCPs. They were positively associated with lipoprotein subclasses, including all high-density lipoprotein (HDL) subclasses. The associations with fatty acids and amino acids were strong and variable in direction. COCP use was negatively associated with albumin and positively associated with creatinine and inflammatory markers, including glycoprotein acetyls and several growth factors and interleukins. Our findings also confirmed previous results e.g. for increased circulating triglycerides and HDL cholesterol. Starting COCPs caused similar metabolic changes to those observed cross-sectionally: the changes were maintained in consistent users and normalized in those who stopped using. In contrast, POCs were only weakly associated with metabolic and inflammatory markers. Results were consistent across all cohorts and for different COCP preparations and different types of POC delivery. Conclusions: Use of COCPs causes widespread metabolic and inflammatory effects. However, persistent use does not appear to accumulate the effects over time and the metabolic perturbations are reversed upon discontinuation. POCs have little effect on systemic metabolism and inflammation.

    Machine learning in interpretation of electronic core-level spectra

    Electronic transitions involving core-level orbitals offer a localized, atomic-site and element specific peek window into statistical systems such as molecular liquids. Although formally understood, the complex relation between structure and spectrum -- and the effect of statistical averaging of highly differing spectra of individual structures -- renders the analysis of an ensemble-averaged core-level spectrum complicated. We explore the applicability of machine learning for molecular structure -- core-level spectrum interpretation. We focus on the electronic Hamiltonian using the H2O molecule in the classical-nuclei approximation as our test system. For a systematic view we studied both predicting structures from spectra and, vice versa, spectra from structures, using polynomial approaches and neural networks. We find predicting spectra easier than predicting structures, where a tighter grid (even unphysical) of the spectrum improves prediction, possibly inviting over-interpretation of the model. The accuracy of the structure prediction worsens when moving outwards from the center of mass of the training set in the structural parameter space, which cannot be overcome by model selection based on generalizability.
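
The loss of accuracy away from the center of the training set can be illustrated with a toy problem. The sketch below is an assumption-laden stand-in, not the paper's model or data: it fits a straight line (the simplest polynomial approach) to sin(x) on a narrow, hypothetical training grid and compares the error at a point inside the grid with one far outside it.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training set: 11 points centered on x = 0.
train_x = [i / 10 for i in range(-5, 6)]
train_y = [math.sin(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)


def abs_error(x):
    """Absolute prediction error of the fitted line at x."""
    return abs(slope * x + intercept - math.sin(x))


error_near = abs_error(0.25)  # inside the training range
error_far = abs_error(2.0)    # well outside the training range
print(f"error near training center: {error_near:.4f}")
print(f"error far from training center: {error_far:.4f}")
```

The fit is accurate where training data exist and degrades quickly outside that region, which is the qualitative behavior the abstract reports for structure prediction.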

    Coronary artery disease, genetic risk and the metabolome in young individuals [version 1; peer review: 2 approved]

    Background: Genome-wide association studies have identified genetic variants associated with coronary artery disease (CAD) in adults, the leading cause of death worldwide. It often occurs later in life, but variants may impact CAD-relevant phenotypes early and throughout the life-course. Cohorts with longitudinal and genetic data on thousands of individuals are letting us explore the antecedents of this adult disease. Methods: 149 metabolites, with a focus on the lipidome, measured using nuclear magnetic resonance (1H-NMR) spectroscopy, and genotype data were available from 5,905 individuals at ages 7, 15, and 17 years from the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Linear regression was used to assess the association between the metabolites and an adult-derived genetic risk score (GRS) of CAD comprising 146 variants. Individual variant-metabolite associations were also examined. Results: The CAD-GRS was associated with 118 of 149 metabolites (false discovery rate [FDR] < 0.05), the strongest associations being with low-density lipoprotein (LDL) and atherogenic non-LDL subgroups. Nine of 146 variants in the GRS were associated with one or more metabolites (FDR < 0.05). Seven of these are within lipid loci: rs11591147 PCSK9, rs12149545 HERPUD1-CETP, rs17091891 LPL, rs515135 APOB, rs602633 CELSR2-PSRC1, rs651821 APOA5, rs7412 APOE-APOC1. All were associated with metabolites in the LDL or atherogenic non-LDL subgroups or both, including aggregate cholesterol measures. The other two variants identified were rs112635299 SERPINA1 and rs2519093 ABO. Conclusions: Genetic variants that influence CAD risk in adults are associated with large perturbations in metabolite levels in individuals as young as seven. The variants identified are mostly within lipid-related loci and the metabolites they are associated with are primarily linked to lipoproteins. This knowledge could allow for preventative measures, such as increased monitoring of at-risk individuals and perhaps treatment earlier in life, to be taken years before any symptoms of the disease arise.
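
The abstract reports results at FDR < 0.05 without naming the procedure. A common choice is the Benjamini-Hochberg step-up procedure, sketched below as an assumption about how such a cutoff could be applied; the p-values are invented for illustration and are not study results.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected at FDR alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * alpha / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_rank = rank
    # Reject every hypothesis whose rank is at most max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for ten metabolite associations.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p, alpha=0.05)
print(sum(example_rejected), "of", len(example_p), "associations pass FDR < 0.05")
```

Unlike a Bonferroni cutoff, the threshold here grows with the rank of the p-value, so the procedure retains more power when many associations are real.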

    Genome-wide association and HLA fine-mapping studies identify risk loci and genetic pathways underlying allergic rhinitis

    Allergic rhinitis is the most common clinical presentation of allergy, affecting 400 million people worldwide, with increasing incidence in westernized countries [1,2]. To elucidate the genetic architecture and understand the underlying disease mechanisms, we carried out a meta-analysis of allergic rhinitis in 59,762 cases and 152,358 controls of European ancestry and identified a total of 41 risk loci for allergic rhinitis, including 20 loci not previously associated with allergic rhinitis, which were confirmed in a replication phase of 60,720 cases and 618,527 controls. Functional annotation implicated genes involved in various immune pathways, and fine mapping of the HLA region suggested amino acid variants important for antigen binding. We further performed genome-wide association study (GWAS) analyses of allergic sensitization against inhalant allergens and nonallergic rhinitis, which suggested shared genetic mechanisms across rhinitis-related traits. Future studies of the identified loci and genes might identify novel targets for treatment and prevention of allergic rhinitis.