Search CORE

3 research outputs found

Automated pathway and reaction prediction facilitates in silico identification of unknown metabolites in human cohort studies

Author: Agakov
Ahluwalia
Allen
Anne M. Evans
Baldassarre
Beger
Borodulin
Boudonck
Breitling
Carlsson
Chen
Creek
Evans
Felix Agakov
Frainay
Fuhrer
Gabi Kastenmüller
Grapov
Helen C. Looker
Helen M. Colhoun
Höllering
Jan D. Quell
Jan Krumsiek
Kanehisa
Kim
Krumsiek
Krumsiek
Lacroix
Leif C. Groop
Marco Colombo
Orchard
Paul McKeigue
Pearson
R Core Team
Robert Mohney
Romero
Ruttkies
Schaefer
Shin
Sumner
Thiele
Ulf de Faire
van Buuren
van der Hooft
van Helden
Veikko Salomaa
Werner
Werner Römisch-Margl
Yin
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017
Field of study

Identification of metabolites in non-targeted metabolomics continues to be a bottleneck in metabolomics studies in large human cohorts. Unidentified metabolites frequently emerge in the results of association studies linking metabolite levels to, for example, clinical phenotypes. For further analyses these unknown metabolites must be identified. Current approaches utilize chemical information, such as spectral details and fragmentation characteristics to determine components of unknown metabolites. Here, we propose a systems biology model exploiting the internal correlation structure of metabolite levels in combination with existing biochemical and genetic information to characterize properties of unknown molecules. Levels of 758 metabolites (439 known, 319 unknown) in human blood samples of 2279 subjects were measured using a non-targeted metabolomics platform (LC-MS and GC-MS). We reconstructed the structure of biochemical pathways that are imprinted in these metabolomics data by building an empirical network model based on 1040 significant partial correlations between metabolites. We further added associations of these metabolites to 134 genes from genome-wide association studies as well as reactions and functional relations to genes from the public database Recon 2 to the network model. From the local neighborhood in the network, we were able to predict the pathway annotation of 180 unknown metabolites. Furthermore, we classified 100 pairs of known and unknown and 45 pairs of unknown metabolites to 21 types of reactions based on their mass differences. As a proof of concept, we then looked further into the special case of predicted dehydrogenation reactions leading us to the selection of 39 candidate molecules for 5 unknown metabolites. Finally, we could verify 2 of those candidates by applying LC-MS analyses of commercially available candidate substances. The formerly unknown metabolites X-13891 and X-13069 were shown to be 2-dodecendioic acid and 9-tetradecenoic acid, respectively. Our data-driven approach based on measured metabolite levels and genetic associations as well as information from public resources can be used alone or together with methods utilizing spectral patterns as a complementary, automated and powerful method to characterize unknown metabolites

Crossref

Lund University Publications

Julkari

Edinburgh Research Explorer

PuSH

Application of high-dimensional feature selection: evaluation for genomic prediction in man

Author: A Statnikov
AL Price
B Hayes
B Walsh
BH Smith
BN Howie
C Ambroise
C Kooperberg
CE Lewis
CJ Willer
D Habier
D Habier
D Habier
D Yoon
DJ Balding
DM Evans
EK Speliotes
G de los Campos
G de los Campos
G de los Campos
G Hemani
G Su
HD Daetwyler
HH Maes
HL Allen
I Guyon
J Barrett
J Crossa
J Fan
J Nadaf
J Yang
J-L Jannink
K Bucher
L-C Huang
M Calus
M Kirin
NR Wray
P Donnelly
P Kraft
P Orchard
PM Visscher
R Mihaescu
RA Fisher
RF Brøndum
RL Somorjai
S Berger
S Purcell
T Meuwissen
TA Manolio
TH Meuwissen
TM Morgan
TM Teslovich
U Ober
W Astle
X Yu
Y Saeys
YJ Fan
YS Aulchenko
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/05/2015
Field of study

In this study, we investigated the effect of five feature selection approaches on the performance of a mixed model (G-BLUP) and a Bayesian (Bayes C) prediction method. We predicted height, high density lipoprotein cholesterol (HDL) and body mass index (BMI) within 2,186 Croatian and into 810 UK individuals using genome-wide SNP data. Using all SNP information Bayes C and G-BLUP had similar predictive performance across all traits within the Croatian data, and for the highly polygenic traits height and BMI when predicting into the UK data. Bayes C outperformed G-BLUP in the prediction of HDL, which is influenced by loci of moderate size, in the UK data. Supervised feature selection of a SNP subset in the G-BLUP framework provided a flexible, generalisable and computationally efficient alternative to Bayes C; but careful evaluation of predictive performance is required when supervised feature selection has been used

Crossref

PubMed Central

Edinburgh Research Explorer