thesis

Exploring correlation structures of metabolomics data for quality control and biomarker discovery

Abstract

Metabolomics is a technology that allows us to probe a wide array of interactions between metabolites. These interactions can be revealed by statistical correlations between metabolite levels, which may arise via a range of mechanisms. To measure metabolite levels, two main techniques are used: Liquid Chromatography Mass Spectrometry (LC-MS) and Nuclear Magnetic Resonance (NMR). Measuring correlation structure requires high analytical reproducibility of the assays. While NMR has previously been shown to be reproducible, LC-MS has not been similarly assessed. To assess the reproducibility of LC-MS for urinary metabolomics, a multi-laboratory study was devised. We find that the technology is highly reproducible, both within and between laboratories, with CVs of < 17%, < 5 s drift and under 10% ppm between labs. In LC-MS, ionisation of a single compound can lead to multiple charged species, such as isotopologues and adducts. These multiple signals have a high mutual correlation, and we show that this allows them to be identified with high sensitivity and specificity. The inferred statistical interactions between different metabolites can also be affected by analytical errors. An algorithm was designed to remove statistical metabolite links that could have been caused by the analytical technique. Using this method, higher confidence can be placed on the remaining interactions, suggesting that they are potential biological interactions. Finally, most biological interactions are dynamic in nature, leading to correlations through time between metabolite levels. To explore these dynamic links, two temporal approaches were developed. These methods are designed to discover temporal correlations between metabolites and to test whether they vary between biological conditions. We successfully demonstrate the methods in both LC-MS and NMR datasets.
Overall, this thesis shows that correlation structure in metabolic profiling data is reliable, can be successfully filtered to improve quality, and can be interrogated to reveal a new kind of dynamic metabolic biomarker.
