163 research outputs found

    Bayesian Methods for Metabolomics

    Metabolomics, the large-scale study of small molecules, enables the underlying biochemical activity and state of cells or tissues to be directly captured. Nuclear Magnetic Resonance (NMR) spectroscopy is one of the major data-capturing techniques for metabolomics, as it provides highly reproducible, quantitative information on a wide variety of metabolites. This work presents possible solutions to three problems, with the aim of aiding the development of better algorithms for NMR data analysis. After reviewing relevant concepts and literature, we first use observed NMR chemical shift titration data for a range of urinary metabolites to develop a theoretical model of chemical shift, using a Bayesian statistical framework and model selection procedures to estimate the number of protonation sites, a key parameter for modelling the relationship between chemical shift variation and pH that is usually unknown for uncatalogued metabolites. Secondly, with the aim of obtaining explicit concentration estimates for metabolites from NMR spectra, we discuss a Monte Carlo Co-ordinate Ascent Variational Inference (MC-CAVI) algorithm that combines Markov chain Monte Carlo (MCMC) methods with Co-ordinate Ascent VI (CAVI), demonstrate MC-CAVI’s suitability for models with hard constraints, and compare MC-CAVI’s performance with that of MCMC in an important complex model used in NMR spectroscopy data analysis. The third contribution seeks to improve metabolite identification, one of the biggest bottlenecks in metabolomics and one severely hindered by resonance overlap in one-dimensional NMR spectroscopy. In particular, we present a novel Bayesian method for widely used two-dimensional (2D) 1H J-resolved (JRES) NMR spectroscopy, which has considerable potential to accurately identify and quantify metabolites within complex biological samples, by combining B-spline tight wavelet frames with theoretical templates.
We then demonstrate the effectiveness of our approach via analyses of JRES datasets from serum and urine.
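The abstract above does not give the form of its chemical shift model. As a rough illustration of how chemical shift can depend on pH and on the number of protonation sites, the sketch below uses the textbook fast-exchange model, in which the observed shift is the population-weighted average of the limiting shifts of each protonation state. The function name `chemical_shift` and its parameters are hypothetical, and this is an assumption about the general kind of model, not the thesis's own formulation.

```python
def chemical_shift(ph, pkas, limiting_shifts):
    """Observed shift at one pH, assuming fast exchange between protonation states.

    pkas            -- macroscopic pKa values, one per protonation site (assumed inputs)
    limiting_shifts -- shift of each state, ordered from 0 protons bound
                       (fully deprotonated) up to len(pkas) protons bound
    """
    pkas = sorted(pkas, reverse=True)  # the first proton binds at the highest pKa
    if len(limiting_shifts) != len(pkas) + 1:
        raise ValueError("need one limiting shift per protonation state")

    # log10 of the relative population of the state carrying k protons
    log_w = [0.0]
    for pka in pkas:
        log_w.append(log_w[-1] + pka - ph)

    # subtract the maximum before exponentiating, for numerical stability
    top = max(log_w)
    weights = [10.0 ** (lw - top) for lw in log_w]
    return sum(w * s for w, s in zip(weights, limiting_shifts)) / sum(weights)


# One site: at pH == pKa the two states are equally populated.
midpoint = chemical_shift(7.0, [7.0], [2.0, 3.0])
```

With a single site this reduces to the familiar sigmoidal titration curve. Candidate models with different numbers of sites could then, in principle, be fitted to titration data and compared, which is the role the abstract assigns to Bayesian model selection.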

    Novel biomarkers of renal transplant failure/dysfunction via spectroscopic phenotyping

    Successful renal transplantation not only improves patients’ quality and duration of life, but also yields substantial healthcare cost savings. With the growing burden of end-stage renal disease and the requirement for renal replacement therapy, strategies to augment transplant success and subsequent graft survival are more vital than ever. Herein, an objective, metabolism-based means of characterising renal function across the transplant journey, and of stratifying patients appropriately in accordance with individual contingencies/factors (including the early detection of renal dysfunction), is explored. Patient pairs, recipients and donors, were metabolically phenotyped prior to (24 h) and post (days 1–5) transplantation using a multi-platform analytical approach (i.e., Nuclear Magnetic Resonance (NMR) spectroscopy and Mass Spectrometry (MS)) of urine and plasma (n = 50). Using advanced statistics, the resulting metabolic profiles were subsequently modelled and related to multiple clinical phenotypes (and outcomes) to increase the understanding of molecular changes/signatures across transplantation, capturing valuable information pertinent to transplant type, cause, co-morbidity, modality, immunology and complication (p-value < 0.05), across donors as well as recipients. Predictive algorithms for the early detection of renal dysfunction were then developed on a preliminary basis, within the confines of the study design; integrated NMR and MS metabolic data improved patient stratification for complications over clinical measures (receiver operating characteristic area under the curve above 0.900) and could potentially replace current measures.
While prospective/multicentre studies are imperative for subsequent real-world adoption (qualification/validation), the work conducted herein encompassed much of the first stage of marker development (discovery), in which metabolic phenotyping of renal transplantation has provided a deeper characterisation of patient journeys, with new insights into multiple contingencies/factors (including complication). Such findings imply the value of metabolic phenotyping to augment, and potentially replace, current measures and methods, to better inform decision making in the clinic at the individual/precision level.

    Biological Networks

    Networks of coordinated interactions among biological entities govern a myriad of biological functions that span a wide range of both length and time scales, from ecosystems to individual cells and from years to milliseconds. For these networks, the concept “the whole is greater than the sum of its parts” applies as the norm rather than the exception. Meanwhile, continued advances in molecular biology and high-throughput technology have enabled a broad and systematic interrogation of whole-cell networks, allowing the investigation of biological processes and functions at unprecedented breadth and resolution, even down to the single-cell level. The explosion of biological data, especially molecular-level intracellular data, necessitates new paradigms for unraveling the complexity of biological networks and for understanding how biological functions emerge from such networks. These paradigms introduce new challenges related to the analysis of networks, in which quantitative approaches such as machine learning and mathematical modeling play an indispensable role. The Special Issue on “Biological Networks” showcases advances in the development and application of in silico network modeling and analysis of biological systems.

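    Regularisoitu riippuvuuksien mallintaminen geeniekpressio- ja metabolomiikkadatan välillä metabolian säätelyn tutkimuksessa [Regularized dependency modeling between gene expression and metabolomics data in the study of metabolic regulation]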

    Fusing different high-throughput data sources is an effective way to reveal the functions of unknown genes, as well as regulatory relationships between biological components such as genes and metabolites. Dependencies between biological components functioning in different layers of biological regulation can be investigated using canonical correlation analysis (CCA). However, the properties of high-throughput bioinformatics data pose many challenges to data analysis: the sample size is often insufficient compared to the dimensionality of the data, and the data exhibit multicollinearity due to, for example, co-expressed and co-regulated genes. Therefore, a regularized version of classical CCA is often adopted. An alternative way of introducing regularization to statistical models is to perform Bayesian data analysis with suitable priors. In this thesis, the performance of a new variant of Bayesian CCA, group-wise sparse CCA (gsCCA), is compared to that of classical ridge-regression-regularized CCA (rrCCA) in revealing relevant information shared between two high-throughput data sets. gsCCA produces a regularizing effect similar to that of rrCCA but, in addition, introduces a new type of regularization to the data covariance matrices. Both CCA methods are applied to gene expression and metabolite concentration measurements obtained from an oxidative-stress-tolerant Arabidopsis thaliana ecotype, Col-0, and an oxidative-stress-sensitive mutant, rcd1, as time series under ozone exposure and in a control condition. The aim of this work is to reveal new regulatory mechanisms in oxidative stress signalling in plants.
For both methods, rrCCA and gsCCA, the thesis illustrates their potential to reveal both already known and new regulatory mechanisms in Arabidopsis thaliana oxidative stress signalling.
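To make the idea of ridge-regularized CCA concrete, the sketch below shows one common formulation: a ridge term is added to each within-set covariance matrix so that it stays invertible even when there are fewer samples than features, and the canonical directions are then read off a singular value decomposition of the whitened cross-covariance. The function name `ridge_cca` and the penalty values `lam_x` and `lam_y` are hypothetical, and this is not necessarily the exact rrCCA variant used in the thesis; the Bayesian gsCCA method is not shown.

```python
import numpy as np


def ridge_cca(X, Y, lam_x=0.1, lam_y=0.1):
    """Ridge-regularized CCA (one common formulation, assumed here).

    X: (n_samples, p) array, Y: (n_samples, q) array.
    Returns regularized canonical correlations and weight matrices for X and Y.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariances; the ridge term keeps Cxx and Cyy positive definite
    Cxx = X.T @ X / (n - 1) + lam_x * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + lam_y * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    # Whiten with Cholesky factors: M = Lx^{-1} Cxy Ly^{-T}
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T

    # Singular values are the regularized canonical correlations
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to weights on the original features
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return s, A, B


# Toy data with more features than samples, as in the setting described above
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 1))
X = latent @ rng.normal(size=(1, 50)) + rng.normal(size=(20, 50))
Y = latent @ rng.normal(size=(1, 30)) + rng.normal(size=(20, 30))
corrs, A, B = ridge_cca(X, Y)
```

Larger penalties shrink the canonical correlations and stabilize the weights; in practice the penalties would typically be chosen by cross-validation.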