10 research outputs found

    Enhancing metabolomic data analysis with Progressive Consensus Alignment of NMR Spectra (PCANS)

    Background: Nuclear magnetic resonance spectroscopy is one of the primary tools in metabolomics analyses, where it is used to track and quantify changes in metabolite concentrations or profiles in response to perturbation through disease, toxicants or drugs. The spectra generated through such analyses are typically confounded by noise of various types, obscuring the signals and hindering downstream statistical analysis. Such issues are becoming increasingly significant as greater numbers of large-scale systems or longitudinal studies are being performed, in which many spectra from different conditions need to be compared simultaneously. Results: We describe a novel approach, termed Progressive Consensus Alignment of NMR Spectra (PCANS), for the alignment of NMR spectra. Through the progressive integration of many pairwise comparisons, this approach generates a single consensus spectrum as an output that is then used to adjust the chemical shift positions of the peaks from the original input spectra to their final aligned positions. We characterize the performance of PCANS by aligning simulated NMR spectra, which have been provided with user-defined amounts of chemical shift variation as well as inter-group differences as would be observed in control-treatment applications. Moreover, we demonstrate how our method provides better performance than either template-based alignment or binning. Finally, we further evaluate this approach in the alignment of real mouse urine spectra and demonstrate its ability to improve downstream PCA and PLS analyses. Conclusions: By avoiding the use of a template or reference spectrum, PCANS allows for the creation of a consensus spectrum that enhances the signals within the spectra while maintaining sample-specific features. This approach is of greatest benefit when complex samples are being analyzed and where it is expected that there will be spectral features unique and/or strongly different between subgroups within the samples. Furthermore, this approach can potentially be applied to the alignment of any data having spectra-like properties.

    Systematic approaches to integrate inconsistent, noisy high-throughput data to bolster subtle relationships obscured by standard analyses

    The increasing availability and decreasing cost of high throughput technologies coupled with the availability of computational tools form a basis for a shift to a more integrated approach to analyzing biological processes. In particular, classical statistical analysis techniques are designed to analyze data characterized by a single data source and are distinguished by a much higher ratio of subjects to the number of observations. In contrast, bioinformatics and systems biology applications often involve large data sets characterized by an abundance of observations spawned from a relatively small sample of subjects. The complexity of these systems coupled with the need to integrate inconsistent (noisy) data require appropriate methodologies that address these issues. Standard analyses can proficiently identify associations within consistent data, but these approaches are not robust at identifying relationships across data sources and/or where nontrivial amounts of inconsistency (noise) are present. Such data requires approaches that account for this increasing inconsistency within the data. One technique of accounting for such inconsistency is to limit analyses to subsets of data where the desired associations are the most prominent. Challenges for this particular approach involve the determination of subsets of interest while simultaneously establishing a metric with which to judge statistical importance. My initial work using this approach involved providing a methodology to represent Nuclear Magnetic Resonance (NMR) Spectra as hundreds of aligned peaks as opposed to thousands of unaligned points, which allows for more sophisticated means of analysis. My later work explores the development of data mining methodologies for identifying associations that exist within subsets of inconsistent, noisy data while addressing how to sensibly target subsets of interest while establishing a metric of association that provides statistical significance. 
Two approaches were developed: the first established a p-value-associated metric, while the second allowed multiple arbitrary metrics of interest to be used to identify statistically significant patterns. This work helps to establish methodologies for the identification of rare but significant patterns in large noisy data sets. Doctor of Philosophy

    Genome-wide association study of metabolic traits reveals novel gene-metabolite-disease links.

    Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on ¹H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P < 5×10⁻⁸) and independent associations between single nucleotide polymorphisms (SNP) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10⁻⁴⁴) and lysine (rs8101881, P = 1.2×10⁻³³), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers.

    Estudo metabolômico de cultivares de caqui (Diospyros kaki) durante diferentes estágios de desenvolvimento através da RMN HR-MAS de 1H aliada à quimiometria

    Advisor: Prof. Dr. Andersson Barison. Co-advisor: Prof. Dr. Ricardo Ayub. Thesis (doctorate) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Química. Defense: Curitiba, 27/04/2016. Includes references: f. 80-95; 109. Abstract: Persimmon (Diospyros kaki) is an important horticultural crop with many cultivated varieties. Investigations of the primary and secondary metabolites in persimmon fruits have led to inconclusive results in the literature. This has been attributed to the experimental designs adopted: besides the use of inappropriate analytical methods, the influence of the actual developmental stage of the fruits has not been properly taken into consideration. In this context, the metabolic changes of two persimmon cultivars ('Fuyu' and 'Giombo') during the whole of fruit development (from September to March) were evaluated by means of 1H HRMAS-NMR along with chemometric analysis. The use of HRMAS-NMR allowed spectra to be acquired directly from the fruits, avoiding the influence of the extraction method as well as of the enzymatic activity of invertase. For that purpose, spectra were acquired from 140 persimmons cultivated under the same environmental conditions, 70 fruits of each cultivar. Visual analysis of these spectra revealed low concentrations of amino acids (threonine, alanine, citrulline and GABA) and organic acids (malic acid). On the other hand, the signals of carbohydrates (sucrose, glucose and fructose) seemed to play the most important role in fruit development. The sugar trends during the process could be described by an increase and then a decrease of sucrose, followed by a continuous increase in glucose and fructose signal intensity. In addition, a noticeable difference was observed in the aromatic region of the two cultivars at the initial stage of development. For 'Giombo', the signal related to gallic acid remained until the end of growth, while for 'Fuyu', signals of polyphenols were detected only at the initial stage. The holistic view offered by the multivariate analysis showed that the two cultivars develop in a similar manner. This process was influenced by organic acids, amino acids, polyphenols and choline in the first month, while for the remaining months the changes were associated with the presence of sugars. These findings might aid the understanding of the mechanisms of fruit development, which in turn impact fruit quality. Furthermore, they may contribute to the development of new post-harvest strategies. Keywords: Persimmon. NMR HR-MAS. Metabolomics.

    A Consensus Model for Electroencephalogram Data Via the S-Transform

    A consensus model combines statistical methods with signal processing to create a better picture of a family of related signals. In this thesis, we consider 32 signals produced by a single electroencephalogram (EEG) recording session. The consensus model is produced by taking the S-Transform of the individual signals and then normalizing to unit energy. A bootstrapping process is used to produce a consensus spectrum. This leads to the consensus model via the inverse S-Transform of the consensus spectrum. The method is applied to both a control and an experimental EEG to show how the results can be used in clinical settings to analyze experimental outcomes.

    Metabolic profiling for biomarker discovery in biochemical genetics

    Functional characterization of the phenotypic consequences of genetic variants increasingly constitutes the rate-limiting step in the study of biochemical genetics. This thesis presents the application of metabolic profiling technologies, including Nuclear Magnetic Resonance (NMR) spectroscopy and Mass Spectrometry (MS), to identify metabolic perturbations resulting from inborn errors of metabolism, congenital diseases affecting the kidney, and animal models of both genetic mutations and renal pathology. In the case of newborn screening for enzyme deficiencies, two mass spectrometric assays, traditional tandem mass spectrometry (MS/MS) with multiple reaction monitoring and direct injection nanospray high resolution mass spectrometry (ns-HR-MS), were applied to profile dried blood spot (DBS) samples of over 6,000 newborns and identify the metabolic perturbations resulting from 24 congenital disorders of metabolism. To study genetic mutations with more complex phenotypic consequences affecting the kidney, urinary metabolic profiles were evaluated for four congenital kidney diseases. The urinary metabolic associations with these renal phenotypes are characterized by the natural history of cystinosis, showing changes in the urinary profile of cystinosis patients over time and with age, glomerular filtration rate, drug therapy, and transplantation, and by a comparison of the urinary perturbations seen in these human Mendelian disorders with the perturbations induced by region-specific nephrotoxins. Finally, new computational approaches for analyzing metabolic profiling data are presented and evaluated, including approaches for improving biomarker identification with two-dimensional NMR and increasing metabolomic coverage with high resolution mass spectrometry. Integrating metabolic and genetic diagnostics should enhance understanding of the relationship between genes and health and of the mechanisms by which genetic variations manifest as inherited disease.

    Multiple markers for the non-invasive diagnosis and characterisation of prostate cancer

    Metabolomics, proteomics, and transcriptomics of Cannabis sativa L. trichomes

    Cannabis sativa L. trichomes are the main site for synthesizing and storing cannabinoids and other secondary metabolites on this plant. Metabolomics, proteomics, and transcriptomics were applied in this research as analytical approaches to study the secondary metabolites, proteins, and genes that are synthesized in the trichomes. The function of specific trichomes and their individual parts in cannabinoid biosynthesis was investigated as well. 1H NMR-based metabolomics was successfully applied to monitor the production of metabolites, especially cannabinoids, in the Cannabis trichomes during the last weeks of the flowering period. Proteomic analysis of the Cannabis trichomes detected many enzymes involved in the biosynthesis of secondary metabolites, including cannabinoids, flavonoids, and terpenoids. This finding supports the function of Cannabis trichomes as the main site of secondary metabolite production. Although no flavonoid has been reported from the trichomes, the identification of enzymes related to flavonoid biosynthesis indicates that these compounds might be present in this organ. Interestingly, the identification of enzymes involved in the biosynthesis of cannabinoids, terpenoids, and flavonoids in the proteomic work was also confirmed by the detection of their putative transcripts in the cDNA library of Cannabis trichomes. Analysis of cannabinoids in laser-microdissected trichomes of Cannabis showed that these compounds were detected not only in the head of capitate-stalked trichomes but also in the stem part. This finding suggests that cannabinoid biosynthesis is not limited to the expected head cells, and that the stems of Cannabis capitate-stalked trichomes might also play a role in cannabinoid biosynthesis.
    Summary: Cannabis sativa L. trichomes are the main site of cannabinoid biosynthesis. In this work, techniques were applied and improved to investigate the metabolome, transcriptome, and proteome of the biosynthesizing cells of the secretory trichomes. The biosynthetic capacity of the three trichome types was examined using LC-MS, laser dissection microscopy, and 1H-NMR metabolomics in order to record biosynthesis qualitatively and quantitatively over 8 weeks. Further analyses of the proteome showed that gene expression and the function of biosynthetic proteins for cannabinoids, flavonoids, and terpenes of the essential oil depend strongly on age and flower formation. Analysis and annotation of the genes were made possible by the purpose-built cDNA library. The results show that biosynthesis is very strong in the secretory head cells but is also found in the stalk cells. On the basis of the proteome investigations in the essential oil, biotransformation in the non-aqueous milieu was ruled out. With the help of 1H-NMR metabolomics and statistical PCA analysis, strategies were developed to identify breeding lines of Cannabis sativa and to distinguish them from one another on the basis of their metabolomic profile.