9 research outputs found

    Computational Tools for the Processing and Analysis of Time-course Metabolomic Data

    Modern, high-throughput techniques for the acquisition of metabolomic data, combined with an increase in computational power, have provided not only the need for, but also the means to develop and use, methods for the interpretation of large and complex datasets. This thesis investigates the methods by which pertinent information can be extracted from nontargeted metabolomic data and reviews the current state of chemometric methods. The analysis of real-world data and research questions relevant to the agri-food industry reveals several problems for which novel solutions are proposed. Three LC-MS datasets are studied: Medicago, Alopecurus and aged beef, covering stress resistance, herbicide resistance and product misbranding. The new methods cover preprocessing (batch correction, data filtering), processing (clustering, classification) and visualisation, and their use is facilitated within a flexible data-to-results pipeline. The resulting software suite, with a user-friendly graphical interface, is presented, providing a pragmatic realisation of these methods in an easy-to-access workflow.

    Coinfinder: Detecting significant associations and dissociations in pangenomes

    © 2020 The Authors. The accessory genes of prokaryote and eukaryote pangenomes accumulate by horizontal gene transfer, differential gene loss, and the effects of selection and drift. We have developed Coinfinder, a software program that assesses whether sets of homologous genes (gene families) in pangenomes associate or dissociate with each other (i.e. are ‘coincident’) more often than would be expected by chance. Coinfinder employs a user-supplied phylogenetic tree in order to assess the lineage-dependence (i.e. the phylogenetic distribution) of each accessory gene, allowing Coinfinder to focus on coincident gene pairs whose joint presence is not simply because they happened to appear in the same clade, but rather that they tend to appear together more often than expected across the phylogeny. Coinfinder is implemented in C++, Python3 and R and is freely available under the GNU license from https://github.com/fwhelan/coinfinder

    Coinfinder: Detecting Significant Associations and Dissociations in Pangenomes

    The accessory genes of prokaryote and eukaryote pangenomes accumulate by horizontal gene transfer, differential gene loss, and the effects of selection and drift. We have developed Coinfinder, a software program that assesses whether sets of homologous genes (gene families) in pangenomes associate or dissociate with each other (i.e. are “coincident”) more often than would be expected by chance. Coinfinder employs a user-supplied phylogenetic tree in order to assess the lineage-dependence (i.e. the phylogenetic distribution) of each accessory gene, allowing Coinfinder to focus on coincident gene pairs whose joint presence is not simply because they happened to appear in the same clade, but rather that they tend to appear together more often than expected across the phylogeny. Coinfinder is implemented in C++, Python3, and R and is freely available under the GNU license from https://github.com/fwhelan/coinfinder

    MetaboClust: Using interactive time-series cluster analysis to relate metabolomic data with perturbed pathways

    Motivation: Modern analytical techniques such as LC-MS, GC-MS and NMR are increasingly being used to study the underlying dynamics of biological systems by tracking changes in metabolite levels over time. Such techniques are capable of providing information on large numbers of metabolites simultaneously, a feature that is exploited in non-targeted studies. However, since the dynamics of specific metabolites are unlikely to be known a priori, this presents an initial subjective challenge as to where the focus of the investigation should be. Whilst a number of feed-forward software tools are available for manipulation of metabolomic data, no tool is centred on clustering, and focus is typically directed by a workflow that is chosen in advance. Results: We present an interactive approach to time-course analyses and a complementary implementation in a software package, MetaboClust. This is presented through the analysis of two LC-MS time-course case studies on plants (Medicago truncatula and Alopecurus myosuroides). We demonstrate a dynamic, user-centric workflow for clustering, with intrinsic visual feedback at all stages of analysis. The software is used to apply data correction, generate the time-profiles, perform exploratory statistical analysis and assign tentative metabolite identifications. Clustering is used to group metabolites in an unbiased manner, allowing pathway analysis to score metabolic pathways based on their overlap with clusters showing interesting trends.

    A batch correction method for liquid chromatography–mass spectrometry data that does not depend on quality control samples

    The need for reproducible and comparable results is of increasing importance in non-targeted metabolomic studies, especially when differences between experimental groups are small. Liquid chromatography–mass spectrometry spectra are often acquired batch-wise so that necessary calibrations and cleaning of the instrument can take place. However, this may introduce further sources of variation, such as differences in the conditions under which the acquisition of individual batches is performed. Quality control (QC) samples are frequently employed as a means of both judging and correcting this variation. Here we show that the use of QC samples can lead to problems. The non-linearity of the response can result in substantial differences between the recorded intensities of the QCs and experimental samples, making the required adjustment difficult to predict. Furthermore, changes in the response profile between one QC interspersion and the next cannot be accounted for, and QC-based correction can actually exacerbate the problems by introducing artificial differences. “Background correction” methods utilise all experimental samples to estimate the variation over time rather than relying on the QC samples alone. We compare non-QC correction methods with standard QC correction and demonstrate their success in reducing differences between replicate samples and their potential to highlight differences between experimental groups previously hidden by instrumental variation.

    AlacatDesigner: Computational Design of Peptide Concatamers for Protein Quantitation

    Protein quantitation via mass spectrometry relies on peptide proxies for the parent protein from which abundances are estimated. Owing to the variability in signal from individual peptides, accurate absolute quantitation usually relies on the addition of an external standard. Typically, this involves stable isotope-labeled peptides, delivered singly or as a concatenated recombinant protein. Consequently, the selection of the most appropriate surrogate peptides and their attendant design into recombinant proteins, termed QconCATs, are challenges for proteome science. QconCATs can now be built in an "a-la-carte" assembly method using synthetic biology: ALACATs. To assist their design, we present "AlacatDesigner", a tool that supports the peptide selection for recombinant protein standards based on the user's target protein. The user-customizable tool considers existing databases, occurrence in the literature, potential post-translational modifications, predicted miscleavage, predicted divergence of the peptide and protein quantifications, and ionization potential within the mass spectrometer. We show that peptide selections are enriched for good proteotypic and quantotypic candidates compared to empirical data. The software is freely available to use via a web interface, as a downloadable desktop application, or as a Python package that can be used from the command line or imported in scripts.

    Nuclear Magnetic Resonance (NMR) Spectroscopy in Food Science: A Comprehensive Review
