
    Current challenges in software solutions for mass spectrometry-based quantitative proteomics

    This work was in part supported by the PRIME-XS project, grant agreement number 262067, funded by the European Union seventh Framework Programme; The Netherlands Proteomics Centre, embedded in The Netherlands Genomics Initiative; The Netherlands Bioinformatics Centre; and the Centre for Biomedical Genetics (to S.C., B.B. and A.J.R.H); by NIH grants NCRR RR001614 and RR019934 (to the UCSF Mass Spectrometry Facility, director: A.L. Burlingame, P.B.); and by grants from the MRC, CR-UK, BBSRC and Barts and the London Charity (to P.C.

    Updates in metabolomics tools and resources: 2014-2015

    Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platforms (MS or NMR spectroscopy based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources (tools, software, and databases) is currently lacking. Thus, here we provide an overview of freely available, open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of recent developments, in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS and NMR-based metabolomics. Most tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described including their analytical and computational platform dependencies are summarized in an overview Table

    Toward a Standardized Strategy of Clinical Metabolomics for the Advancement of Precision Medicine

    Despite the tremendous success, pitfalls have been observed in every step of a clinical metabolomics workflow, which impede the internal validity of the study. Furthermore, the demand for logistics, instrumentation, and computational resources for metabolic phenotyping studies has far exceeded our expectations. In this conceptual review, we will cover inclusive barriers of a metabolomics-based clinical study and suggest potential solutions in the hope of enhancing study robustness, usability, and transferability. The importance of quality assurance and quality control procedures is discussed, followed by a practical rule containing five phases, including two additional "pre-pre-" and "post-post-" analytical steps. Besides, we will elucidate the potential involvement of machine learning and demonstrate that the need for automated data mining algorithms to improve the quality of future research is undeniable. Consequently, we propose a comprehensive metabolomics framework, along with an appropriate checklist refined from current guidelines and our previously published assessment, in the attempt to accurately translate achievements in metabolomics into clinical and epidemiological research. Furthermore, the integration of multifaceted multi-omics approaches with metabolomics as the pillar member is in urgent need. When combined with other social or nutritional factors, we can gather complete omics profiles for a particular disease. Our discussion reflects the current obstacles and potential solutions toward the progressing trend of utilizing metabolomics in clinical research to create the next-generation healthcare system.

    Stable isotopic labeling in proteomics

    Labeling of proteins and peptides with stable heavy isotopes (deuterium, carbon-13, nitrogen-15, and oxygen-18) is widely used in quantitative proteomics. These are either incorporated metabolically in cells and small organisms, or postmetabolically in proteins and peptides by chemical or enzymatic reactions. Only upon measurement with mass spectrometers of sufficient resolution do light- and heavy-labeled peptide ions or reporter peptide fragment ions segregate; their intensity values are subsequently used for quantification. Targeted use of these labels or mass tags further leads to specific monitoring of diverse aspects of dynamic proteomes. In this review article, commonly used isotope labeling strategies are described, both for quantitative differential protein profiling and for targeted analysis of protein modifications
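
    The quantification step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each peptide has already been reduced to one light and one heavy intensity value, and it summarizes a protein as the median log2(heavy/light) ratio. The function name and the example intensities are hypothetical and do not come from the review.

```python
import math
from statistics import median


def protein_log2_ratio(peptide_pairs):
    """Return the median log2(heavy/light) over (light, heavy) intensity pairs.

    Pairs with a non-positive intensity are skipped, because a ratio
    cannot be formed when one channel is missing.
    """
    ratios = [
        math.log2(heavy / light)
        for light, heavy in peptide_pairs
        if light > 0 and heavy > 0
    ]
    if not ratios:
        raise ValueError("no quantifiable peptide pairs")
    return median(ratios)


# Hypothetical (light, heavy) intensities for three peptides of one protein.
pairs = [(1000.0, 2000.0), (500.0, 1000.0), (800.0, 3200.0)]
log2_ratio = protein_log2_ratio(pairs)
print(log2_ratio)  # 1.0, i.e. the heavy channel is about twice the light one
```

    The median is used here because it is less sensitive to a single outlying peptide than the mean; other summaries are equally possible.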

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    Seminal plasma as a source of prostate cancer peptide biomarker candidates for detection of indolent and advanced disease

    Background: Extensive prostate specific antigen screening for prostate cancer generates a high number of unnecessary biopsies and over-treatment due to insufficient differentiation between indolent and aggressive tumours. We hypothesized that seminal plasma is a robust source of novel prostate cancer (PCa) biomarkers with the potential to improve primary diagnosis and to distinguish advanced from indolent disease.
    Methodology/Principal Findings: In an open-label case/control study, 125 patients (70 PCa, 21 benign prostate hyperplasia, 25 chronic prostatitis, 9 healthy controls) were enrolled in 3 centres. Biomarker panels a) for PCa diagnosis (comparison of PCa patients versus benign controls) and b) for advanced disease (comparison of patients with post-surgery Gleason score <7 versus Gleason score >7) were sought. Independent cohorts were used for proteomic biomarker discovery and testing the performance of the identified biomarker profiles. Seminal plasma was profiled using capillary electrophoresis mass spectrometry. Pre-analytical stability and analytical precision of the proteome analysis were determined. Support vector machine learning was used for classification. Stepwise application of two biomarker signatures with 21 and 5 biomarkers provided 83% sensitivity and 67% specificity for PCa detection in a test set of samples. A panel of 11 biomarkers for advanced disease discriminated between patients with Gleason score 7 and organ-confined (<pT3a) or advanced (≥pT3a) disease with 80% sensitivity and 82% specificity in a preliminary validation setting. Seminal profiles showed excellent pre-analytical stability. Eight biomarkers were identified as fragments of N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase, prostatic acid phosphatase, stabilin-2, GTPase IMAP family member 6, semenogelin-1 and -2. Restricted sample size was the major limitation of the study.
    Conclusions/Significance: Seminal plasma represents a robust source of potential peptide markers for primary PCa diagnosis. Our findings warrant further prospective validation to confirm the diagnostic potential of identified seminal biomarker candidates.
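
    For readers unfamiliar with the two performance figures quoted in this abstract, the following minimal sketch shows how sensitivity and specificity are computed from true and predicted labels. The labels below are invented for illustration only and are not data from the study; the function name is likewise an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = disease, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of diseased cases that are detected
    specificity = tn / (tn + fp)  # fraction of controls correctly left unflagged
    return sensitivity, specificity


# Hypothetical test set: 10 cases (8 detected) and 10 controls (9 correctly negative).
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [0] * 9 + [1] * 1
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.8 0.9
```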

    Sparse Proteomics Analysis - A compressed sensing-based approach for feature selection and classification of high-dimensional proteomics mass spectrometry data

    Background: High-throughput proteomics techniques, such as mass spectrometry (MS)-based approaches, produce very high-dimensional data-sets. In a clinical setting one is often interested in how mass spectra differ between patients of different classes, for example spectra from healthy patients vs. spectra from patients having a particular disease. Machine learning algorithms are needed to (a) identify these discriminating features and (b) classify unknown spectra based on this feature set. Since the acquired data is usually noisy, the algorithms should be robust against noise and outliers, while the identified feature set should be as small as possible. Results: We present a new algorithm, Sparse Proteomics Analysis (SPA), based on the theory of compressed sensing that allows us to identify a minimal discriminating set of features from mass spectrometry data-sets. We show (1) how our method performs on artificial and real-world data-sets, (2) that its performance is competitive with standard (and widely used) algorithms for analyzing proteomics data, and (3) that it is robust against random and systematic noise. We further demonstrate the applicability of our algorithm to two previously published clinical data-sets
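
    The idea of selecting a small discriminating feature set through sparsity can be sketched as follows. This is not the authors' SPA algorithm: as a stand-in, it solves an L1-regularized least-squares problem with iterative soft thresholding (ISTA), a standard sparse-recovery technique. The toy matrix, the labels, the penalty `lam`, and the step size are all assumptions chosen so that the result is easy to check by hand.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista_lasso(X, y, lam, step, n_iter=100):
    """Minimize 0.5 * ||Xw - y||^2 + lam * ||w||_1 by iterative soft thresholding.

    step should not exceed 1 / (largest eigenvalue of X^T X) for convergence.
    """
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    for _ in range(n_iter):
        residual = [
            sum(X[i][j] * w[j] for j in range(n_features)) - y[i]
            for i in range(n_samples)
        ]
        grad = [
            sum(X[i][j] * residual[i] for i in range(n_samples))
            for j in range(n_features)
        ]
        w = [
            soft_threshold(w[j] - step * grad[j], step * lam)
            for j in range(n_features)
        ]
    return w


# Toy data: 4 spectra x 3 features with orthogonal columns (X^T X = 4 * I).
# Feature 0 tracks the class label; feature 2 correlates only weakly with it.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [1.0, 0.8, -1.0, -0.8]  # noisy +1 / -1 class labels

weights = ista_lasso(X, y, lam=1.0, step=0.25)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(selected)  # [0]
```

    With orthogonal columns the solution has the closed form soft_threshold(X^T y, lam) / 4, which gives a weight of about 0.65 for feature 0 and exactly zero for the other two, so only the informative feature is kept.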

    Diagnostics and correction of batch effects in large-scale proteomic studies: a tutorial.

    Advancements in mass spectrometry-based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much-needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step-by-step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, proBatch, containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology
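
    One of the simplest forms of batch adjustment is per-batch median centering, sketched below for a single feature. This is an illustrative Python sketch and not the proBatch API (proBatch is an R package whose functions are not reproduced here). It assumes log-transformed intensities with no missing values, and the function name and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def center_by_batch(log_intensities, batches):
    """Shift each batch so that its median equals the global median.

    log_intensities: log-scale values of one feature, one per sample.
    batches: batch label for each sample, in the same order.
    """
    grouped = defaultdict(list)
    for value, batch in zip(log_intensities, batches):
        grouped[batch].append(value)
    batch_medians = {batch: median(values) for batch, values in grouped.items()}
    global_median = median(log_intensities)
    return [
        value - batch_medians[batch] + global_median
        for value, batch in zip(log_intensities, batches)
    ]


# Hypothetical feature measured in two batches; batch B reads about 4 units higher.
values = [10.0, 11.0, 12.0, 14.0, 15.0, 16.0]
labels = ["A", "A", "A", "B", "B", "B"]
corrected = center_by_batch(values, labels)
print(corrected)  # [12.0, 13.0, 14.0, 12.0, 13.0, 14.0]
```

    Median centering only removes a constant offset per batch. It does not address within-batch intensity drift or missing values, which the abstract names as separate challenges.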

    Common Proteomic Technologies, Applications, and their Limitations

    Proteomics refers to the analysis of the expression, localization, functions, posttranslational modifications, and interactions of proteins expressed by a genome under a specific condition and at a specific time. Current proteomic tools allow large-scale, high-throughput analyses for the detection, identification, and functional investigation of the proteome. In this review, we focus on proteomics methods, both gel-based and gel-free techniques, and discuss their applications and challenges in the field of proteomics.