14 research outputs found
A proteomics sample metadata representation for multiomics integration and big data analysis
The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.
Galaxy Training: A powerful framework for teaching!
There is an ongoing explosion of scientific datasets being generated, brought on by recent technological advances in many areas of the natural sciences. As a result, the life sciences have become increasingly computational in nature, and bioinformatics has taken on a central role in research studies. However, basic computational skills, data analysis, and stewardship are still rarely taught in life science educational programs, resulting in a skills gap among many of the researchers tasked with analysing these big datasets. To address this skills gap and empower researchers to perform their own data analyses, the Galaxy Training Network (GTN) has previously developed the Galaxy Training Platform (https://training.galaxyproject.org), an open access, community-driven framework for the collection of FAIR (Findable, Accessible, Interoperable, Reusable) training materials for data analysis utilizing the user-friendly Galaxy framework as its primary data analysis platform. Since its inception, this training platform has thrived, with the number of tutorials and contributors growing rapidly, and the range of topics extending beyond life sciences to include topics such as climatology, cheminformatics, and machine learning. While initially aimed at supporting researchers directly, the GTN framework has proven to be an invaluable resource for educators as well. We have focused our efforts in recent years on adding increased support for this growing community of instructors. We have added new features to facilitate the use of the materials in a classroom setting, simplified the contribution flow for new materials, and added a set of train-the-trainer lessons. Here, we present the latest developments in the GTN project, aimed at facilitating the use of the Galaxy Training materials by educators, and its usage in different learning environments.
Reproducible proteomics sample preparation for single FFPE tissue slices using acid-labile surfactant and direct trypsinization
Background: Proteomic analyses of clinical specimens often rely on human tissues preserved through formalin fixation and paraffin embedding (FFPE). Minimal sample consumption is key to preserving the integrity of pathological archives and to handling minimally invasive core biopsies. This has been achieved by using the acid-labile surfactant RapiGest in combination with a direct trypsinization (DTR) strategy. A critical comparison of the DTR protocol with the most commonly used filter-aided sample preparation (FASP) protocol is lacking. Furthermore, it is unknown how common histological stainings influence the outcome of the DTR protocol. Methods: Four single consecutive murine kidney tissue specimens were prepared with the DTR approach or with the FASP protocol using both 10 and 30 k filter devices and analyzed by label-free, quantitative liquid chromatography-tandem mass spectrometry (LC-MS/MS). We compared the different protocols in terms of proteome coverage, relative label-free quantitation, missed cleavages, physicochemical properties and gene ontology term annotations of the proteins. Additionally, we probed compatibility of the DTR protocol with commonly used histological stainings, namely hematoxylin & eosin (H&E), hematoxylin and hemalaun. These were proteomically compared to an unstained control by analyzing four human tonsil FFPE tissue specimens per condition. Results: On average, the DTR protocol identified 1841 ± 22 proteins in a single, non-fractionated LC-MS/MS analysis, whereas these numbers were 1857 ± 120 and 1970 ± 28 proteins for the FASP 10 and 30 k protocols, respectively. The DTR protocol showed 15% more missed cleavages, which did not adversely affect quantitation and intersample comparability. Hematoxylin or hemalaun staining did not adversely impact the performance of the DTR protocol. A minor perturbation was observed for H&E staining, decreasing overall protein identification by 13%.
Conclusions: In essence, the DTR protocol can keep up with the FASP protocol in terms of qualitative and quantitative reproducibility and performed almost as well in terms of proteome coverage and missed cleavages. We highlight the suitability of the DTR protocol as a viable and straightforward alternative to the FASP protocol for proteomics-based clinical research.
Proteomic Characterization of Prostate Cancer to Distinguish Nonmetastasizing and Metastasizing Primary Tumors and Lymph Node Metastases
Patients with metastatic prostate cancer (PCa) have a poorer prognosis than patients with organ-confined tumors. We strove to uncover the proteome signature of primary PCa and associated lymph node metastases (LNMs) in order to identify proteins that may indicate or potentially promote metastases formation. We performed a proteomic comparative profiling of PCa tissue from radical prostatectomy (RPE) of patients without nodal metastases or relapse at the time of surgical resection (n = 5) to PCa tissue from RPE of patients who suffered from nodal relapse (n = 5). For the latter group, we also included patient-matched tissue of the nodal metastases. All samples were formalin fixed and paraffin embedded. We identified and quantified more than 1200 proteins by liquid chromatography tandem mass spectrometry with subsequent label-free quantification. An increase of ribosomal or proteasomal proteins in LNM (compared to corresponding PCa) became apparent, while extracellular matrix components rather decreased. Immunohistochemistry (IHC) corroborated accumulation of poly-(ADP-ribose)-polymerase 1 and N-myc-downstream-regulated-gene 3, alpha/beta hydrolase domain-containing protein 11, and protein phosphatase slingshot homolog 3 in LNM. These findings strengthen the present interest in examining PARP inhibitors for the treatment of aggressive PCa. IHC also corroborated increased abundance of retinol dehydrogenase 11 in metastasized primary PCa compared to organ-confined PCa. Generally, metastasizing primary tumors were characterized by an enrichment of proteins involved in cellular lipid metabolic processes with concomitant decrease of cell adhesion proteins. This study highlights the usefulness of a combined proteomic-IHC approach to explore novel aspects in tumor biology. Our initial results open novel opportunities for follow-up studies
MOESM3 of Reproducible proteomics sample preparation for single FFPE tissue slices using acid-labile surfactant and direct trypsinization
Additional file 3. Distribution of LFQ intensities for the DTR and FASP protocols. Log2 transformed LFQ intensity distribution depicted for all replicates with DTR in black, FASP 10 k in dark grey and FASP 30 k in light grey.
Fostering accessible online education using Galaxy as an e-learning platform
The COVID-19 pandemic is shifting teaching to an online setting all over the world. The Galaxy framework facilitates the online learning process and makes it accessible by providing a library of high-quality community-curated training materials, enabling easy access to data and tools, and facilitating the sharing of achievements and progress between students and instructors. By combining Galaxy with robust communication channels, effective instruction can be designed inclusively, regardless of the students' environments.
A Galaxy of informatics resources for MS-based proteomics
Introduction: Continuous advances in mass spectrometry (MS) technologies have enabled deeper and more reproducible proteome characterization and a better understanding of biological systems when integrated with other 'omics data. Bioinformatic resources meeting the analysis requirements of increasingly complex MS-based proteomic data and associated multi-omic data are critically needed. These requirements include availability of software that spans diverse types of analyses, scalability for large-scale, compute-intensive applications, and mechanisms to ease adoption of the software. Areas covered: The Galaxy ecosystem meets these requirements by offering a multitude of open-source tools for MS-based proteomics analyses and applications, all in an adaptable, scalable, and accessible computing environment. A thriving global community maintains this software and associated training resources to empower researcher-driven analyses. Expert opinion: The community-supported Galaxy ecosystem remains a crucial contributor to basic biological and clinical studies using MS-based proteomics. In addition to the current status of Galaxy-based resources, we describe ongoing developments for meeting emerging challenges in MS-based proteomic informatics. We hope this review will catalyze increased use of Galaxy by researchers employing MS-based proteomics and inspire software developers to join the community and implement new tools, workflows, and associated training content that will add further value to this already rich ecosystem.
Proteometabolomics of initial and recurrent glioblastoma highlights an increased immune cell signature with altered lipid metabolism
Background: There is an urgent need to better understand the mechanisms associated with the development, progression, and onset of recurrence after initial surgery in glioblastoma (GBM). The use of integrative phenotype-focused -omics technologies such as proteomics and lipidomics provides an unbiased approach to explore the molecular evolution of the tumor and its associated environment. Methods: We assembled a cohort of patient-matched initial (iGBM) and recurrent (rGBM) specimens of resected GBM. Proteome and metabolome composition were determined by mass spectrometry-based techniques. We performed neutrophil-GBM cell coculture experiments to evaluate the behavior of rGBM-enriched proteins in the tumor microenvironment. ELISA-based quantitation of candidate proteins was performed to test the association of their plasma concentrations in iGBM with the onset of recurrence. Results: Proteomic profiles reflect increased immune cell infiltration and extracellular matrix reorganization in rGBM. ASAH1, SYNM, and GPNMB were highly enriched proteins in rGBM. Lipidomics indicates the downregulation of ceramides in rGBM. Cell analyses suggest a role for ASAH1 in neutrophils and its localization in extracellular traps. Plasma concentrations of ASAH1 and SYNM show an association with time to recurrence. Conclusions: We describe the potential importance of ASAH1 in tumor progression and development of rGBM via metabolic rearrangement and showcase the feedback from the tumor microenvironment to plasma proteome profiles. We report the potential of ASAH1 and SYNM as plasma markers of rGBM progression. The published datasets can be considered as a resource for further functional and biomarker studies involving additional -omics technologies.