
    Bioinformatic analysis of proteomics data

    Most biochemical reactions in a cell are regulated by highly specialized proteins, which are the prime mediators of the cellular phenotype. Therefore, the identification, quantitation and characterization of all proteins in a cell are of utmost importance for understanding the molecular processes that mediate cellular physiology. With the advent of robust and reliable mass spectrometers that are able to analyze complex protein mixtures within a reasonable timeframe, the systematic analysis of all proteins in a cell becomes feasible. Besides the ongoing improvements of analytical hardware, standardized methods to analyze and study all proteins have to be developed that allow the generation of testable new hypotheses based on the enormous pre-existing amount of biological information. Here we discuss current strategies on how to gather, filter and analyze proteomic data sets using available software packages.

    The impact of sequence database choice on metaproteomic results in gut microbiota studies

    Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases was revealed to be mandatory when dealing with mouse samples. Moreover, the use of a "merged" database, containing all metagenomic sequences from the population under study, was found to be generally preferable over the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. Conclusions: This study helps to identify host- and database-specific biases which need to be taken into account in a metaproteomic experiment, providing meaningful insights on how to design gut microbiota studies and to perform metaproteomic data analysis. In particular, the use of multiple databases and annotation tools has to be encouraged, even though this requires appropriate bioinformatic resources.

    A metaproteomic approach to study human-microbial ecosystems at the mucosal luminal interface

    Aberrant interactions between the host and the intestinal bacteria are thought to contribute to the pathogenesis of many digestive diseases. However, studying the complex ecosystem at the human mucosal-luminal interface (MLI) is challenging and requires an integrative systems biology approach. Therefore, we developed a novel method integrating lavage sampling of the human mucosal surface, high-throughput proteomics, and a unique suite of bioinformatic and statistical analyses. Shotgun proteomic analysis of secreted proteins recovered from the MLI confirmed the presence of both human and bacterial components. To profile the MLI metaproteome, we collected 205 mucosal lavage samples from 38 healthy subjects, and subjected them to high-throughput proteomics. The spectral data were subjected to a rigorous data processing pipeline to optimize suitability for quantitation and analysis, and then were evaluated using a set of biostatistical tools. Compared to the mucosal transcriptome, the MLI metaproteome was enriched for extracellular proteins involved in response to stimulus and immune system processes. Analysis of the metaproteome revealed significant individual-related as well as anatomic region-related (biogeographic) features. Quantitative shotgun proteomics established the identity and confirmed the biogeographic association of 49 proteins (including 3 functional protein networks) demarcating the proximal and distal colon. This robust and integrated proteomic approach is thus effective for identifying functional features of the human mucosal ecosystem, and for gaining a fresh understanding of the basic biology and disease processes at the MLI. © 2011 Li et al.

    POMAShiny: A user-friendly web-based workflow for metabolomics and proteomics data analysis

    Metabolomics and proteomics, like other omics domains, usually face a data mining challenge in providing an understandable output to advance biomarker discovery and precision medicine. Often, statistical analysis is one of the most difficult challenges and it is critical in the subsequent biological interpretation of the results. Because of this, combined with the computational programming skills needed for this type of analysis, several bioinformatic tools aimed at simplifying metabolomics and proteomics data analysis have emerged. However, sometimes the analysis is still limited to a few hidebound statistical methods and to data sets with limited flexibility. POMAShiny is a web-based tool that provides a structured, flexible and user-friendly workflow for the visualization, exploration and statistical analysis of metabolomics and proteomics data. This tool integrates several statistical methods, some of them widely used in other types of omics, and it is based on the POMA R/Bioconductor package, which increases the reproducibility and flexibility of analyses outside the web environment. POMAShiny and POMA are both freely available at https://github.com/nutrimetabolomics/POMAShiny and https://github.com/nutrimetabolomics/POMA, respectively.

    Public data and open source tools for multi-assay genomic investigation of disease

    Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays has the potential to provide a systems level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources; however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay genomic data resources in clinical oncology, pharmacogenomics and other perturbation experiments, population genomics, regulatory genomics and other areas, as well as tools for data acquisition. Finally, we review bioinformatic tools that are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for accessing publicly available data and tools to support development of needed integrative methods.

    Predicting the outer membrane proteome of Pasteurella multocida based on consensus prediction enhanced by results integration and manual confirmation

    Background: Outer membrane proteins (OMPs) of Pasteurella multocida have various functions related to virulence and pathogenesis and represent important targets for vaccine development. Various bioinformatic algorithms can predict outer membrane localization and discriminate OMPs by structure or function. The designation of a confident prediction framework by integrating different predictors followed by consensus prediction, results integration and manual confirmation will improve the prediction of the outer membrane proteome. Results: In the present study, we used 10 different predictors classified into three groups (subcellular localization, transmembrane β-barrel protein and lipoprotein predictors) to identify putative OMPs from two available P. multocida genomes: those of avian strain Pm70 and porcine non-toxigenic strain 3480. Predicted proteins in each group were filtered by optimized criteria for consensus prediction: at least two positive predictions for the subcellular localization predictors, three for the transmembrane β-barrel protein predictors and one for the lipoprotein predictors. The consensus predicted proteins were integrated from each group into a single list of proteins. We further incorporated a manual confirmation step including a public database search against PubMed and sequence analyses, e.g. sequence and structural homology, conserved motifs/domains, functional prediction, and protein-protein interactions to enhance the confidence of prediction. As a result, we were able to confidently predict 98 putative OMPs from the avian strain genome and 107 OMPs from the porcine strain genome with 83% overlap between the two genomes. Conclusions: The bioinformatic framework developed in this study has increased the number of putative OMPs identified in P. multocida and allowed these OMPs to be identified with a higher degree of confidence. Our approach can be applied to investigate the outer membrane proteomes of other Gram-negative bacteria.
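
    The consensus filtering and integration step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: only the per-group vote thresholds (two, three and one) come from the abstract, while the function name, the group keys, the input layout and the protein identifiers are assumptions made for illustration.

```python
from typing import Dict, Set

# Minimum number of positive predictions required within each predictor
# group, as stated in the abstract. The group key names are assumptions.
THRESHOLDS: Dict[str, int] = {
    "subcellular_localization": 2,
    "beta_barrel": 3,
    "lipoprotein": 1,
}


def consensus_omps(votes: Dict[str, Dict[str, int]]) -> Set[str]:
    """Return proteins that pass the consensus criterion in at least one group.

    votes[group][protein_id] is the number of predictors in that group
    that flagged the protein (hypothetical input layout).
    """
    selected: Set[str] = set()
    for group, min_votes in THRESHOLDS.items():
        for protein_id, count in votes.get(group, {}).items():
            if count >= min_votes:
                # Integrate per-group consensus hits into a single list.
                selected.add(protein_id)
    return selected


# Hypothetical vote counts for four made-up protein identifiers.
example_votes = {
    "subcellular_localization": {"P1": 2, "P2": 1},
    "beta_barrel": {"P2": 2, "P3": 3},
    "lipoprotein": {"P4": 1, "P2": 0},
}
putative_omps = consensus_omps(example_votes)
```

    In this toy input, P2 is flagged in every group but never reaches a group's threshold, so it is left out of the integrated list. The manual confirmation step from the abstract is not modeled here.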

    Sex-partitioning of the Plasmodium falciparum stage V gametocyte proteome provides insight into falciparum-specific cell biology

    One of the critical gaps in malaria transmission biology and surveillance is our lack of knowledge about Plasmodium falciparum gametocyte biology, especially sexually dimorphic development and how sex ratios may influence transmission from the human to the mosquito. Dissecting this process has been hampered by the lack of sex-specific protein markers for the circulating, mature stage V gametocytes. The current evidence suggests a high degree of conservation in gametocyte gene complement across Plasmodium, and therefore presumably for sex-specific genes as well. To better our understanding of gametocyte development and subsequent infectiousness to mosquitoes, we undertook a Systematic Subtractive Bioinformatic analysis (filtering) approach to identify sex-specific P. falciparum NF54 protein markers based on a comparison with the Dd2 strain, which is defective in producing males, and with syntenic male and female proteins from the reanalyzed and updated P. berghei (related rodent malaria parasite) gametocyte proteomes. This produced a short list of 174 male- and 258 female-enriched P. falciparum stage V proteins, some of which appear to be under strong diversifying selection, suggesting ongoing adaptation to mosquito vector species. We generated antibodies against three putative female-specific gametocyte stage V proteins in P. falciparum and confirmed either conserved sex-specificity or the lack of cross-species sex-partitioning. Finally, our study not only provides an additional resource for mass spectrometry-derived evidence for gametocyte proteins but also lays the foundation for rational screening and development of novel sex-partitioned protein biomarkers and transmission-blocking vaccine candidates.

    Proteomic analysis of Bifidobacterium longum subsp. infantis reveals the metabolic insight on consumption of prebiotics and host glycans.

    Bifidobacterium longum subsp. infantis is a common member of the intestinal microbiota in breast-fed infants and is capable of metabolizing human milk oligosaccharides (HMO). To investigate the bacterial response to different prebiotics, we analyzed both cell wall-associated and whole cell proteins in B. infantis. Proteins were identified by LC-MS/MS followed by comparative proteomics to deduce the protein localization within the cell. Enzymes involved in the metabolism of lactose, glucose, galactooligosaccharides, fructooligosaccharides and HMO were constitutively expressed, exhibiting less than two-fold change regardless of the sugar used. In contrast, enzymes in N-acetylglucosamine and sucrose catabolism were induced by HMO and fructans, respectively. The galactose-metabolizing enzymes phosphoglucomutase, UDP-glucose 4-epimerase and UTP glucose-1-P uridylyltransferase were expressed constitutively, while galactokinase and galactose-1-phosphate uridylyltransferase increased their expression three-fold when HMO and lactose were used as substrates for cell growth. Cell wall-associated proteomics also revealed ATP-dependent sugar transport systems associated with consumption of different prebiotics. In addition, the expression of 16 glycosyl hydrolases revealed the complete metabolic route for each substrate. Mucin, which possesses O-glycans that are structurally similar to HMO, did not induce the expression of transport proteins, hydrolysis or sugar metabolic pathways, indicating that B. infantis does not utilize these glycoconjugates.
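
    The two-fold criterion mentioned in this abstract can be illustrated with a small sketch. It is only an assumption about how such a cutoff might be applied: the function name, the "repressed" label and the abundance values are hypothetical, and only the two-fold threshold is taken from the abstract.

```python
def classify_expression(abundance_test: float,
                        abundance_reference: float,
                        threshold: float = 2.0) -> str:
    """Label a protein by its fold change between two growth conditions.

    A change smaller than `threshold` in either direction is treated as
    constitutive expression; the two-fold default follows the abstract,
    while the "repressed" label is an assumption added for symmetry.
    """
    if abundance_test <= 0 or abundance_reference <= 0:
        raise ValueError("abundances must be positive")
    fold_change = abundance_test / abundance_reference
    if fold_change >= threshold:
        return "induced"
    if fold_change <= 1.0 / threshold:
        return "repressed"
    return "constitutive"


# Hypothetical abundances: a three-fold increase and a 1.5-fold increase.
label_three_fold = classify_expression(300.0, 100.0)
label_small_change = classify_expression(150.0, 100.0)
```

    With these made-up values, the three-fold increase is labeled as induced and the 1.5-fold increase as constitutive.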

    Technical phosphoproteomic and bioinformatic tools useful in cancer research

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights into their operation and connectivity and so facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichment and mass spectrometry (MS), coupled to bioinformatics tools, is crucial for the identification and quantification of protein phosphorylation sites and for advancing such clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to obtain good phospho-regulation data and good structural analysis in protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to support cancer research by detailing proteomic and bioinformatic tools.