1,923 research outputs found

    Prediction of Post-translational Modifications of Proteins from 2-DE/MS Data

    The living cell is a complex entity consisting of nucleic acids, proteins, and other biomolecules that form an interrelated and dynamic network. Unraveling this network is of great interest to scientists of different disciplines. The sequencing of the genome was a step towards understanding the fundamental elements of the cell, the genes. In humans, approximately 20,000 to 25,000 genes exist, which encode more than one million proteins. This complexity at the protein level is a result of alternative splicing and co- and post-translational modifications, which produce several protein species per transcript. Modifications are essential to the regulation of cellular processes and account for the activation or deactivation of enzymes and whole signaling pathways. The entirety of all proteins present in a cell at a fixed point in time and under particular biological conditions is called the proteome, and its analysis is proteomics. One particular area of interest in proteomics is the identification of proteins and their post-translational modifications. Peptide mass fingerprinting is an established method that has proved useful for identifying proteins by their amino acid sequence using mass spectrometry and protein sequence databases. This method relies on comparing experimental (measured) mass peaks to theoretical (calculated) masses, the latter being generated from a protein in a sequence database. As the mass of a modified protein differs from the mass of its unmodified counterpart, this mass distance must be considered when detecting protein modifications with peptide mass fingerprinting. In the work described here, a novel algorithm was developed and implemented that allows for the identification of protein modifications from data derived by peptide mass fingerprinting. The algorithm transforms the process of predicting protein modifications into an extended Money Changing Problem of finding suitable combinations of modifications that explain the observed peak mass distances. Unlike common computational approaches, the algorithm presented here is not restricted in the number of modifications to be considered. Furthermore, the algorithm is efficient because, for a given list of modifications, it calculates the combinations of modifications only once, independent of the number of queries. Although hardly any frequencies of protein modifications are available, which makes validation of the results very difficult, this novel approach is a promising step towards the unraveling of protein complexity.
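
The reduction to the Money Changing Problem can be illustrated with a small sketch. This is not the algorithm from the thesis; it is a textbook dynamic programme over integer-scaled masses. The modification list, the rounded mass shifts, the 0.01 Da resolution, and the names `build_table` and `decompose` are all assumptions made for illustration. The table is built once for a given list of modifications and can then answer any number of mass-distance queries.

```python
from typing import Dict, List, Optional

UNITS_PER_DALTON = 100  # assumed resolution of 0.01 Da

# Illustrative mass shifts in Da, rounded to 0.01 Da (assumed for this sketch).
MODIFICATIONS: Dict[str, float] = {
    "methyl": 14.02,
    "acetyl": 42.01,
    "phospho": 79.97,
}


def to_units(mass: float) -> int:
    """Scale a mass in Da to integer units so the problem becomes coin change."""
    return round(mass * UNITS_PER_DALTON)


def build_table(mods: Dict[str, float], max_distance: float) -> List[Optional[str]]:
    """For every integer mass up to max_distance, store one modification that
    ends a valid combination, or None if no combination reaches that mass."""
    units = {name: to_units(mass) for name, mass in mods.items()}
    size = to_units(max_distance) + 1
    last: List[Optional[str]] = [None] * size
    reachable = [False] * size
    reachable[0] = True  # the empty combination explains a distance of zero
    for total in range(1, size):
        for name, step in units.items():
            if step <= total and reachable[total - step]:
                reachable[total] = True
                last[total] = name
                break
    return last


def decompose(table: List[Optional[str]], mods: Dict[str, float],
              distance: float) -> Optional[List[str]]:
    """Return one combination of modifications whose masses sum to `distance`,
    or None if the precomputed table contains no such combination."""
    total = to_units(distance)
    if total < 0 or total >= len(table):
        return None
    combination: List[str] = []
    while total > 0:
        name = table[total]
        if name is None:
            return None
        combination.append(name)
        total -= to_units(mods[name])
    return combination


table = build_table(MODIFICATIONS, 200.0)           # computed once
print(decompose(table, MODIFICATIONS, 56.03))       # one methyl plus one acetyl
print(decompose(table, MODIFICATIONS, 10.0))        # no combination fits
```

The sketch matches distances exactly after rounding and returns a single combination per query. A practical tool would presumably need a mass tolerance and an enumeration of all combinations, neither of which is shown here.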

    Analytical model of peptide mass cluster centres with applications

    BACKGROUND: The elemental composition of peptides results in the formation of distinct, equidistantly spaced clusters across the mass range. The property of peptide mass clustering is used to calibrate peptide mass lists, to identify and remove non-peptide peaks, and for data reduction. RESULTS: We developed an analytical model of the peptide mass cluster centres. Inputs to the model included the amino acid frequencies in the sequence database, the average length of the proteins in the database, the cleavage specificity of the proteolytic enzyme used, and the cleavage probability. We examined the accuracy of our model by comparing it with the model based on an in silico sequence database digest. To identify the crucial parameters, we analysed how the cluster centre location depends on the inputs. The distance to the nearest cluster was used to calibrate mass spectrometric peptide peak-lists and to identify non-peptide peaks. CONCLUSION: The model introduced here enables us to predict the location of the peptide mass cluster centres. It explains how the location of the cluster centres depends on the input parameters. Fast and efficient calibration and filtering of non-peptide peaks is achieved by a distance measure suggested by Wool and Smilansky.
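
The idea of using the distance to the nearest cluster centre for filtering can be sketched as follows. This is a simplified illustration, not the analytical model of the paper and not necessarily the distance measure of Wool and Smilansky. It assumes that cluster centres lie at integer multiples of a fixed spacing; the default spacing of 1.000495 Da, the tolerance, and both function names are assumptions made for this sketch.

```python
from typing import List

# Assumed spacing between neighbouring cluster centres, in Da.
CLUSTER_SPACING = 1.000495


def distance_to_nearest_cluster(mass: float,
                                spacing: float = CLUSTER_SPACING) -> float:
    """Signed distance (Da) from `mass` to the nearest assumed cluster centre."""
    nearest_index = round(mass / spacing)
    return mass - nearest_index * spacing


def filter_peptide_like(masses: List[float], tolerance: float = 0.2,
                        spacing: float = CLUSTER_SPACING) -> List[float]:
    """Keep only peaks that lie within `tolerance` Da of a cluster centre."""
    return [m for m in masses
            if abs(distance_to_nearest_cluster(m, spacing)) <= tolerance]


peaks = [1000.495, 1000.995, 1500.74]
print(filter_peptide_like(peaks))
```

Peaks that fall far from every assumed centre are treated as non-peptide candidates. The same signed distances could also serve as residuals in a calibration fit, which is not shown here.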

    NBPMF: Novel Network-Based Inference Methods for Peptide Mass Fingerprinting

    Proteins are large, complex molecules that perform a vast array of functions in every living cell. A proteome is the set of proteins produced in an organism, and proteomics is the large-scale study of proteomes. Several high-throughput technologies have been developed in proteomics, of which the most commonly applied are mass spectrometry (MS) based approaches. MS is an analytical technique for determining the composition of a sample. Recently it has become a primary tool for protein identification, quantification, and post-translational modification (PTM) characterization in proteomics research. There are usually two different ways to identify proteins: top-down and bottom-up. Top-down approaches are based on subjecting intact protein ions and large fragment ions to tandem MS directly, while bottom-up methods are based on mass spectrometric analysis of peptides derived from proteolytic digestion, usually with trypsin. Among bottom-up techniques, peptide mass fingerprinting (PMF) is widely used to identify proteins from MS datasets. Conventional PMF methods, such as the probabilistic MOWSE algorithm, are based on the mass distribution of tryptic peptides. In this thesis, we developed novel network-based inference software termed NBPMF. By analyzing the peptide-protein bipartite network, we designed new peptide-protein matching score functions. We present two methods: the static one, ProbS, is based on an independent probability framework; and the dynamic one, HeatS, depicts the input dataset as dependent peptides. Moreover, we use linear regression to adjust the matching score according to the masses of proteins. In addition, we consider the order of retention time to further correct the score function. In the post-processing, we design two algorithms: assignment of peaks, and protein filtration. The former requires that a peak be assigned to only one peptide in order to reduce random matches, and the latter assumes each peak can only be assigned to one protein. In the result validation, we propose two new target-decoy search strategies to estimate the false discovery rate (FDR). The experiments on simulated, authentic, and simulated authentic datasets demonstrate that our NBPMF approaches lead to significantly improved performance compared to several state-of-the-art methods.
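
For readers unfamiliar with target-decoy validation, a minimal sketch of the conventional FDR estimate is given below. It is the common baseline (decoy hits divided by target hits above a score threshold), not either of the two new strategies proposed in the thesis. The function name and the example scores are hypothetical.

```python
from typing import Sequence


def estimate_fdr(target_scores: Sequence[float],
                 decoy_scores: Sequence[float],
                 threshold: float) -> float:
    """Conventional target-decoy estimate: accepted decoy hits divided by
    accepted target hits, capped at 1.0. Returns 0.0 if no target passes."""
    targets = sum(1 for score in target_scores if score >= threshold)
    decoys = sum(1 for score in decoy_scores if score >= threshold)
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


# Hypothetical matching scores from a target and a decoy database search.
target = [9.0, 8.0, 7.0, 3.0]
decoy = [8.0, 2.0, 1.0, 1.0]
print(estimate_fdr(target, decoy, threshold=5.0))
```

Raising the threshold usually lowers the estimated FDR at the cost of fewer accepted identifications, so the threshold is typically chosen to meet a target FDR.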

    In-depth analysis of the chicken egg white proteome using an LTQ Orbitrap Velos

    Background: Hen's egg white has been the subject of intensive chemical, biochemical and food technological research for many decades, because of its importance in human nutrition, its importance as a source of easily accessible model proteins, and its potential use in biotechnological processes. Recently the arsenal of tools used to study the protein components of egg white has been complemented by mass spectrometry-based proteomic technologies. Application of these fast and sensitive methods has already enabled the identification of a large number of new egg white proteins. Recent technological advances may be expected to further expand the egg white protein inventory. Results: Using a dual pressure linear ion trap Orbitrap instrument, the LTQ Orbitrap Velos, in conjunction with data analysis in the MaxQuant software package, we identified 158 proteins in chicken egg white with two or more sequence unique peptides. This group of proteins identified with very high confidence included 79 proteins identified in egg white for the first time. In addition, 44 proteins were identified tentatively. Conclusions: Our results, apart from identifying many new egg white components, indicate that current mass spectrometry technology is sufficiently advanced to permit direct identification of minor components of proteomes dominated by a few major proteins without resorting to indirect techniques, such as chromatographic depletion or peptide library binding, which change the composition of the proteome.

    The emerging landscape of single-molecule protein sequencing technologies

    Single-cell profiling methods have had a profound impact on the understanding of cellular heterogeneity. While genomes and transcriptomes can be explored at the single-cell level, single-cell profiling of proteomes is not yet established. Here we describe new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell profiling. These technologies will in turn facilitate biological discovery and open new avenues for ultrasensitive disease diagnostics. This Perspective describes new single-molecule protein sequencing and identification technologies alongside innovations in mass spectrometry that will eventually enable broad sequence coverage in single-cell proteomics.

    Cooperative Metaheuristics for Exploring Proteomic Data

    Most combinatorial optimization problems cannot be solved exactly. A class of methods, called metaheuristics, has proved efficient at giving good approximate solutions in a reasonable time. Cooperative metaheuristics are a subset of metaheuristics that involve a parallel exploration of the search space by several entities with information exchange between them. The importance of information exchange in the optimization process is related to the building block hypothesis of evolutionary algorithms, which is based on two questions: what is the pertinent information of a given potential solution, and how can this information be shared? A classification of cooperative metaheuristic methods depending on the nature of the cooperation involved is presented, and the specific properties of each class, as well as a way to combine them, are discussed. Several improvements in the field of metaheuristics are also given. In particular, a method to regulate the use of classical genetic operators and to define new, more pertinent ones is proposed, taking advantage of a building block structured representation of the explored space. A hierarchical approach resting on multiple levels of cooperative metaheuristics is finally presented, leading to the definition of a complete concerted cooperation strategy. Some applications of these concepts to difficult proteomics problems, including automatic protein identification, biological motif inference and multiple sequence alignment, are presented. For each application, an innovative method based on the cooperation concept is given and compared with classical approaches. In the protein identification problem, a first level of cooperation using swarm intelligence is applied to the comparison of mass spectrometric data with a biological sequence database, followed by a genetic programming method to discover an optimal scoring function. The multiple sequence alignment problem is decomposed into three steps involving several evolutionary processes to infer different kinds of biological motifs and a concerted cooperation strategy to build the sequence alignment according to their motif content.

    Power and limitations of electrophoretic separations in proteomics strategies

    Proteomics can be defined as the large-scale analysis of proteins. Due to the complexity of biological systems, various separation techniques must be combined prior to mass spectrometry. These techniques, dealing with proteins or peptides, can rely on chromatography or electrophoresis. In this review, the electrophoretic techniques are under scrutiny. Their principles are recalled, and their applications for peptide and protein separations are presented and critically discussed. In addition, the features that are specific to gel electrophoresis and that interplay with mass spectrometry (i.e., protein detection after electrophoresis, and the process leading from a gel piece to a solution of peptides) are also discussed.

    Computational methods and tools for protein phosphorylation analysis

    Signaling pathways represent a central regulatory mechanism of biological systems where a key event in their correct functioning is the reversible phosphorylation of proteins. Protein phosphorylation affects at least one-third of all proteins and is the most widely studied post-translational modification. Phosphorylation analysis is still perceived, in general, as difficult or cumbersome and not readily attempted by many, despite the high value of such information. Specifically, determining the exact location of a phosphorylation site is currently considered a major hurdle, so reliable approaches are necessary for the detection and localization of protein phosphorylation. The goal of this PhD thesis was to develop computational methods and tools for mass spectrometry-based protein phosphorylation analysis, particularly validation of phosphorylation sites. In the first two studies, we developed methods for improved identification of phosphorylation sites in MALDI-MS. In the first study this was achieved through the automatic combination of spectra from multiple matrices, while in the second study an optimized protocol for sample loading and washing conditions was suggested. In the third study, we proposed and evaluated the hypothesis that in ESI-MS, tandem CID and HCD spectra of phosphopeptides can be accurately predicted and used in spectral library searching. This novel strategy for phosphosite validation and identification offered accuracy that outperformed other currently popular methods and proved applicable to complex biological samples. Finally, we significantly improved the performance of our command-line prototype tool and added a graphical user interface and options for customizable simulation parameters and filtering of selected spectra, peptides or proteins. The new software, SimPhospho, is open-source and can be easily integrated into a phosphoproteomics data analysis workflow. Together, these bioinformatics methods and tools enable confident phosphosite assignment and improve reliable phosphoproteome identification and reporting.

    Proteomic Analysis of Goat Milk

    The advancement of electrophoresis and chromatography, along with technological developments in mass spectrometry, has widened the potential application of proteomics to study milk from smaller ruminants. The aim of this chapter is to provide an in-depth overview of the development and progress of proteomics applications in goat milk. After examining various proteomic approaches that are currently applied to this field, we narrow our focus to proteomic investigations of mastitis in goat milk. A summary of protein modulation in goat milk during experimentally-induced endotoxin mastitis is discussed. Because the molecular function of proteins is disrupted during disease due to changes in post-translational modifications, we also review the phosphorylation of caseins, which are the predominant phosphoproteins in milk, and discuss the implications of casein modifications during mastitis. These results offer new insights into the changes of protein expression in goat milk during infection.

    Mass spectrometry-based proteomics in the life sciences: a review

    Proteomics concerns itself with the characterization and function of all cellular proteins, the ultimate determinants of cellular function. Mass spectrometry has emerged as the preferred method for in-depth characterization of the protein components of biological systems. Using mass spectrometry, key insights into the composition, regulation and function of molecular complexes and pathways have been gained. Nowadays, mass spectrometry-based proteomics has become an indispensable tool in the cellular and molecular life sciences. This review discusses current mass spectrometry-based proteomics technologies.