5,914 research outputs found

    De novo sequencing of multiple tandem mass spectra of peptide containing SILAC labeling

    The systematic study of proteins has gradually become fundamental to research in molecular biology. Shotgun proteomics uses bottom-up techniques to identify proteins in complex mixtures by coupling high-performance liquid chromatography with mass spectrometry. Current mass spectrometers, with their high sensitivity and accuracy, can produce thousands of tandem mass spectrometry (MS/MS) spectra in a single run. The large amount of data collected in a single LC-MS/MS run requires effective computational approaches to automate spectrum interpretation. De novo peptide sequencing from MS/MS spectra has emerged as an important technology for peptide sequencing in proteomics. However, the low identification rate of the acquired mass spectra limits the efficiency of computational approaches. To increase the accuracy and practicality of de novo sequencing, some previous algorithms used multiple spectra to identify the peptide sequence. In this thesis, we focus on de novo sequencing of multiple SILAC-labeled tandem mass spectra. Compared with previous approaches, our research develops de novo sequencing algorithms based on a different idea of how to use multiple spectra. SILAC technology uses medium containing different kinds of isotope-labeled essential amino acids, usually arginine (R) and lysine (K), to label newly synthesized proteins with stable isotopes during cell growth. Multiple MS/MS spectra for the same peptide sequence are produced by the spectrometer after the SILAC samples are processed by LC-MS/MS shotgun proteomics. Based on factors such as the type of isotope labeling, retention time, and precursor ion mass, multiple spectra with different types of SILAC modifications for the same peptide in the sample can be used to identify the peptide sequence.
    In this study, we aim not only to identify the peptide sequence with specific SILAC modifications but also to pinpoint the locations of those modifications from multiple SILAC-labeled MS/MS spectra. We propose two de novo sequencing algorithms to compute the peptide sequence: one based on the total number of SILAC modifications, and one based on the combinations of SILAC modifications of arginine (R) and lysine (K). Two dynamic programming algorithms identify the peptide sequence and locate its SILAC modifications; the potential candidates are computed with similarity scores, and refinement algorithms are then applied. Finally, a confidence score is designed to measure all of the candidate sequences. To verify the performance of our algorithm, we compare the experimental results. We also compare the output candidates between our approach and PEAKS de novo
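    The idea of grouping spectra by their combination of labeled K and R residues can be illustrated with a small sketch. It is not the thesis's algorithm. It assumes the commonly used heavy labels Lys-8 (13C6 15N2, about +8.0142 Da) and Arg-10 (13C6 15N4, about +10.0083 Da), which the abstract does not specify, and the function name, the `max_sites` limit and the tolerance are hypothetical.

```python
# Assumed heavy-label mass shifts in Da; the abstract does not name the labels.
LYS8_SHIFT = 8.014199    # 13C6 15N2 lysine
ARG10_SHIFT = 10.008269  # 13C6 15N4 arginine


def label_combinations(light_mass, heavy_mass, max_sites=3, tol=0.01):
    """List (n_lys, n_arg) label counts that explain a precursor mass difference.

    light_mass and heavy_mass are neutral precursor masses of two spectra
    suspected to come from the same peptide with different SILAC labeling.
    """
    delta = heavy_mass - light_mass
    hits = []
    for n_lys in range(max_sites + 1):
        for n_arg in range(max_sites + 1 - n_lys):
            if n_lys + n_arg == 0:
                continue  # no labeled residue: nothing to explain
            expected = n_lys * LYS8_SHIFT + n_arg * ARG10_SHIFT
            if abs(delta - expected) <= tol:
                hits.append((n_lys, n_arg))
    return hits


if __name__ == "__main__":
    # A pair separated by one heavy lysine and one heavy arginine.
    print(label_combinations(1500.0, 1500.0 + LYS8_SHIFT + ARG10_SHIFT))
```

    A real pairing step would also use retention time and charge state, as the abstract notes; the sketch only covers the precursor-mass criterion.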

    HybGFS: a hybrid method for genome-fingerprint scanning

    BACKGROUND: Protein identification based on mass spectrometry (MS) has previously been performed using peptide mass fingerprinting (PMF) or tandem MS (MS/MS) database searching. However, these methods cannot identify proteins that are not already listed in existing databases. Moreover, the alternative approach of de novo sequencing requires costly equipment and the interpretation of complex MS/MS spectra. Thus, there is a need for novel high-throughput protein-identification methods that are independent of existing predefined protein databases. RESULTS: Here, we present a hybrid method for genome-fingerprint scanning, known as HybGFS. This technique combines genome sequence-based peptide MS/MS ion searching with liquid-chromatography elution-time (LC-ET) prediction, to improve the reliability of identification. The hybrid method allows the simultaneous identification and mapping of proteins without a priori information about their coding sequences. The current study used standard LC-MS/MS data to query an in silico-generated six-reading-frame translation and the enzymatic digest of an entire genome. Used in conjunction with precursor/product ion-mass searching, the LC-ETs increased confidence in the peptide-identification process and reduced the number of false-positive matches. The power of this method was demonstrated using recombinant proteins from the Escherichia coli K12 strain. CONCLUSION: The novel hybrid method described in this study will be useful for the large-scale experimental confirmation of genome coding sequences, without the need for transcriptome-level expression analysis or costly MS database searching
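    The in silico six-reading-frame translation and enzymatic digest mentioned above can be sketched as follows. This is a minimal illustration, not the HybGFS implementation: it assumes the standard genetic code, a simple trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages), and hypothetical function names.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return the six translations keyed by strand and frame, e.g. '+1' or '-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def tryptic_peptides(protein, min_length=1):
    """Split at stop codons, then cleave after K/R unless followed by P."""
    peptides = []
    for segment in protein.split("*"):
        start = 0
        for i, aa in enumerate(segment):
            next_aa = segment[i + 1] if i + 1 < len(segment) else ""
            if aa in "KR" and next_aa != "P":
                peptides.append(segment[start:i + 1])
                start = i + 1
        if start < len(segment):
            peptides.append(segment[start:])
    return [p for p in peptides if len(p) >= min_length]


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGAAACGTCCCTAA").items():
        print(name, protein, tryptic_peptides(protein))
```

    The resulting peptides would then be matched against precursor and product ion masses; the elution-time prediction step is not covered here.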

    De novo sequencing of MS/MS spectra

    Proteomics is the study of proteins, their time- and location-dependent expression profiles, as well as their modifications and interactions. Mass spectrometry is useful to investigate many of the questions asked in proteomics. Database search methods are typically employed to identify proteins from complex mixtures. However, databases are not often available or, despite their availability, some sequences are not readily found therein. To overcome this problem, de novo sequencing can be used to directly assign a peptide sequence to a tandem mass spectrometry spectrum. Many algorithms have been proposed for de novo sequencing and a selection of them is detailed in this article. Although a standard accuracy measure has not been agreed upon in the field, relative algorithm performance is discussed. The current state of de novo sequencing is assessed thereafter and, finally, examples are used to construct possible future perspectives of the field. © 2011 Expert Reviews Ltd. The Turkish Academy of Science (TÜBA

    Current challenges in software solutions for mass spectrometry-based quantitative proteomics

    This work was in part supported by the PRIME-XS project, grant agreement number 262067, funded by the European Union seventh Framework Programme; The Netherlands Proteomics Centre, embedded in The Netherlands Genomics Initiative; The Netherlands Bioinformatics Centre; and the Centre for Biomedical Genetics (to S.C., B.B. and A.J.R.H); by NIH grants NCRR RR001614 and RR019934 (to the UCSF Mass Spectrometry Facility, director: A.L. Burlingame, P.B.); and by grants from the MRC, CR-UK, BBSRC and Barts and the London Charity (to P.C.

    De novo sequencing of proteins by mass spectrometry

    INTRODUCTION: Proteins are crucial for every cellular activity, and unraveling their sequence and structure is a crucial step to fully understand their biology. Early methods of protein sequencing were mainly based on the use of enzymatic or chemical degradation of peptide chains. With the completion of the human genome project and with the expansion of the information available for each protein, various databases containing this sequence information were formed. AREAS COVERED: De novo protein sequencing, shotgun proteomics and other mass-spectrometric techniques, along with the various software currently available for proteogenomic analysis. Emphasis is placed on the methods for de novo sequencing, together with the potential and shortcomings of using databases for interpretation of protein sequence data. EXPERT OPINION: As mass-spectrometry sequencing performance is improving with better software and hardware optimizations, combined with user-friendly interfaces, de novo protein sequencing becomes imperative in shotgun proteomic studies. Issues regarding unknown or mutated peptide sequences, as well as unexpected post-translational modifications (PTMs) and their identification through false discovery rate searches using the target/decoy strategy, need to be addressed. Ideally, it should become integrated in standard proteomic workflows as an add-on to conventional database search engines, which then would be able to provide improved identification.

    A Dynamic Programming Approach to De Novo Peptide Sequencing via Tandem Mass Spectrometry

    Tandem mass spectrometry fragments a large number of molecules of the same peptide sequence into charged prefix and suffix subsequences, and then measures the mass/charge ratios of these ions. The de novo peptide sequencing problem is to reconstruct the peptide sequence from given tandem mass spectral data of k ions. By implicitly transforming the spectral data into an NC-spectrum graph G=(V,E) where |V|=2k+2, we can solve this problem in O(|V|+|E|) time and O(|V|) space using dynamic programming. Our approach can be further used to discover a modified amino acid in O(|V||E|) time and to analyze data with other types of noise in O(|V||E|) time. Our algorithms have been implemented and tested on actual experimental data. Comment: A preliminary version appeared in Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 389--398, 200
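    The graph-plus-dynamic-programming idea can be illustrated with a deliberately simplified sketch. It is not the NC-spectrum graph algorithm of the paper: it assumes the input is already a list of prefix (N-terminal) residue-mass sums, ignores complementary suffix ions, and uses a quadratic scan instead of the linear-time method. Nodes are masses, an edge joins two masses whose gap equals one amino acid residue mass, and the DP keeps the path from 0 to the total mass with the most edges. The function name and tolerance are hypothetical; I and L share a mass, so only L is reported.

```python
# Monoisotopic residue masses in Da (I omitted because it equals L).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def sequence_from_prefix_masses(prefix_masses, total_mass, tol=0.01):
    """Return a peptide explaining the prefix masses, or None if no path exists."""
    inner = {m for m in prefix_masses if 0.0 < m < total_mass}
    nodes = [0.0] + sorted(inner) + [total_mass]
    n = len(nodes)
    unreachable = float("-inf")
    best = [unreachable] * n  # best[j]: most edges on any path from node 0 to j
    back = [None] * n         # back[j]: (previous node index, residue letter)
    best[0] = 0
    for j in range(1, n):
        for i in range(j):
            if best[i] == unreachable:
                continue
            gap = nodes[j] - nodes[i]
            for aa, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol and best[i] + 1 > best[j]:
                    best[j] = best[i] + 1
                    back[j] = (i, aa)
    if back[-1] is None:
        return None
    letters = []
    j = n - 1
    while j != 0:  # walk the back-pointers from the total mass to 0
        i, aa = back[j]
        letters.append(aa)
        j = i
    return "".join(reversed(letters))


if __name__ == "__main__":
    ladder, running = [], 0.0
    for aa in "VTSK":
        running += RESIDUE_MASS[aa]
        ladder.append(running)
    # The spurious peak at 150.0 is skipped because no edge reaches it.
    print(sequence_from_prefix_masses(ladder[:-1] + [150.0], ladder[-1]))
```

    Handling real spectra requires the complementary-ion constraint that the NC-spectrum graph encodes, which this sketch omits.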

    Analysis of the proteinaceous components of the organic matrix of calcitic sclerites from the soft coral Sinularia sp.

    An organic matrix consisting of a protein-polysaccharide complex is generally accepted as an important medium for the calcification process. While the role this "calcified organic matrix" plays in the calcification process has long been appreciated, the complex mixture of proteins that is induced and assembled during the mineral phase of calcification remains uncharacterized in many organisms. Thus, we investigated organic matrices from the calcitic sclerites of a soft coral, Sinularia sp., and used a proteomic approach to identify the functional matrix proteins that might be involved in the biocalcification process. We purified eight organic matrix proteins and performed in-gel digestion using trypsin. The tryptic peptides were separated by nano-liquid chromatography (nano-LC) and analyzed by tandem mass spectrometry (MS/MS) using a matrix-assisted laser desorption/ionization (MALDI) - time-of-flight-time-of-flight (TOF-TOF) mass spectrometer. Periodic acid Schiff staining of an SDS-PAGE gel indicated that four proteins were glycosylated. We identified several proteins, including a form of actin, from which we identified a total of 183 potential peptides. Our findings suggest that many of those peptides may contribute to biocalcification in soft corals

    MOD(i) : a powerful and convenient web server for identifying multiple post-translational peptide modifications from tandem mass spectra

    MOD(i) is a powerful and convenient web service that facilitates the interpretation of tandem mass spectra for identifying post-translational modifications (PTMs) in a peptide. It is powerful in that it can interpret a tandem mass spectrum even when hundreds of modification types are considered and the number of potential PTMs in a peptide is large, in contrast to most of the methods currently available for spectra interpretation, which limit the number of PTM sites and types used for PTM analysis. For example, using MOD(i), one can consider for analysis both the entire PTM list published on the unimod webpage and user-defined PTMs simultaneously, and one can also identify multiple PTM sites in a spectrum. MOD(i) is convenient in that it can take various input file formats such as .mzXML, .dta, .pkl and .mgf files, and it is equipped with a graphical tool called MassPective, developed to display MOD(i)'s output in a user-friendly manner and to help users understand that output quickly. In addition, one can perform manual de novo sequencing using MassPective