Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification
Comparison of novel peptides identified from real proteogenomic databases. (DOCX 68 kb)
High-throughput peptide quantification using mTRAQ reagent triplex
Background: Protein quantification is an essential step in many proteomics experiments. A number of labeling approaches have been proposed and adopted in mass spectrometry (MS) based relative quantification. mTRAQ, a stable isotope labeling method, is amine-specific and available in triplex format, so sample throughput can be doubled compared with duplex reagents. Methods and results: Here we propose a novel data analysis algorithm for peptide quantification in triplex mTRAQ experiments. It improves the accuracy of quantification in two ways. First, it identifies and separates the triplex isotopic clusters of a peptide in each full MS scan. We designed a schematic model of triplex overlapping isotopic clusters and separated the clusters by solving cubic equations deduced from that model. Second, it automatically determines the elution areas of peptides. Some peptides have similar atomic masses and elution times, so their elution areas can overlap. Our algorithm successfully identified the overlaps and found accurate elution areas. We validated our algorithm using standard protein mixture experiments. Conclusions: We showed that our algorithm was able to accurately quantify peptides in triplex mTRAQ experiments. Its software implementation is compatible with the Trans-Proteomic Pipeline (TPP), and thus enables high-throughput analysis of proteomics data.
MOD(i) : a powerful and convenient web server for identifying multiple post-translational peptide modifications from tandem mass spectra
MOD(i) is a powerful and convenient web service that facilitates the interpretation of tandem mass spectra for identifying post-translational modifications (PTMs) in a peptide. It is powerful in that it can interpret a tandem mass spectrum even when hundreds of modification types are considered and the number of potential PTMs in a peptide is large, in contrast to most currently available spectrum interpretation methods, which limit the number of PTM sites and types used for PTM analysis. For example, using MOD(i), one can consider both the entire PTM list published on the Unimod webpage and user-defined PTMs simultaneously, and one can also identify multiple PTM sites in a spectrum. MOD(i) is convenient in that it accepts various input file formats such as .mzXML, .dta, .pkl and .mgf, and it is equipped with a graphical tool called MassPective, which displays MOD(i)'s output in a user-friendly manner and helps users understand it quickly. In addition, one can perform manual de novo sequencing using MassPective.
Reinvestigation of aminoacyl-tRNA synthetase core complex by affinity purification-mass spectrometry reveals TARSL2 as a potential member of the complex
PLoS ONE 8(12). DOI: 10.1371/journal.pone.0081734
An English to Korean Transliteration Model of Extended Markov Window
The automatic transliteration problem is to transcribe foreign words in one's own alphabet. Machine-generated transliteration can be useful in various applications such as indexing in an information retrieval system and pronunciation synthesis in a text-to-speech system. In this paper we present a model for statistical English-to-Korean transliteration that generates transliteration candidates with probabilities. The model is designed to utilize various information sources by extending a conventional Markov window. Also, an efficient and accurate method for alignment and syllabification of pronunciation units is described. The experimental results show a recall of 0.939 for trained words and 0.875 for untrained words when the best 10 candidates are considered. Introduction: As the amount of international communication increases, more foreign words are flooding into the Korean language. Especially in the area of computer and information science, it has been reported that 29.4% of index terms...
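The recall figures quoted in the abstract above depend on how many ranked candidates are considered. A minimal sketch of a top-k recall computation follows, assuming recall means the fraction of test words whose reference transliteration appears among the k best candidates; the abstract does not spell out the definition, and the function name `recall_at_k` and the toy word lists are illustrative only, not taken from the paper.

```python
from typing import Dict, List


def recall_at_k(candidates: Dict[str, List[str]], gold: Dict[str, str], k: int = 10) -> float:
    """Fraction of gold words whose reference appears in the top-k ranked candidates.

    Assumed definition for illustration; the paper may define recall differently.
    """
    if not gold:
        raise ValueError("gold must not be empty")
    hits = sum(
        1
        for word, reference in gold.items()
        if reference in candidates.get(word, [])[:k]
    )
    return hits / len(gold)


# Toy data (hypothetical, not results from the paper).
gold = {"computer": "컴퓨터", "data": "데이터", "file": "파일"}
candidates = {
    "computer": ["컴퓨터", "콤퓨터"],  # reference ranked first
    "data": ["다타", "데이타"],        # reference missing
    "file": ["필레", "파일"],          # reference ranked second
}

top1 = recall_at_k(candidates, gold, k=1)
top2 = recall_at_k(candidates, gold, k=2)
print(top1, top2)
```

Widening k can only keep recall the same or raise it, which is why a recall figure is only meaningful together with the number of candidates considered.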
Data-Dependent Scoring Parameter Optimization in MS-GF+ Using Spectrum Quality Filter
Most database search tools for proteomics
have their own scoring
parameter sets depending on experimental conditions such as fragmentation
methods, instruments, digestion enzymes, and so on. These scoring
parameter sets are usually predefined by tool developers and cannot
be modified by users. The number of different experimental conditions
grows as the technology develops, and the given set of scoring parameters
could be suboptimal for tandem mass spectrometry data acquired using
new sample preparation or fragmentation methods. Here we introduce
a new approach to optimize scoring parameters in a data-dependent
manner using a spectrum quality filter. The new approach conducts
a preliminary search for the spectra selected by the spectrum quality
filter. Search results from the preliminary search are used to generate
data-dependent scoring parameters; then, the full search over the
entire input spectra is conducted using the learned scoring parameters.
We show that the new approach yields more and better peptide-spectrum
matches than the conventional search using built-in scoring parameters
when compared at the same 1% false discovery rate.
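The two-stage workflow described in the abstract above (quality filter, preliminary search, parameter learning, full search) can be sketched as follows. This is a structural outline only: `two_stage_search` and all of its callable arguments are hypothetical placeholders, not part of MS-GF+ or any real tool, and the toy stand-ins below exist solely to make the control flow runnable.

```python
from typing import Callable, List, Sequence, TypeVar

Spectrum = TypeVar("Spectrum")
Params = TypeVar("Params")
Result = TypeVar("Result")


def two_stage_search(
    spectra: Sequence[Spectrum],
    passes_quality_filter: Callable[[Spectrum], bool],
    search: Callable[[Sequence[Spectrum], Params], List[Result]],
    learn_params: Callable[[List[Result]], Params],
    builtin_params: Params,
) -> List[Result]:
    """Hypothetical outline of a data-dependent two-stage database search."""
    # Stage 1: search only the spectra selected by the quality filter,
    # using the built-in scoring parameters.
    selected = [s for s in spectra if passes_quality_filter(s)]
    preliminary = search(selected, builtin_params)

    # Derive data-dependent scoring parameters from the preliminary results.
    learned = learn_params(preliminary)

    # Stage 2: search the entire input with the learned parameters.
    return search(spectra, learned)


# Toy stand-ins (assumptions for illustration): a spectrum is an integer
# "quality" value, and a search result records which parameters were used.
spectra = [2, 9, 8, 1]
results = two_stage_search(
    spectra,
    passes_quality_filter=lambda s: s >= 5,
    search=lambda batch, params: [(s, params) for s in batch],
    learn_params=lambda prelim: "learned-from-%d" % len(prelim),
    builtin_params="builtin",
)
print(results)
```

In this outline the full search still covers every input spectrum; the quality filter only decides which spectra contribute to learning the parameters.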