22 research outputs found

    Fig 7 -

    ROC (left) and Precision-Recall (right) curves for closed search. For SpeCollate, XCorr, and Hyperscore, both the ROC and the Precision-Recall curves show that SpeCollate performs best at all cutoff values in closed search (±0.5 Da precursor mass tolerance).
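
As a rough illustration of how such curves are traced, the sketch below sweeps a score cutoff over a list of scored matches and records one ROC point and one Precision-Recall point per cutoff. The function name, the input layout, and the assumption of distinct scores with at least one true and one false match are illustrative choices, not taken from the paper.

```python
def roc_pr_points(scores, labels):
    """Return (roc, pr) point lists obtained by lowering the score cutoff.

    scores: match scores, higher means more confident (assumed distinct).
    labels: 1 for a correct match, 0 for an incorrect one.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Visit matches from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    tp = fp = 0
    roc, pr = [], []
    for _score, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        # ROC point: (false positive rate, true positive rate)
        roc.append((fp / negatives, tp / positives))
        # PR point: (recall, precision)
        pr.append((tp / positives, tp / (tp + fp)))
    return roc, pr


roc, pr = roc_pr_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A tool whose curve lies above the others at every cutoff, as the caption reports for SpeCollate, has the higher true positive rate (or precision) at each operating point.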

    A generic proteomics flow.

    In-silico digestion of the protein database is performed to generate peptides. These peptides are then converted to theoretical spectra and compared against the experimental spectra.
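
The digestion step can be sketched as follows. The caption does not name the enzyme, so this example assumes trypsin-like cleavage (after K or R, but not before P); the function name and the `missed_cleavages` and `min_length` parameters are illustrative.

```python
import re


def digest(protein, missed_cleavages=0, min_length=1):
    """Split a protein sequence into peptides under an assumed tryptic rule."""
    # Zero-width split: after K or R, unless the next residue is P.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

    peptides = []
    for start in range(len(pieces)):
        # Join up to `missed_cleavages` additional neighbouring pieces.
        last = min(start + missed_cleavages, len(pieces) - 1)
        for end in range(start, last + 1):
            peptide = "".join(pieces[start:end + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


peptides = digest("AKRPGKCD", missed_cleavages=1)
```

Each resulting peptide would then be turned into a theoretical spectrum for comparison against the experimental spectra; that conversion is not shown here.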

    Space transition methods.

    De novo and database search methods try to transform one space into the other. This is prone to error and uncertainty, as a lot of information can be lost. In contrast, SpeCollate learns same-sized embeddings for both peptides and spectra by projecting them into a shared Euclidean space.
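
Once peptides and spectra live in the same Euclidean space, matching reduces to a nearest-neighbour lookup. The sketch below assumes the embeddings have already been computed by some model; the vectors, peptide names, and function name are placeholders, not SpeCollate output or API.

```python
import math


def nearest_peptides(spectrum_vec, peptide_vecs, top_k=1):
    """Rank peptides by Euclidean distance to a spectrum embedding.

    peptide_vecs: dict mapping peptide string -> embedding (same length
    as spectrum_vec, since both share one embedding space).
    """
    ranked = sorted(
        peptide_vecs.items(),
        key=lambda item: math.dist(spectrum_vec, item[1]),
    )
    return [peptide for peptide, _vec in ranked[:top_k]]


# Placeholder 2-D embeddings for illustration only.
candidates = {"PEPTIDEK": [0.0, 1.0], "ELVISK": [1.0, 0.0]}
best = nearest_peptides([0.9, 0.1], candidates)
```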

    Modifications and the character values used in the training data.

    Nine modifications are used for training the network, each encoded with its corresponding character value to construct a modified peptide string.
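
A minimal sketch of this kind of encoding is shown below. The actual nine modifications and their character values are listed in the table itself and are not reproduced in this listing, so the three-entry mapping here is hypothetical, as is the choice to place the character directly after the modified residue.

```python
# Hypothetical character values; the real ones are given in the paper's table.
MOD_CHARS = {"oxidation": "o", "phosphorylation": "p", "acetylation": "a"}


def encode_modified_peptide(sequence, modifications):
    """Build a modified peptide string.

    modifications: dict mapping 0-based residue index -> modification name.
    The modification character is appended after the residue it modifies
    (an assumed convention for this example).
    """
    encoded = []
    for index, residue in enumerate(sequence):
        encoded.append(residue)
        if index in modifications:
            encoded.append(MOD_CHARS[modifications[index]])
    return "".join(encoded)


modified = encode_modified_peptide("PEPMTIDE", {3: "oxidation"})
```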

    Peptides.

    List of peptides identified for different experiments. (XLSX)

    Comparison.

    Comparison of SpeCollate with default XCorr and MSFragger settings. (DOCX)

    Venn diagrams showing the overlap of peptides among SpeCollate, Crux, and MSFragger.

    As can be seen, most of the peptides are common among the three tools, while SpeCollate identifies the most unique peptides compared to Crux and MSFragger. See S4 File for the list of peptides identified in each experiment.
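
The regions of such a Venn diagram can be computed with plain set operations. In the sketch below the tool names come from the caption, but the peptide sets are made-up placeholders and the function name is illustrative.

```python
def shared_and_unique(results):
    """Return (peptides found by every tool, peptides unique to each tool).

    results: dict mapping tool name -> set of identified peptide strings.
    """
    shared = set.intersection(*results.values())
    unique = {}
    for tool, peptides in results.items():
        # Everything identified by any of the other tools.
        others = set().union(*(p for t, p in results.items() if t != tool))
        unique[tool] = peptides - others
    return shared, unique


# Placeholder identifications for illustration only.
results = {
    "SpeCollate": {"AAK", "CDR", "EFK", "GHR"},
    "Crux": {"AAK", "CDR"},
    "MSFragger": {"AAK", "EFK"},
}
shared, unique = shared_and_unique(results)
```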

    2D UMAP visualization of embedded spectra and peptide vectors generated by SpeCollate.

    a) Clustering of spectra with different precursor charges around their corresponding peptides, as well as the separation of clusters within a mass range, is shown for 40 unique peptides. The red point represents the peptide within a cluster, while the blue points represent the spectra (with different charge values) corresponding to that peptide. Sub-figures b) and c) show close-ups of two clusters with the spectra labeled by charge. As can be seen, spectra of all charges are mapped close to the peptide. However, spectra with higher precursor charge values are mapped relatively farther from the peptide than those with smaller values.

    Training parameters for SpeCollate.

    Training parameters for SpeCollate.

    Closed search (±5 ppm) comparison against Crux and MSFragger using real-world data from the PRIDE repository datasets PXD000612 (top), PXD009861 (middle), and PXD001468 (bottom).

    For PXD000612, the search is performed against a peptide database generated from the human proteome with zero, up to one, and up to two phosphorylation modifications (on amino acids S, T, and Y) for the left, middle, and right subplots, respectively. For dataset PXD009861, the search is performed against databases with different numbers of oxidized M residues (up to two per peptide), while for dataset PXD001468, the search is performed against a database with one N-terminal acetylation and up to two oxidation sites (at most two modifications per peptide). SpeCollate can outperform Crux and MSFragger in terms of PSMs while giving comparable performance in terms of the number of peptides.
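
For reference, a ±5 ppm closed search keeps only candidate peptides whose theoretical precursor mass lies within five parts per million of the observed mass. The sketch below shows that filter under the usual definition of ppm error; the function name and the example masses are illustrative.

```python
def within_ppm(observed_mass, theoretical_mass, tolerance_ppm=5.0):
    """True if the observed mass is within tolerance_ppm of the theoretical mass."""
    error_ppm = (observed_mass - theoretical_mass) / theoretical_mass * 1e6
    return abs(error_ppm) <= tolerance_ppm


# At 1000 Da, 5 ppm corresponds to a window of about ±0.005 Da.
accepted = within_ppm(1000.004, 1000.0)
rejected = within_ppm(1000.006, 1000.0)
```

Because the window scales with mass, a ppm tolerance is tighter in absolute terms for light peptides than a fixed tolerance in Da would be.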