19 research outputs found

    msmsEval: tandem mass spectral quality assignment for high-throughput proteomics

    BACKGROUND: In proteomics experiments, database-search programs are the method of choice for protein identification from tandem mass spectra. As amino acid sequence databases grow, however, the computing resources required by these programs have become prohibitive, particularly in searches for modified proteins. Recently, different research groups have proposed methods that limit the number of spectra to be searched based on spectral quality, but rankings of spectral quality have thus far relied on arbitrary cut-off values. In this work, we develop a more readily interpretable spectral quality statistic by providing probability values for the likelihood that spectra will be identifiable. RESULTS: We describe an application, msmsEval, that builds on previous work by statistically modeling the spectral quality discriminant function using a Gaussian mixture model. This allows a researcher to filter spectra based on the probability that a spectrum will ultimately be identified by database searching. We show that spectra that msmsEval predicts to be of high quality, yet that remain unidentified in standard database searches, are candidates for more intensive search strategies. Using a well-studied public dataset, we also show that a high proportion (83.9%) of the spectra predicted by msmsEval to be of high quality but that elude standard search strategies are in fact interpretable. CONCLUSION: msmsEval will be useful for high-throughput proteomics projects and is freely available for download. It supports Windows, Mac OS X and Linux/Unix operating systems.
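
The filtering idea in this abstract can be sketched as a two-component Gaussian mixture over a spectral quality score. This is a minimal illustration, not the msmsEval implementation: the function names, the mixture weight, and the component means and standard deviations are all hypothetical values chosen for the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_identifiable(score, w_good=0.3, mu_good=2.0, sd_good=1.0,
                      mu_bad=-1.0, sd_bad=1.0):
    """Posterior probability that a spectrum belongs to the 'identifiable'
    component, given its discriminant score.

    All parameter defaults are made-up illustration values, not fitted ones.
    """
    good = w_good * normal_pdf(score, mu_good, sd_good)
    bad = (1.0 - w_good) * normal_pdf(score, mu_bad, sd_bad)
    return good / (good + bad)


def filter_spectra(scores, threshold=0.5):
    """Return the indices of spectra whose posterior meets the threshold."""
    return [i for i, s in enumerate(scores) if prob_identifiable(s) >= threshold]


if __name__ == "__main__":
    scores = [-3.0, 0.0, 5.0]  # hypothetical discriminant scores
    for s in scores:
        print(s, round(prob_identifiable(s), 4))
    print(filter_spectra(scores))
```

In practice the mixture parameters would be estimated from the data (for example by expectation-maximization) rather than fixed by hand; a probability threshold then replaces an arbitrary cut-off on the raw score.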

    ETISEQ – an algorithm for automated elution time ion sequencing of concurrently fragmented peptides for mass spectrometry-based proteomics

    BACKGROUND: Concurrent peptide fragmentation (i.e. shotgun CID, parallel CID or MS^E) has emerged as an alternative to data-dependent acquisition for generating peptide fragmentation data in LC-MS/MS proteomics experiments. Concurrent peptide fragmentation data acquisition has been shown to be advantageous over data-dependent acquisition because it provides greater detection dynamic range and more accurate quantitative information. Nevertheless, it has yet to be widely adopted, owing to the lack of published algorithms designed specifically to process or interpret such data acquired on any mass spectrometer. RESULTS: An algorithm called Elution Time Ion Sequencing (ETISEQ) has been developed to enable automated conversion of concurrent peptide fragmentation data to LC-MS/MS data. ETISEQ generates MS/MS-like spectra based on the correlation of precursor and product ion elution profiles. The performance of ETISEQ is demonstrated using concurrent peptide fragmentation data from tryptic digests of standard proteins and whole influenza virus. The number of unique peptides identified from the digests is broadly comparable between ETISEQ-processed concurrent peptide fragmentation data and data-dependent acquired LC-MS/MS data. CONCLUSION: The ETISEQ algorithm has been designed for easy integration with existing MS/MS analysis platforms. It is anticipated that it will popularize concurrent peptide fragmentation data acquisition in proteomics laboratories.
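
The abstract says only that MS/MS-like spectra are built from the correlation of precursor and product ion elution profiles. One way to picture that step, assuming Pearson correlation and a fixed cut-off (both assumptions, as are the function names and the intensity values), is:

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no elution shape to compare
    return cov / math.sqrt(var_x * var_y)


def assign_products(precursor_profile, product_profiles, min_r=0.9):
    """Return the product-ion m/z values whose elution profile correlates
    with the precursor profile at or above min_r (hypothetical cut-off)."""
    return [mz for mz, profile in product_profiles.items()
            if pearson(precursor_profile, profile) >= min_r]


if __name__ == "__main__":
    # Hypothetical intensities sampled at the same retention-time points.
    precursor = [0, 1, 4, 9, 4, 1, 0]
    products = {
        175.1: [0, 2, 8, 18, 8, 2, 0],  # co-elutes with the precursor
        300.2: [9, 4, 1, 0, 0, 0, 0],   # elutes earlier
    }
    print(assign_products(precursor, products))
```

The product ions that pass the cut-off would form one MS/MS-like spectrum for that precursor, which can then be passed to an existing database-search tool.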

    FluShuffle and FluResort: new algorithms to identify reassorted strains of the influenza virus by mass spectrometry

    BACKGROUND: Influenza is one of the oldest and deadliest infectious diseases known to man. Reassorted strains of the virus pose the greatest risk to both human and animal health and have been associated with all pandemics of the past century, with the possible exception of the 1918 pandemic, resulting in tens of millions of deaths. We have developed and tested new computer algorithms, FluShuffle and FluResort, which enable reassorted viruses to be identified by the most rapid and direct means possible. These algorithms enable reassorted influenza and other viruses to be rapidly identified, so that prevention strategies and treatments can be implemented more efficiently. RESULTS: The FluShuffle and FluResort algorithms were tested with both experimental and simulated mass spectra of whole virus digests. FluShuffle considers different combinations of viral protein identities that match the mass spectral data using a Gibbs sampling algorithm employing a mixed protein Markov chain Monte Carlo (MCMC) method. FluResort utilizes those identities to calculate the weighted distance of each across two or more different phylogenetic trees constructed through viral protein sequence alignments. Each weighted mean distance value is normalized by conversion to a Z-score to identify a reassorted strain. CONCLUSIONS: The new FluShuffle and FluResort algorithms can correctly identify the origins of influenza viral proteins and the number of reassortment events required to produce the strains from the high resolution mass spectral data of whole virus proteolytic digestions. This has been demonstrated in the case of constructed vaccine strains as well as common human seasonal strains of the virus. The algorithms significantly improve the capability of the proteotyping approach to identify reassorted viruses that pose the greatest pandemic risk.
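
The Z-score normalization mentioned in the results can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the abstract does not say which set of distances serves as the reference distribution, so here the scores are computed against the sample itself, and the distance values are invented.

```python
import statistics


def z_scores(distances):
    """Convert weighted mean distances to Z-scores using the sample mean
    and sample standard deviation (needs at least two values)."""
    mean = statistics.mean(distances)
    sd = statistics.stdev(distances)
    if sd == 0:
        return [0.0] * len(distances)  # identical distances: nothing stands out
    return [(d - mean) / sd for d in distances]


if __name__ == "__main__":
    # Hypothetical weighted mean distances, one per candidate strain.
    distances = [0.12, 0.10, 0.11, 0.45]
    for d, z in zip(distances, z_scores(distances)):
        print(d, round(z, 2))
```

A candidate whose distance sits far from the rest receives a Z-score of large magnitude, which puts distances from different trees on a common scale.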

    Disruption of a GATA2-TAL1-ERG regulatory circuit promotes erythroid transition in healthy and leukemic stem cells.

    Changes in gene regulation and expression govern orderly transitions from hematopoietic stem cells to terminally differentiated blood cell types. These transitions are disrupted during leukemic transformation, but knowledge of the gene regulatory changes underpinning this process is elusive. We hypothesized that identifying core gene regulatory networks in healthy hematopoietic and leukemic cells could provide insights into network alterations that perturb cell state transitions. A heptad of transcription factors (LYL1, TAL1, LMO2, FLI1, ERG, GATA2, and RUNX1) bind key hematopoietic genes in human CD34+ hematopoietic stem and progenitor cells (HSPCs) and have prognostic significance in acute myeloid leukemia (AML). These factors also form a densely interconnected circuit by binding combinatorially at their own, and each other's, regulatory elements. However, their mutual regulation during normal hematopoiesis and in AML cells, and how perturbation of their expression levels influences cell fate decisions, remain unclear. In this study, we integrated bulk and single-cell data and found that the fully connected heptad circuit identified in healthy HSPCs persists, with only minor alterations, in AML, and that chromatin accessibility at key heptad regulatory elements was predictive of cell identity in both healthy progenitors and leukemic cells. The heptad factors GATA2, TAL1, and ERG formed an integrated subcircuit that regulates stem cell-to-erythroid transition in both healthy and leukemic cells. Components of this triad could be manipulated to facilitate erythroid transition, providing a proof of concept that such regulatory circuits can be harnessed to promote specific cell-type transitions and overcome dysregulated hematopoiesis. Wellcome Investigator award (206328/Z/17/Z

    FluTyper-an algorithm for automated typing and subtyping of the influenza virus from high resolution mass spectral data

    BACKGROUND: High resolution mass spectrometry has been employed to rapidly and accurately type and subtype influenza viruses. The detection of signature peptides with unique theoretical masses enables the unequivocal assignment of the type and subtype of a given strain. This analysis has, to date, required the manual inspection of mass spectra of whole virus and antigen digests. RESULTS: A computer algorithm, FluTyper, has been designed and implemented to achieve the automated analysis of MALDI mass spectra recorded for proteolytic digests of the whole influenza virus and antigens. FluTyper incorporates the use of established signature peptides and newly developed naïve Bayes classifiers for four common influenza antigens (hemagglutinin, neuraminidase, nucleoprotein, and matrix protein 1) to type and subtype the influenza virus based on their detection within proteolytic peptide mass maps. Theoretical and experimental testing of the classifiers demonstrates their applicability at protein coverage rates normally achievable in mass mapping experiments. The application of FluTyper to whole virus and antigen digests of a range of different strains of the influenza virus is demonstrated. CONCLUSIONS: The FluTyper algorithm facilitates the rapid and automated typing and subtyping of the influenza virus from mass spectral data. The newly developed naïve Bayes classifiers increase the confidence of influenza virus subtyping, especially where signature peptides are not detected. FluTyper is expected to popularize the use of mass spectrometry to characterize influenza viruses.
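
To show how a naïve Bayes classifier can subtype a strain from detected peptides, the sketch below treats each peptide as an independent present/absent feature. It is not FluTyper's classifier: the peptide labels, detection probabilities, priors, and function names are hypothetical, chosen only to make the arithmetic visible.

```python
import math

# Hypothetical P(peptide detected | subtype); values must lie strictly in (0, 1).
LIKELIHOODS = {
    "H1": {"pepA": 0.8, "pepB": 0.1},
    "H3": {"pepA": 0.2, "pepB": 0.7},
}
# Hypothetical prior probability of each subtype.
PRIORS = {"H1": 0.5, "H3": 0.5}


def log_posteriors(observed, likelihoods=LIKELIHOODS, priors=PRIORS):
    """Unnormalized log posterior of each subtype given the set of detected
    peptides, assuming peptides are detected independently."""
    scores = {}
    for subtype, probs in likelihoods.items():
        total = math.log(priors[subtype])
        for peptide, p in probs.items():
            # Detected peptides contribute p, undetected ones contribute 1 - p.
            total += math.log(p if peptide in observed else 1.0 - p)
        scores[subtype] = total
    return scores


def classify(observed, likelihoods=LIKELIHOODS, priors=PRIORS):
    """Return the subtype with the highest posterior."""
    scores = log_posteriors(observed, likelihoods, priors)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(classify({"pepA"}))
    print(classify({"pepB"}))
```

Because undetected peptides also contribute evidence, a classifier of this kind can still favor one subtype when no single signature peptide is observed.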