16 research outputs found

    Lessons learned: Linking patient-reported outcomes data with administrative databases

    Introduction: Since 2007, Cancer Care Ontario (CCO) has systematically collected patient-reported outcomes (PROs) in the form of symptom data for cancer outpatients visiting regional cancer centres or affiliate institutions. Data are used in real time to facilitate conversation between clinicians and patients and have recently been combined with provincial administrative databases.
    Objectives and Approach: CCO collects PROs using the Edmonton Symptom Assessment System (ESAS), which scores 9 symptoms on a scale of 0 (no symptoms) to 10 (worst symptom severity). Data were imported from CCO in 2015 and linked to a cancer cohort at ICES. We investigated differences between patients who completed ≥1 ESAS record and patients who did not, as well as the number of records, timing of data collection and missingness. We describe our experience linking the PRO data to administrative data and using the linked data, including presenting trajectories of symptoms over time and combining scores into composite indices.
    Results: 120,745 cancer patients had 729,861 symptom records between 2007 and 2014. Not all patients with a cancer diagnosis had ≥1 ESAS record, and this varied by patient, disease and system level factors. Because implementation occurred from a clinical perspective, data collection was irregular within and across patients and depended on treatment and other factors; the number of records per patient varied, as well as the number of contributing patients in each time period following diagnosis. Attempts were made to create meaningful composite indices by combining all symptom scores as well as combining multiple high scores for each individual symptom. As a result, selecting the best statistical analysis to use these PRO data as an exposure or outcome is still uncertain.
    Conclusion/Implications: PRO data linked to provincial administrative data holdings represent a new frontier for population-based cancer research, both in their challenging structure and in their implications for clinical practice and the health system. These lessons learned will hopefully support other researchers' rigorous use of these data in the future.
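
    The two composite-index ideas mentioned in the Results (combining all symptom scores, and combining multiple high scores) might be sketched as below. This is a minimal illustration, not the authors' method: only the 9 symptoms and the 0 to 10 scale come from the abstract, while the cutoff of 7 for a "high" score, the function names and the example record are assumptions made here.

```python
from typing import Sequence

N_SYMPTOMS = 9            # ESAS scores 9 symptoms (stated in the abstract)
MIN_SCORE, MAX_SCORE = 0, 10
HIGH_SCORE_CUTOFF = 7     # assumption: illustrative threshold, not from the source


def _validate(scores: Sequence[int]) -> None:
    """Reject records that do not look like one complete ESAS assessment."""
    if len(scores) != N_SYMPTOMS:
        raise ValueError(f"expected {N_SYMPTOMS} scores, got {len(scores)}")
    if any(not MIN_SCORE <= s <= MAX_SCORE for s in scores):
        raise ValueError("each score must lie between 0 and 10")


def total_symptom_score(scores: Sequence[int]) -> int:
    """Composite 1: sum of all nine symptom scores (range 0 to 90)."""
    _validate(scores)
    return sum(scores)


def count_high_scores(scores: Sequence[int], cutoff: int = HIGH_SCORE_CUTOFF) -> int:
    """Composite 2: number of symptoms scored at or above the cutoff."""
    _validate(scores)
    return sum(1 for s in scores if s >= cutoff)


# Hypothetical record for one clinic visit.
record = [2, 0, 7, 9, 1, 0, 3, 8, 4]
print(total_symptom_score(record))  # 34
print(count_high_scores(record))    # 3
```

    Neither index handles missing items or irregular visit timing, which the abstract identifies as the harder part of working with these data.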

    Investigating Associations Between Preoperative Patient-Reported Symptom Burden and Postoperative Outcomes Following Major Cancer Surgery: A Retrospective Cohort Study

    Patient-reported outcomes (PROs) are prognostic of long-term survival in cancer patients. However, their association with postoperative outcomes following major oncologic surgery is not well characterized. A retrospective population-based cohort study of rectal cancer patients undergoing neoadjuvant radiotherapy and proctectomy was conducted. Receiver operating characteristic analysis was used to select a scoring approach for the Edmonton Symptom Assessment System to define elevated preoperative symptom burden. Multivariable regression analyses were conducted to investigate associations between preoperative symptom scores and postoperative outcomes. High preoperative symptom scores were not associated with postoperative major morbidity (OR 1.28, 95% CI 0.84-1.97). However, high preoperative symptom scores were associated with prolonged postoperative length of stay (IRR 1.23, 95% CI 1.14-1.32), 30-day hospital readmission (OR 1.74, 95% CI 1.30-2.34), and 30-day post-discharge ED visits (OR 1.34, 95% CI 1.05-1.71). PROs can contribute important information for identification of patients at risk for increased healthcare utilization in the postoperative period.
    M.Sc.
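
    One way to read "receiver operating characteristic analysis was used to select a scoring approach" is to compare the area under the ROC curve (AUC) of candidate scores against an outcome and keep the better one. The sketch below assumes that reading; the abstract does not say which criterion was used, and the candidate scores, outcome labels and function names are hypothetical toy values.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random non-case.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def best_scoring_approach(candidates: Dict[str, Sequence[float]],
                          outcomes: Sequence[int]) -> str:
    """Return the name of the candidate score with the highest AUC."""
    return max(candidates, key=lambda name: auc(candidates[name], outcomes))


# Hypothetical data: 1 = outcome occurred, 0 = it did not.
outcomes = [0, 0, 0, 1, 1, 1]
candidates = {
    "total_score": [10, 20, 30, 25, 40, 50],   # sum of all symptom scores
    "high_count": [0, 1, 1, 1, 2, 1],          # number of high symptom scores
}
print(best_scoring_approach(candidates, outcomes))  # total_score
```

    The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but would normally be replaced by a rank-based computation on a real cohort.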

    A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics.

    Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found to be the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTM) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark
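
    As a rough illustration of what a binning-based consensus spectrum can look like, the sketch below merges the peaks of one cluster into fixed-width m/z bins. It is not the code from the linked repository: the function name, the default bin width, the rule that a bin must appear in at least half of the spectra, and the way intensities are averaged are all assumptions made for this example.

```python
import math
from collections import defaultdict
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def bin_consensus(spectra: List[List[Peak]],
                  bin_width: float = 0.02,      # assumption: illustrative default
                  min_fraction: float = 0.5     # assumption: illustrative default
                  ) -> List[Peak]:
    """Build a consensus spectrum from the spectra of one cluster.

    Peaks are assigned to m/z bins of width `bin_width`. A bin is kept only
    if at least `min_fraction` of the spectra contribute a peak to it. The
    consensus m/z is the mean m/z of the contributing peaks, and the
    consensus intensity is the summed intensity divided by the number of
    spectra in the cluster.
    """
    if not spectra:
        raise ValueError("cluster must contain at least one spectrum")
    n_spectra = len(spectra)
    mz_sum = defaultdict(float)
    intensity_sum = defaultdict(float)
    peak_count = defaultdict(int)
    members = defaultdict(set)   # which spectra contributed to each bin

    for idx, spectrum in enumerate(spectra):
        for mz, intensity in spectrum:
            b = math.floor(mz / bin_width)
            mz_sum[b] += mz
            intensity_sum[b] += intensity
            peak_count[b] += 1
            members[b].add(idx)

    consensus = []
    for b in sorted(members):
        if len(members[b]) / n_spectra >= min_fraction:
            consensus.append((mz_sum[b] / peak_count[b],
                              intensity_sum[b] / n_spectra))
    return consensus


# Hypothetical cluster of three spectra; a wide bin keeps the example readable.
cluster = [
    [(100.2, 10.0), (200.5, 4.0)],
    [(100.4, 20.0)],
    [(100.3, 30.0), (300.5, 6.0)],
]
consensus = bin_consensus(cluster, bin_width=1.0)
print(len(consensus))  # 1 (only the peak near m/z 100 is shared by enough spectra)
```

    Fixed bin edges can split a peak that sits on a boundary between two bins, so a production implementation would likely need a tolerance-based merge instead.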

    MS/MS-Free Protein Identification in Complex Mixtures Using Multiple Enzymes with Complementary Specificity

    In this work, we present the results of an evaluation of a workflow that employs a multienzyme digestion strategy for MS1-based protein identification in “shotgun” proteomic applications. In the proposed strategy, several cleavage reagents of different specificity were used for parallel digestion of the protein sample, followed by a search based on MS1 and retention time (RT). Proof of principle for the proposed strategy was performed using experimental data obtained for the annotated 48-protein standard. By using the developed approach, up to 90% of proteins from the standard were unambiguously identified. The approach was further applied to HeLa proteome data. For a sample of this complexity, the proposed MS1-only strategy correctly identified up to 34% of all proteins identified using a standard MS/MS-based database search. It was also found that the results of the MS1-only search were independent of the chromatographic gradient time over a wide range of gradients, from 15 to 120 min. Potentially, rapid MS1-only proteome characterization can be an alternative or a complement to MS/MS-based “shotgun” analyses in studies in which experimental time is more important than the depth of proteome coverage.
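
    The idea of parallel digestion with enzymes of complementary specificity, followed by matching on MS1 mass alone, can be sketched as follows. This is a simplified illustration rather than the published workflow: the abstract does not name the cleavage reagents, so trypsin and Glu-C are assumptions, the cleavage rules are reduced to single regular expressions, the example sequence is made up, and retention time is ignored.

```python
import re
from typing import Dict, List, Sequence

# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Assumption: simplified cleavage rules, chosen here for illustration.
CLEAVAGE_RULES = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K or R, unless followed by P
    "glu-c": r"(?<=E)",            # after E
}


def digest(sequence: str, enzyme: str) -> List[str]:
    """Fully cleave a protein sequence (no missed cleavages)."""
    return [p for p in re.split(CLEAVAGE_RULES[enzyme], sequence) if p]


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def match_masses(observed: Sequence[float], peptides: Sequence[str],
                 tol_ppm: float = 10.0) -> Dict[float, List[str]]:
    """Map each observed mass to the peptides within `tol_ppm` of it."""
    hits: Dict[float, List[str]] = {}
    for mass in observed:
        tol = mass * tol_ppm / 1e6
        matched = [p for p in peptides if abs(peptide_mass(p) - mass) <= tol]
        if matched:
            hits[mass] = matched
    return hits


protein = "MKAEPRLEGKDER"  # hypothetical sequence
print(digest(protein, "trypsin"))  # ['MK', 'AEPR', 'LEGK', 'DER']
print(digest(protein, "glu-c"))    # ['MKAE', 'PRLE', 'GKDE', 'R']
```

    Because the two enzymes cut at different residues, the same protein yields two different sets of peptide masses, which is what gives the MS1-only search additional evidence per protein.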

    IdentiPy: An Extensible Search Engine for Protein Identification in Shotgun Proteomics

    We present an open-source, extensible search engine for shotgun proteomics. Implemented in the Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, enabling the use of a single server installation accessed from multiple workstations. Using a simplified version of the X!Tandem scoring algorithm and its novel “autotune” feature, IdentiPy outperforms the popular alternatives on high-resolution data sets. Autotune adjusts the search parameters for the particular data set, resulting in improved search efficiency and simplifying the user experience. IdentiPy with the autotune feature shows higher sensitivity than the evaluated search engines. IdentiPy Server has built-in postprocessing and protein inference procedures and provides graphic visualization of the statistical properties of the data set and the search results. It is open-source, can be freely extended to use third-party scoring functions or processing algorithms, and allows customization of the search workflow for specialized applications.