    Mining Oncology Data: Knowledge Discovery in Clinical Performance of Cancer Patients

    Our goal in this research is twofold: to develop clinical performance databases of cancer patients, and to conduct data mining and machine learning studies on the collected patient records. We use these studies to develop models for predicting cancer patient medical outcomes. The clinical database is developed in conjunction with surgeons and oncologists at UMass Memorial Hospital. Aspects of the database design and the representation of patient narrative are discussed here. Current predictive model design in the medical literature is dominated by linear and logistic regression techniques. We seek to show that novel machine learning methods can perform as well as or better than these traditional techniques. Our machine learning focus for this thesis is on pancreatic cancer patients. Classification and regression prediction targets include patient survival, wellbeing scores, and disease characteristics. Information research in oncology is often constrained by type variation, missing attributes, high dimensionality, skewed class distribution, and small data sets. We compensate for these difficulties using preprocessing, meta-learning, and other algorithmic methods during data analysis. The predictive accuracy and regression error of various machine learning models are presented as results, as are t-tests comparing these to the accuracy of traditional regression methods. In most cases, the novel machine learning prediction methods are shown to offer comparable or superior performance. We conclude with an analysis of the results and a discussion of future research possibilities.

    Network-based stratification of tumor mutations.

    Many forms of cancer have multiple subtypes with different causes and clinical outcomes. Somatic tumor genome sequences provide a rich new source of data for uncovering these subtypes but have proven difficult to compare, as two tumors rarely share the same mutations. Here we introduce network-based stratification (NBS), a method to integrate somatic tumor genomes with gene networks. This approach allows for stratification of cancer into informative subtypes by clustering together patients with mutations in similar network regions. We demonstrate NBS in ovarian, uterine and lung cancer cohorts from The Cancer Genome Atlas. For each tissue, NBS identifies subtypes that are predictive of clinical outcomes such as patient survival, response to therapy or tumor histology. We identify network regions characteristic of each subtype and show how mutation-derived subtypes can be used to train an mRNA expression signature, which provides similar information in the absence of DNA sequence.
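
    The idea of grouping patients by mutations in similar network regions can be illustrated with a small sketch. The abstract does not specify how mutations are mapped onto the network, so the smoothing step below (a random walk with restart), the `alpha` parameter, the toy six-gene network, and all function names are assumptions made for illustration, not the authors' implementation.

```python
def propagate(mutations, adjacency, alpha=0.7, iterations=50):
    """Smooth one patient's binary mutation vector over a gene network.

    Iterates F <- alpha * F * W + (1 - alpha) * F0, where W is the adjacency
    matrix with each row scaled to sum to 1 (an assumed smoothing scheme).
    """
    n = len(mutations)
    # Row-normalise so each gene spreads its score evenly over its neighbours.
    weights = []
    for row in adjacency:
        total = sum(row)
        weights.append([v / total if total else 0.0 for v in row])
    start = [float(m) for m in mutations]
    scores = start[:]
    for _ in range(iterations):
        spread = [sum(scores[i] * weights[i][j] for i in range(n)) for j in range(n)]
        scores = [alpha * spread[j] + (1 - alpha) * start[j] for j in range(n)]
    return scores


def cosine(u, v):
    """Cosine similarity between two profiles (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy network: genes 0-2 form one module, genes 3-5 another.
ADJACENCY = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

# Three patients, each with a single mutated gene; no two share a mutation.
patient_a = [1, 0, 0, 0, 0, 0]
patient_b = [0, 1, 0, 0, 0, 0]
patient_c = [0, 0, 0, 1, 0, 0]

smoothed_a = propagate(patient_a, ADJACENCY)
smoothed_b = propagate(patient_b, ADJACENCY)
smoothed_c = propagate(patient_c, ADJACENCY)

# Raw profiles of A and B do not overlap, but their smoothed profiles do,
# because both mutations fall in the same network module.
print(cosine(patient_a, patient_b), cosine(smoothed_a, smoothed_b))
print(cosine(smoothed_a, smoothed_c))
```

    In this toy setting, patients A and B become similar after smoothing while patient C stays separate, so any standard clustering step applied to the smoothed profiles would group A with B.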

    Salivary biomarker development using genomic, proteomic and metabolomic approaches.

    The use of saliva as a diagnostic sample provides a non-invasive, cost-efficient method of sample collection for disease screening without the need for highly trained professionals. Saliva collection is far more practical and safer than invasive methods of sample collection, which carry an infection risk from contaminated needles during, for example, blood sampling. Furthermore, the use of saliva could increase the availability of accurate diagnostics for remote and impoverished regions. However, the development of salivary diagnostics has required technical innovation to allow stabilization and detection of analytes in the complex molecular mixture that is saliva. The recent development of cost-effective room temperature analyte stabilization methods, nucleic acid pre-amplification techniques and direct saliva transcriptomic analysis has allowed accurate detection and quantification of transcripts found in saliva. Novel protein stabilization methods have also facilitated improved proteomic analyses. Although candidate biomarkers have been discovered using epigenetic, transcriptomic, proteomic and metabolomic approaches, transcriptomic analyses have so far achieved the most progress in terms of sensitivity and specificity, and progress towards clinical implementation. Here, we review recent developments in salivary diagnostics that have been accomplished using genomic, transcriptomic, proteomic and metabolomic approaches.

    Urinary CE-MS peptide marker pattern for detection of solid tumors

    Urinary profiling datasets, previously acquired by capillary electrophoresis coupled to mass spectrometry, were investigated to identify a general urinary marker pattern for detection of solid tumors by targeting common systemic events associated with tumor-related inflammation. A total of 2,055 urinary profiles were analyzed, derived from a) a cancer group of patients (n = 969) with bladder, prostate, and pancreatic cancers, renal cell carcinoma, and cholangiocarcinoma and b) a control group of patients with benign diseases (n = 556), inflammatory diseases (n = 199) and healthy individuals (n = 331). Statistical analysis was conducted in a discovery set of 676 cancer cases and 744 controls. A total of 193 peptides differing at statistically significant levels between cases and controls were selected and combined into a multi-dimensional marker pattern using support vector machine algorithms. Independent validation in a set of 635 patients (293 cancer cases and 342 controls) showed an AUC of 0.82. Inclusion of age as an independent variable significantly increased the AUC value to 0.85. Among the identified peptides were mucins, fibrinogen and collagen fragments. Further studies are planned to assess the value of the pattern for monitoring patients for tumor recurrence. In this proof-of-concept study, a general tumor marker pattern was developed to detect cancer based on shared biomarkers, likely indicative of cancer-related features.
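
    For readers unfamiliar with the AUC figures quoted above, the following minimal sketch shows how an AUC can be computed from classifier scores as the fraction of case/control pairs in which the case receives the higher score. The function name and the six scores are hypothetical and are not data from the study; the sketch also does not reproduce the support vector machine itself.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the probability that a randomly chosen case scores higher than a
    randomly chosen control, counting ties as half a win.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier outputs (higher means "more likely cancer").
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]

# Eight of the nine case/control pairs are ranked correctly.
print(auc(cases, controls))
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported values of 0.82 and 0.85.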