68 research outputs found

    Correction of technical bias in clinical microarray data improves concordance with known biological information

    The performance of gene expression microarrays has been well characterized using controlled reference samples, but their performance on clinical samples remains less clear. We identified sources of technical bias that affect many genes in concert, causing spurious correlations in clinical data sets and false associations between genes and clinical variables. We developed a method to correct for technical bias in clinical microarray data, which increased concordance with known biological relationships in multiple data sets.

    Patterns in the sequence context of protein disulfide bonds

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Biology, February 2002. Includes bibliographical references (leaves 60-62). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Disulfide bonds play an important role in the structural stability of the proteins that contain them. Yet little is known about the specificity with which they are formed. To address this, a representative set of disulfide bonds from nonhomologous eukaryotic polypeptides was created. The amino acid sequences flanking these disulfide bonds were searched for conserved patterns that may reflect recognition sites for the disulfide bond-forming enzyme protein disulfide isomerase (PDI). Several methods of classifying disulfide bonds were explored, and each class was analyzed for conserved sequence patterns. To maximize the chances of finding a conserved recognition site, a simulated annealing algorithm was implemented to divide a set of disulfide-bonded cysteines into two sets of cysteines with an average sequence environment that is as far from randomly distributed as possible. No significant conserved patterns were found in the set of disulfide bonds or within any of the classification schemes introduced. Additionally, several methods for predicting disulfide bond connectivity were explored. The most successful methods predicted connectivity based on the sequential distance between cysteines. By Aron Charles Eklund, S.M.

    Method for identification of tissue or organ localization of a tumour

    The invention relates to a method for predicting the localization of a primary tumour, wherein said method comprises the use of genomic profile data, and wherein the method is capable of predicting the type of cancer by a classification score ranking among a variety of the possible tumour types.

    An analysis of natural T cell responses to predicted tumor neoepitopes

    Personalization of cancer immunotherapies such as therapeutic vaccines and adoptive T-cell therapy may benefit from efficient identification and targeting of patient-specific neoepitopes. However, current neoepitope prediction methods based on sequencing and predictions of epitope processing and presentation result in a low rate of validation, suggesting that the determinants of peptide immunogenicity are not well understood. We gathered published data on human neopeptides originating from single amino acid substitutions for which T cell reactivity had been experimentally tested, including both immunogenic and non-immunogenic neopeptides. Out of 1,948 neopeptide-HLA (human leukocyte antigen) combinations from 13 publications, 53 were reported to elicit a T cell response. From these data, we found an enrichment for responses among peptides of length 9. Even though the peptides had been pre-selected based on presumed likelihood of being immunogenic, we found using NetMHCpan-4.0 that immunogenic neopeptides were predicted to bind significantly more strongly to HLA than non-immunogenic peptides. Investigation of the HLA binding strength of the immunogenic peptides revealed that the vast majority (96%) shared very strong predicted binding to HLA and that the binding strength was comparable to that observed for pathogen-derived epitopes. Finally, we found that neopeptide dissimilarity to self is a predictor of immunogenicity in situations where neo- and normal peptides share comparable predicted binding strength. In conclusion, these results suggest new strategies for prioritization of mutated peptides, but new data will be needed to confirm their value. Affiliation: Bjerregaard, Anne-Mette. Technical University of Denmark; Denmark. Affiliation: Nielsen, Morten. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas; Argentina. Technical University of Denmark; Denmark. Affiliation: Jurtz, Vanessa. Technical University of Denmark; Denmark. Affiliation: Barra, Carolina M. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas; Argentina. Affiliation: Hadrup, Sine Reker. Technical University of Denmark; Denmark. Affiliation: Szallasi, Zoltan. Technical University of Denmark; Denmark. Harvard Medical School; United States. Affiliation: Eklund, Aron Charles. Technical University of Denmark; Denmark.

    TumorTracer: a method to identify the tissue of origin from the somatic mutations of a tumor specimen

    BACKGROUND: A substantial proportion of cancer cases present with a metastatic tumor and require further testing to determine the primary site; many of these are never fully diagnosed and remain cancer of unknown primary origin (CUP). It has been previously demonstrated that the somatic point mutations detected in a tumor can be used to identify its site of origin with limited accuracy. We hypothesized that higher accuracy could be achieved by a classification algorithm based on the following feature sets: 1) the number of nonsynonymous point mutations in a set of 232 specific cancer-associated genes, 2) frequencies of the 96 classes of single-nucleotide substitution determined by the flanking bases, and 3) copy number profiles, if available. METHODS: We used publicly available somatic mutation data from the COSMIC database to train random forest classifiers to distinguish among those tissues of origin for which sufficient data was available. We selected feature sets using cross-validation and then derived two final classifiers (with or without copy number profiles) using 80% of the available tumors. We evaluated the accuracy using the remaining 20%. For further validation, we assessed accuracy of the without-copy-number classifier on three independent data sets: 1669 newly available public tumors of various types, a cohort of 91 breast metastases, and a set of 24 specimens from 9 lung cancer patients subjected to multiregion sequencing. RESULTS: The cross-validation accuracy was highest when all three types of information were used. On the left-out COSMIC data not used for training, we achieved a classification accuracy of 85% across 6 primary sites (with copy numbers), and 69% across 10 primary sites (without copy numbers). Importantly, a derived confidence score could distinguish tumors that could be identified with 95% accuracy (32%/75% of tumors with/without copy numbers) from those that were less certain. Accuracy in the independent data sets was 46%, 53% and 89% respectively, similar to the accuracy expected from the training data. CONCLUSIONS: Identification of primary site from point mutation and/or copy number data may be accurate enough to aid clinical diagnosis of cancers of unknown primary origin.
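
The "96 classes of single-nucleotide substitution determined by the flanking bases" can be illustrated with a short sketch. It assumes the widely used convention of collapsing each substitution to its pyrimidine (C or T) reference base, which gives 6 substitution types times 4 possible 5' bases times 4 possible 3' bases. The abstract does not state which convention the authors used, and the function names and the `A[C>T]G` label format here are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# Six substitution types after collapsing to a pyrimidine reference base
# (an assumed convention; the abstract does not specify one).
SUBSTITUTIONS = [("C", "A"), ("C", "G"), ("C", "T"),
                 ("T", "A"), ("T", "C"), ("T", "G")]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def substitution_classes():
    """Return all 96 class labels, e.g. 'A[C>T]G'."""
    return [f"{five}[{ref}>{alt}]{three}"
            for (ref, alt), five, three in product(SUBSTITUTIONS, BASES, BASES)]


def classify(five, ref, alt, three):
    """Map one substitution and its flanking bases to a class label."""
    if ref in "AG":
        # Purine reference: describe the event on the opposite strand.
        # The flanks swap places and every base is complemented.
        five, ref, alt, three = (three.translate(COMPLEMENT),
                                 ref.translate(COMPLEMENT),
                                 alt.translate(COMPLEMENT),
                                 five.translate(COMPLEMENT))
    return f"{five}[{ref}>{alt}]{three}"


def class_frequencies(mutations):
    """Turn a list of (five, ref, alt, three) tuples into 96 frequencies."""
    counts = Counter(classify(*m) for m in mutations)
    total = sum(counts.values())
    return {label: (counts[label] / total if total else 0.0)
            for label in substitution_classes()}
```

A vector of this kind, one value per class, is the sort of feature that could be passed to a classifier alongside per-gene mutation counts; how the published classifiers actually encoded their features is not described in the abstract.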

    Predictive biomarker discovery through the parallel integration of clinical trial and functional genomics datasets

    The European Union multi-disciplinary Personalised RNA interference to Enhance the Delivery of Individualised Cytotoxic and Targeted therapeutics (PREDICT) consortium has recently initiated a framework to accelerate the development of predictive biomarkers of individual patient response to anti-cancer agents. The consortium focuses on the identification of reliable predictive biomarkers for approved agents with anti-angiogenic activity for which no reliable predictive biomarkers exist: sunitinib, a multi-targeted tyrosine kinase inhibitor, and everolimus, a mammalian target of rapamycin (mTOR) pathway inhibitor. Through the analysis of tumor tissue derived from pre-operative renal cell carcinoma (RCC) clinical trials, the PREDICT consortium will use established and novel methods to integrate comprehensive tumor-derived genomic data with personalized tumor-derived small hairpin RNA and high-throughput small interfering RNA screens to identify and validate functionally important genomic or transcriptomic predictive biomarkers of individual drug response in patients. PREDICT's approach to predictive biomarker discovery differs from conventional associative learning approaches, which can be susceptible to the detection of chance associations that lead to overestimation of true clinical accuracy. These methods will identify molecular pathways important for survival and growth of RCC cells and particular targets suitable for therapeutic development. Importantly, our results may enable individualized treatment of RCC, reducing ineffective therapy in drug-resistant disease, leading to improved quality of life and higher cost efficiency, which in turn should broaden patient access to beneficial therapeutics, thereby enhancing clinical outcome and cancer survival. The consortium will also establish and consolidate a European network providing the technological and clinical platform for large-scale functional genomic biomarker discovery. Here we review our current understanding of molecular mechanisms driving resistance to anti-angiogenesis agents, the current limitations of laboratory and clinical trial strategies, and how the PREDICT consortium will endeavor to identify a new generation of predictive biomarkers.

    MECP2 Is a Frequently Amplified Oncogene with a Novel Epigenetic Mechanism That Mimics the Role of Activated RAS in Malignancy

    An unbiased genome-scale screen for unmutated genes that drive cancer growth when overexpressed identified MECP2 as a novel oncogene. MECP2 resides in a region of the X-chromosome that is significantly amplified across 18% of cancers, and many cancer cell lines have amplified, overexpressed MECP2 and are dependent on MECP2 expression for growth. MECP2 copy number gain and RAS family member alterations are mutually exclusive in several cancer types. The MECP2 splicing isoforms activate the major growth factor pathways targeted by activated RAS, the MAPK and PI3K pathways. MECP2 rescued the growth of a KRAS(G12C)-addicted cell line after KRAS downregulation, and activated KRAS rescued the growth of an MECP2-addicted cell line after MECP2 downregulation. MECP2 binding to the epigenetic modification 5-hydroxymethylcytosine is required for efficient transformation. These observations suggest that MECP2 is a commonly amplified oncogene with an unusual epigenetic mode of action.

    Minimising Immunohistochemical False Negative ER Classification Using a Complementary 23 Gene Expression Signature of ER Status

    BACKGROUND: Expression of the oestrogen receptor (ER) in breast cancer predicts benefit from endocrine therapy. Minimising the frequency of false negative ER status classification is essential to identify all patients with ER positive breast cancers who should be offered endocrine therapies in order to improve clinical outcome. In routine oncological practice ER status is determined by semi-quantitative methods such as immunohistochemistry (IHC) or other immunoassays in which the ER expression level is compared to an empirical threshold. The clinical relevance of gene expression-based ER subtypes as compared to IHC-based determination has not been systematically evaluated. Here we attempt to reduce the frequency of false negative ER status classification using two gene expression approaches and compare these methods to IHC-based ER status in terms of predictive and prognostic concordance with clinical outcome. METHODOLOGY/PRINCIPAL FINDINGS: Firstly, ER status was discriminated by fitting the bimodal expression of ESR1 to a mixed Gaussian model. The discriminative power of ESR1 suggested bimodal expression as an efficient way to stratify breast cancer; therefore we identified a set of genes whose expression was both strongly bimodal, mimicking ESR expression status, and highly expressed in breast epithelial cell lines, to derive a 23-gene ER expression signature-based classifier. We assessed our classifiers in seven published breast cancer cohorts by comparing the gene expression-based ER status to IHC-based ER status as a predictor of clinical outcome in both untreated and tamoxifen-treated cohorts. In untreated breast cancer cohorts, the 23-gene signature-based ER status provided significantly improved prognostic power compared to IHC-based ER status (P = 0.006). In tamoxifen-treated cohorts, the 23-gene ER expression signature predicted clinical outcome (HR = 2.20, P = 0.00035). These complementary ER signature-based strategies estimated that between 15.1% and 21.8% of patients with IHC-based negative ER status would be classified as having ER positive breast cancer. CONCLUSION/SIGNIFICANCE: Expression-based ER status classification may complement IHC to minimise false negative ER status classification and optimise patient stratification for endocrine therapies.
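
The step of "fitting the bimodal expression of ESR1 to a mixed Gaussian model" can be sketched as a two-component Gaussian mixture fitted by expectation-maximisation. This is a minimal illustration and not the authors' implementation: the initialisation, the iteration count, the variance floor, and the names `fit_two_gaussians` and `is_high_expression` are all assumptions made for the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, iterations=200):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weights, means, sds). Initialisation from the lower and
    upper halves of the sorted data is an assumption of this sketch.
    """
    xs = sorted(values)
    if len(xs) < 2:
        raise ValueError("need at least two values")
    half = len(xs) // 2
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    spread = (xs[-1] - xs[0]) / 4.0 or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and sd of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component is empty; keep its old parameters
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)  # floor avoids a zero sd
            weights[k] = nk / len(xs)
    return weights, means, sds


def is_high_expression(x, weights, means, sds):
    """True if x is more likely under the component with the higher mean."""
    hi = 0 if means[0] > means[1] else 1
    p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
    return p[hi] > p[1 - hi]
```

Under this sketch, a sample whose expression value falls in the higher-mean component would be called expression-based ER positive; the threshold therefore comes from the fitted distributions instead of a fixed empirical cutoff.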