
    Revisiting inconsistency in large pharmacogenomic studies

    In 2013, we published a comparative analysis of mutation and gene expression profiles and drug sensitivity measurements for 15 drugs characterized in the 471 cancer cell lines screened in the Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE). While we found good concordance in gene expression profiles, there was substantial inconsistency in the drug responses reported by the GDSC and CCLE projects. We received extensive feedback on the comparisons that we performed. This feedback, along with the release of new data, prompted us to revisit our initial analysis. We present a new analysis using these expanded data, where we address the most significant suggestions for improvements on our published analysis - that targeted therapies and broad cytotoxic drugs should have been treated differently in assessing consistency, that consistency of both molecular profiles and drug sensitivity measurements should be compared across cell lines, and that the software analysis tools provided should have been easier to run, particularly as the GDSC and CCLE released additional data. Our re-analysis supports our previous finding that gene expression data are significantly more consistent than drug sensitivity measurements. Using new statistics to assess data consistency allowed identification of two broad effect drugs and three targeted drugs with moderate to good consistency in drug sensitivity data between GDSC and CCLE. For three other targeted drugs, there were not enough sensitive cell lines to assess the consistency of the pharmacological profiles. We found evidence of inconsistencies in pharmacological phenotypes for the remaining eight drugs. Overall, our findings suggest that the drug sensitivity data in GDSC and CCLE continue to present challenges for robust biomarker discovery. 
This re-analysis provides additional support for the argument that experimental standardization and validation of pharmacogenomic response will be necessary to advance the broad use of large pharmacogenomic screens.

    Multiple-input multiple-output causal strategies for gene selection

    Traditional strategies for selecting variables in high-dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. Although these techniques may be effective in terms of generalization accuracy, they often do not reveal direct causes. This limitation stems from the fact that high correlation (or relevance) does not imply causation. In this study, we show how to efficiently incorporate causal information into gene selection by moving from a single-input single-output to a multiple-input multiple-output setting.

    Public data and open source tools for multi-assay genomic investigation of disease

    Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays has the potential to provide a systems-level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources; however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay genomic data resources in the areas of clinical oncology, pharmacogenomics and other perturbation experiments, population genomics, regulatory genomics and other areas, as well as tools for data acquisition. Finally, we review bioinformatic tools that are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for accessing publicly available data and tools to support development of needed integrative methods.

    Proliferation and estrogen signaling can distinguish patients at risk for early versus late relapse among estrogen receptor positive breast cancers

    Introduction: We examined whether a combination of proliferation markers and estrogen receptor (ER) activity could predict early versus late relapses in ER-positive breast cancer and inform the choice and length of adjuvant endocrine therapy. Methods: Baseline Affymetrix gene-expression profiles from ER-positive patients who received no systemic therapy (n = 559), adjuvant tamoxifen for 5 years (cohort-1: n = 683, cohort-2: n = 282) and from 58 patients treated with neoadjuvant letrozole for 3 months (gene-expression available at baseline, 14 and 90 days) were analyzed. A proliferation score based on the expression of mitotic kinases (MKS) and an ER-related score (ERS) adopted from Oncotype DX® were calculated. The same analysis was performed using the Genomic Grade Index as proliferation marker and the luminal gene score from the PAM50 classifier as measure of estrogen-related genes. Median values were used to define low and high marker groups and four combinations were created. Relapses were grouped into time cohorts of 0-2.5, 0-5, 5-10 years. Results: Over the overall 10-year period, the proportional hazards assumption was violated for several biomarker groups, indicating time-dependent effects. In tamoxifen-treated patients, Low-MKS/Low-ERS cancers had continuously increasing risk of relapse that was higher after 5 years than Low-MKS/High-ERS cancers [0 to 10 year, HR 3.36; p = 0.013]. High-MKS/High-ERS cancers had low risk of early relapse [0-2.5 years HR 0.13; p = 0.0006], but high risk of late relapse which was higher than in the High-MKS/Low-ERS group [after 5 years HR 3.86; p = 0.007]. The High-MKS/Low-ERS subset had most of the early relapses [0 to 2.5 years, HR 6.53; p < 0.0001], especially in node negative tumors, and showed minimal response to neoadjuvant letrozole. These findings were qualitatively confirmed in a smaller independent cohort of tamoxifen-treated patients. Using different biomarkers provided similar results.
Conclusions: Early relapses are highest in highly proliferative/low-ERS cancers, in particular in node negative tumors. Relapses occurring after 5 years of adjuvant tamoxifen are highest among the highly-proliferative/high-ERS tumors although their risk of recurrence is modest in the first 5 years on tamoxifen. These tumors could be the best candidates for extended endocrine therapy.

    Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach

    Human cancers exhibit strong phenotypic differences that can be visualized noninvasively by medical imaging. Radiomics refers to the comprehensive quantification of tumour phenotypes by applying a large number of quantitative image features. Here we present a radiomic analysis of 440 features quantifying tumour image intensity, shape and texture, which are extracted from computed tomography data of 1,019 patients with lung or head-and-neck cancer. We find that a large number of radiomic features have prognostic power in independent data sets of lung and head-and-neck cancer patients, many of which were not identified as significant before. Radiogenomics analysis reveals that a prognostic radiomic signature, capturing intratumour heterogeneity, is associated with underlying gene-expression patterns. These data suggest that radiomics identifies a general prognostic phenotype existing in both lung and head-and-neck cancer. This may have a clinical impact as imaging is routinely used in clinical practice, providing an unprecedented opportunity to improve decision-support in cancer treatment at low cost.

    E2F1 and KIAA0191 expression predicts breast cancer patient survival

    Background: Gene expression profiling of human breast tumors has uncovered several molecular signatures that can divide breast cancer patients into good and poor outcome groups. However, these signatures typically comprise many genes (~50-100), and the prognostic tests associated with identifying these signatures in patient tumor specimens require complicated methods, which are not routinely available in most hospital pathology laboratories, thus limiting their use. Hence, there is a need for more practical methods to predict patient survival. Methods: We modified a feature selection algorithm and used survival analysis to derive a 2-gene signature that accurately predicts breast cancer patient survival. Results: We developed a tree based decision method that segregated patients into various risk groups using KIAA0191 expression in the context of E2F1 expression levels. This approach led to highly accurate survival predictions in a large cohort of breast cancer patients using only a 2-gene signature. Conclusions: Our observations suggest a possible relationship between E2F1 and KIAA0191 expression that is relevant to the pathogenesis of breast cancer. Furthermore, our findings raise the prospect that the practicality of patient prognosis methods may be improved by reducing the number of genes required for analysis. Indeed, our E2F1/KIAA0191 2-gene signature would be highly amenable for an immunohistochemistry based test, which is commonly used in hospital laboratories.

    Stromal Genes Add Prognostic Information to Proliferation and Histoclinical Markers: A Basis for the Next Generation of Breast Cancer Gene Signatures

    BACKGROUND: First-generation gene signatures that identify breast cancer patients at risk of recurrence are confined to estrogen-positive cases and are driven by genes involved in the cell cycle and proliferation. Previously we induced sets of stromal genes that are prognostic for both estrogen-positive and estrogen-negative samples. Creating risk-management tools that incorporate these stromal signatures, along with existing proliferation-based signatures and established clinicopathological measures such as lymph node status and tumor size, should better identify women at greatest risk for metastasis and death. METHODOLOGY/PRINCIPAL FINDINGS: To investigate the strength and independence of the stromal and proliferation factors in estrogen-positive and estrogen-negative patients, we constructed multivariate Cox proportional hazards models along with tree-based partitions of cancer cases for four breast cancer cohorts. Two sets of stromal genes, one consisting of DCN and FBLN1, and the other containing LAMA2, add substantial prognostic value to the proliferation signal and to clinical measures. For estrogen receptor-positive patients, the stromal-decorin set adds prognostic value independent of proliferation for three of the four datasets. For estrogen receptor-negative patients, the stromal-laminin set significantly adds prognostic value in two datasets, and marginally in a third. The stromal sets are most prognostic for the unselected population studies and may depend on the age distribution of the cohorts. CONCLUSION: The addition of stromal genes would measurably improve the performance of proliferation-based first-generation gene signatures, especially for older women. Incorporating indicators of the state of stromal cell types would mark a conceptual shift from epithelial-centric risk assessment to assessment based on the multiple cell types in the cancer-altered tissue.

    GeneSigDB: a manually curated database and resource for analysis of gene expression signatures

    GeneSigDB (http://www.genesigdb.org or http://compbio.dfci.harvard.edu/genesigdb/) is a database of gene signatures that have been extracted and manually curated from the published literature. It provides the community with a standardized resource of published prognostic, diagnostic and other gene signatures of cancer and related disease, so that users can compare the predictive power of gene signatures or use them in gene set enrichment analysis. Since GeneSigDB release 1.0, we have expanded from 575 to 3515 gene signatures, which were collected and transcribed from 1604 published articles largely focused on gene expression in cancer, stem cells, immune cells, development and lung disease. We have made substantial upgrades to the GeneSigDB website to improve accessibility and usability, including adding a tag cloud browse function, facetted navigation and a ‘basket’ feature to store genes or gene signatures of interest. Users can analyze GeneSigDB gene signatures, or upload their own gene list, to identify gene signatures with significant gene overlap, and results can be viewed on a dynamic editable heatmap that can be downloaded as a publication quality image. All data in GeneSigDB can be downloaded in numerous formats, including the .gmt file format for gene set enrichment analysis or as an R/Bioconductor data file. GeneSigDB is available from http://www.genesigdb.org.