Integration of Gene Dosage and Gene Expression in Non-Small Cell Lung Cancer, Identification of HSP90 as Potential Target
BACKGROUND: Lung cancer causes approximately 1.2 million deaths per year worldwide, and non-small cell lung cancer (NSCLC) represents 85% of all lung cancers. Understanding the molecular events in NSCLC is essential to improve early diagnosis and treatment of this disease. METHODOLOGY AND PRINCIPAL FINDINGS: In an attempt to identify novel NSCLC-related genes, we performed a genome-wide screening of chromosomal copy number changes affecting gene expression, using microarray-based comparative genomic hybridization and gene expression arrays on 32 radically resected tumor samples from stage I and II NSCLC patients. An integrative analysis tool was applied to determine whether chromosomal copy number affects gene expression. We identified a deletion on 14q32.2-33 as a common alteration in NSCLC (44%), which significantly influenced gene expression for HSP90, residing on 14q32. This deletion was correlated with better overall survival (P = 0.008); survival was also longer in patients whose tumors had low expression levels of HSP90. We extended the analysis to three independent validation sets of NSCLC patients and confirmed that low HSP90 expression was associated with longer overall survival (P = 0.003, P = 0.07 and P = 0.04). Furthermore, in vitro treatment with an HSP90 inhibitor had potent antiproliferative activity in NSCLC cell lines. CONCLUSIONS: We suggest that targeting HSP90 will have clinical impact for NSCLC patients.
Detection of recurrent copy number alterations in the genome: taking among-subject heterogeneity seriously
A PDF file with the research data is attached, titled "Supplementary Material for 'Detection of Recurrent Copy Number Alterations in the Genome: taking among-subject heterogeneity seriously'".
Background: Alterations in the number of copies of genomic DNA that are common or recurrent
among diseased individuals are likely to contain disease-critical genes. Unfortunately, defining
common or recurrent copy number alteration (CNA) regions remains a challenge. Moreover, the
heterogeneous nature of many diseases requires that we search for common or recurrent CNA
regions that affect only some subsets of the samples (without knowledge of the regions and subsets
affected), but this is neglected by most methods.
Results: We have developed two methods to define recurrent CNA regions from aCGH data.
Our methods are unique and qualitatively different from existing approaches: they detect regions
over both the complete set of arrays and alterations that are common only to some subsets of the
samples (i.e., alterations that might characterize previously unknown groups); they use probabilities
of alteration as input and return probabilities of being a common region, thus allowing researchers
to modify thresholds as needed; the two parameters of the methods have an immediate,
straightforward, biological interpretation. Using data from previous studies, we show that we can
detect patterns that other methods miss and that researchers can modify, as needed, thresholds of
immediate interpretability and develop custom statistics to answer specific research questions.
Conclusion: These methods represent a qualitative advance in the location of recurrent CNA
regions, highlight the relevance of population heterogeneity for definitions of recurrence, and can
facilitate the clustering of samples with respect to patterns of CNA. Ultimately, the methods
developed can become important tools in the search for genomic regions harboring disease-critical
genes.
Funding provided by Fundación de Investigación Médica Mutua Madrileña. Publication charges covered by projects CONSOLIDER: CSD2007-00050 of the Spanish Ministry of Science and Innovation and by RTIC COMBIOMED RD07/0067/0014 of the Spanish Health Ministry
Breast tumors from CHEK2 1100delC-mutation carriers: genomic landscape and clinical implications
Introduction: Checkpoint kinase 2 (CHEK2) is a moderate-penetrance breast cancer risk gene, whose truncating mutation 1100delC increases the risk about twofold. We investigated gene copy-number aberrations and gene-expression profiles that are typical for breast tumors of CHEK2 1100delC-mutation carriers. Methods: In total, 126 breast tumor tissue specimens, including 32 samples from patients carrying CHEK2 1100delC, were studied in array-comparative genomic hybridization (aCGH) and gene-expression (GEX) experiments. After dimensionality reduction with the CGHregions R package, CHEK2 1100delC-associated regions in the aCGH data were detected by the Wilcoxon rank-sum test. A linear model was fitted to the GEX data with the R package limma. Genes whose expression levels were associated with the CHEK2 1100delC mutation were detected by a Bayesian method. Results: We discovered four lost and three gained CHEK2 1100delC-related loci. These include losses of 1p13.3-31.3, 8p21.1-2, 8p23.1-2, and 17p12-13.1 as well as gains of 12q13.11-3, 16p13.3, and 19p13.3. Twenty-eight genes located in these regions showed differential expression between CHEK2 1100delC and other tumors, nominating them as candidates for CHEK2 1100delC-associated tumor-progression drivers. These included CLCA1 on 1p22 as well as CALCOCO1, SBEM, and LRP1 on 12q13. Altogether, 188 genes were differentially expressed between CHEK2 1100delC and other tumors. Of these, 144 had elevated and 44 had reduced expression levels. Our results suggest the WNT pathway as a driver of tumorigenesis in breast tumors of CHEK2 1100delC-mutation carriers and a role for the olfactory receptor protein family in cancer progression. Differences in the expression of the 188 CHEK2 1100delC-associated genes divided breast tumor samples from three independent datasets into two groups that differed in their relapse-free survival time.
Conclusions: We have shown that copy-number aberrations of certain genomic regions are associated with CHEK2 mutation 1100delC. In these regions, we identified potential drivers of CHEK2 1100delC-associated tumorigenesis, whose role in cancer progression is worth investigating. Furthermore, poorer survival related to the CHEK2 1100delC gene-expression signature highlights pathways that are likely to have a role in the development of metastatic disease in carriers of the CHEK2 1100delC mutation.
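The region-wise comparison named in the Methods above (a Wilcoxon rank-sum test between CHEK2 1100delC carriers and other tumors) can be illustrated with a minimal sketch. This is not the authors' pipeline, which used R. The function name `rank_sum_test`, the example values, and the choice of a normal approximation without tie or continuity correction are assumptions made purely for illustration.

```python
import math
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (W, z, p), where W is the sum of ranks of the first sample.
    Assumption: no tie or continuity correction is applied to the variance,
    so p-values are approximate, especially for small samples.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with pooled[i].
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    var = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - mean) / math.sqrt(var)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical log2 copy-number ratios for one region (not data from the study).
carriers = [-0.4, -0.3, -0.5, -0.2]
others = [0.1, 0.0, 0.2, -0.1, 0.05]
w, z, p = rank_sum_test(carriers, others)
print(f"W = {w}, z = {z:.2f}, p = {p:.3f}")
```

In a real analysis this test would be repeated for every region, so the resulting p-values would need a multiple-testing correction.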
Pathway-based identification of SNPs predictive of survival
In recent years, several association analysis methods for case-control studies have been developed. However, as we turn towards the identification of single nucleotide polymorphisms (SNPs) for prognosis, there is a need to develop methods for the identification of SNPs in high-dimensional data with survival outcomes. Traditional methods for the identification of SNPs have some drawbacks. First, the majority of the approaches for case-control studies are based on single SNPs. Second, SNPs that are identified without incorporating biological knowledge are more difficult to interpret. Random forests have been found to perform well in gene expression analysis with survival outcomes. In this paper we present the first pathway-based method to correlate SNPs with survival outcomes using a machine learning algorithm. We illustrate the application of pathway-based analysis of SNPs predictive of survival with a data set of 192 multiple myeloma patients genotyped for 500 000 SNPs. We also present simulation studies that show that the random forests technique with log-rank score split criterion outperforms several other machine learning algorithms. Thus, pathway-based survival analysis using machine learning tools represents a promising approach for the identification of biologically meaningful SNPs associated with disease.
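To make the idea of a survival-based split criterion concrete, the sketch below scores candidate splits of a single SNP with the standard two-sample log-rank statistic. This is a related but simpler criterion than the log-rank score rule named in the abstract, and it is not the authors' implementation. The function names `logrank_statistic` and `best_snp_split`, the 0/1/2 genotype coding, and the example data are all assumptions for illustration.

```python
def logrank_statistic(times, events, groups):
    """Chi-square (1 d.f.) log-rank statistic comparing group 1 with group 0.

    times  : follow-up times
    events : 1 if the event (e.g. death) was observed, 0 if censored
    groups : 0 or 1 for each subject
    """
    data = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [(u, e, g) for u, e, g in data if u >= t]
        n = len(at_risk)
        n1 = sum(g for _, _, g in at_risk)
        d = sum(1 for u, e, _ in at_risk if e and u == t)
        d1 = sum(1 for u, e, g in at_risk if e and u == t and g == 1)
        # Observed minus expected events in group 1 at this time point.
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the risk set.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_snp_split(times, events, genotypes):
    """Return (threshold, statistic) for the best split 'genotype > threshold'.

    Assumes genotypes are coded as minor-allele counts 0, 1 or 2.
    """
    best = (None, 0.0)
    for threshold in (0, 1):
        groups = [1 if g > threshold else 0 for g in genotypes]
        stat = logrank_statistic(times, events, groups)
        if stat > best[1]:
            best = (threshold, stat)
    return best


# Hypothetical toy data (not from the multiple myeloma study).
times = [5, 8, 12, 20, 24, 30]
events = [1, 1, 1, 0, 1, 0]
genotypes = [2, 2, 1, 1, 0, 0]
print(best_snp_split(times, events, genotypes))
```

In a random survival forest, a search like this would run at each tree node over a random subset of SNPs, and a pathway-based analysis would restrict those candidates to the SNPs mapped to one pathway.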