73 research outputs found

    Intrinsic bias in breast cancer gene expression data sets

    Background: While global breast cancer gene expression data sets have considerable commonality in terms of their data content, the populations that they represent and the data collection methods utilized can be quite disparate. We sought to assess the extent and consequence of these systematic differences with respect to identifying clinically significant prognostic groups. Methods: We ascertained how effectively unsupervised clustering employing randomly generated sets of genes could segregate tumors into prognostic groups using four well-characterized breast cancer data sets. Results: Using a common set of 5,000 randomly generated lists (70 genes/list), the percentages of clusters with significant differences in metastasis latencies (HR p-value < 0.01) were 62%, 15%, 21% and 0% in the NKI2 (Netherlands Cancer Institute), Wang, TRANSBIG and KJX64/KJ125 data sets, respectively. Among ER positive tumors, the percentages were 38%, 11%, 4% and 0%, respectively. Few random lists were predictive among ER negative tumors in any data set. Clustering was associated with ER status and, after globally adjusting for the effects of ER-α gene expression, the percentages were 25%, 33%, 1% and 0%, respectively. The impact of adjusting for ER status depended on the extent of confounding between ER-α gene expression and markers of proliferation. Conclusion: It is highly probable that a statistically significant association will be found between a given gene list and prognosis in the NKI2 data set, owing to its large sample size and the interrelationship between ER-α expression and markers of proliferation. In most respects, the TRANSBIG data set generated outcomes similar to those of the NKI2 data set, although its smaller sample size led to fewer statistically significant results.
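    The random-list procedure described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline. All function names are hypothetical, the data are synthetic, a median split on mean expression stands in for unsupervised clustering, and a permutation test on mean latency stands in for the hazard-ratio p-value.

```python
import random
import statistics


def synthetic_cohort(n_samples, n_genes, confounded, seed):
    """Hypothetical toy data: a latent factor drives latency and, if
    `confounded` is True, also shifts every gene's expression."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_samples)]
    expr = [[(f if confounded else 0.0) + rng.gauss(0, 1)
             for _ in range(n_genes)] for f in factor]
    latency = [10.0 - 3.0 * f + rng.gauss(0, 1) for f in factor]
    return expr, latency


def random_gene_lists(n_genes, n_lists, list_size, rng):
    """Draw random gene-index lists, each without repeated genes."""
    return [rng.sample(range(n_genes), list_size) for _ in range(n_lists)]


def two_group_split(expr, genes):
    """Assumed stand-in for clustering: split samples at the median of
    their mean expression over the chosen genes."""
    scores = [statistics.fmean([row[g] for g in genes]) for row in expr]
    cut = statistics.median(scores)
    return [s > cut for s in scores]


def permutation_p(values, labels, rng, n_perm=200):
    """Two-sided permutation p-value for the difference in group means."""
    def mean_gap(lab):
        a = [v for v, flag in zip(values, lab) if flag]
        b = [v for v, flag in zip(values, lab) if not flag]
        return abs(statistics.fmean(a) - statistics.fmean(b))

    observed = mean_gap(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_gap(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def fraction_significant(expr, latency, n_lists=20, list_size=70,
                         alpha=0.01, seed=0):
    """Fraction of random gene lists whose split separates latency."""
    rng = random.Random(seed)
    lists = random_gene_lists(len(expr[0]), n_lists, list_size, rng)
    significant = 0
    for genes in lists:
        labels = two_group_split(expr, genes)
        if permutation_p(latency, labels, rng) < alpha:
            significant += 1
    return significant / n_lists
```

    Calling `fraction_significant` on a cohort built with `confounded=True` and on one built with `confounded=False` gives a rough feel for how a shared latent factor, such as proliferation, can make arbitrary gene lists look prognostic.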

    International Agency for Research on Cancer Workshop on 'Expression array analyses in breast cancer taxonomy'

    In May 2006, a workshop on 'Expression array analyses in breast cancer taxonomy' was held at the International Agency for Research on Cancer (IARC). The workshop covered an array of topics, from the validity of the currently defined breast tumor subtypes and other expression profile-based signatures to the technical limitations of expression analysis and the types of platforms on which these omics results will eventually reach clinical practice. Overall, the workshop participants believed firmly that tumor taxonomy is likely to yield improved prognostic and predictive markers. Even so, further standardization and validation are required before clinical trials are set in motion.

    Open Data for Global Science

    The global science system stands at a critical juncture. On the one hand, it is overwhelmed by a hidden avalanche of ephemeral bits that are central components of modern research and of the emerging ‘cyberinfrastructure’ for e-Science. The rational management and exploitation of this cascade of digital assets offers boundless opportunities for research and applications. On the other hand, the ability to access and use this rising flood of data seems to lag behind, despite the rapidly growing capabilities of information and communication technologies (ICTs) to make much more effective use of those data. As long as the attention that researchers, their organisations and their funders pay to data policies and data management does not catch up with the rapidly changing research environment, research policy and funding entities will in many cases perpetuate the systemic inefficiencies and the resulting loss or underutilisation of valuable data resources derived from public investments. There is thus an urgent need for rationalised national strategies and more coherent international arrangements for sustainable access to public research data, both to data produced directly by government entities and to data generated in academic and not-for-profit institutions with public funding. In this chapter, we examine some of the implications of ‘data driven’ research and possible ways to overcome existing barriers to accessibility of public research data. Our perspective is framed in the context of the predominantly publicly funded global science system. We begin by reviewing the growing role of digital data in research and outlining the roles of stakeholders in the research community in developing data access regimes. We then discuss the hidden costs of closed data systems, the benefits and limitations of openness as the default principle for data access, and the emerging open access models that are beginning to form digitally networked commons.
We conclude by examining the rationale and requirements for developing overarching international principles from the top down, as well as flexible, common-use contractual templates from the bottom up, to establish data access regimes founded on a presumption of openness, with the goal of better capturing the benefits from existing and future scientific data assets. The ‘Principles and Guidelines for Access to Research Data from Public Funding’ from the Organisation for Economic Cooperation and Development (OECD), reported on in another article by Pilat and Fukasaku, are the most important recent example of the high-level (inter)governmental approach. The common-use licenses promoted by the Science Commons are a leading example of flexible arrangements originating within the community. Finally, we should emphasise that we focus almost exclusively on policy (the institutional, socioeconomic, and legal aspects of data access) rather than on the technical and management practicalities, which are also important but beyond the scope of this article.

    Breast tumors from CHEK2 1100delC-mutation carriers: genomic landscape and clinical implications

    Introduction: Checkpoint kinase 2 (CHEK2) is a moderate penetrance breast cancer risk gene, whose truncating mutation 1100delC increases the risk about twofold. We investigated gene copy-number aberrations and gene-expression profiles that are typical for breast tumors of CHEK2 1100delC-mutation carriers. Methods: In total, 126 breast tumor tissue specimens, including 32 samples from patients carrying CHEK2 1100delC, were studied in array-comparative genomic hybridization (aCGH) and gene-expression (GEX) experiments. After dimensionality reduction with the CGHregions R package, CHEK2 1100delC-associated regions in the aCGH data were detected by the Wilcoxon rank-sum test. A linear model was fitted to the GEX data with the R package limma. Genes whose expression levels were associated with the CHEK2 1100delC mutation were detected by a Bayesian method. Results: We discovered four lost and three gained CHEK2 1100delC-related loci. These include losses of 1p13.3-31.3, 8p21.1-2, 8p23.1-2, and 17p12-13.1 as well as gains of 12q13.11-3, 16p13.3, and 19p13.3. Twenty-eight genes located in these regions showed differential expression between CHEK2 1100delC and other tumors, nominating them as candidates for CHEK2 1100delC-associated tumor-progression drivers. These included CLCA1 on 1p22 as well as CALCOCO1, SBEM, and LRP1 on 12q13. Altogether, 188 genes were differentially expressed between CHEK2 1100delC and other tumors. Of these, 144 had elevated and 44 had reduced expression levels. Our results suggest the WNT pathway as a driver of tumorigenesis in breast tumors of CHEK2 1100delC-mutation carriers and a role for the olfactory receptor protein family in cancer progression. Differences in the expression of the 188 CHEK2 1100delC-associated genes divided breast tumor samples from three independent datasets into two groups that differed in their relapse-free survival time.
Conclusions: We have shown that copy-number aberrations of certain genomic regions are associated with the CHEK2 mutation 1100delC. In these regions, we identified potential drivers of CHEK2 1100delC-associated tumorigenesis, whose role in cancer progression is worth investigating. Furthermore, poorer survival related to the CHEK2 1100delC gene-expression signature highlights pathways that are likely to have a role in the development of metastatic disease in carriers of the CHEK2 1100delC mutation.

    The “conscious pilot”—dendritic synchrony moves through the brain to mediate consciousness

    Cognitive brain functions including sensory processing and control of behavior are understood as “neurocomputation” in axonal–dendritic synaptic networks of “integrate-and-fire” neurons. Cognitive neurocomputation with consciousness is accompanied by 30- to 90-Hz gamma synchrony electroencephalography (EEG), and non-conscious neurocomputation is not. Gamma synchrony EEG derives largely from neuronal groups linked by dendritic–dendritic gap junctions, forming transient syncytia (“dendritic webs”) in input/integration layers oriented sideways to axonal–dendritic neurocomputational flow. As gap junctions open and close, a gamma-synchronized dendritic web can rapidly change topology and move through the brain as a spatiotemporal envelope performing collective integration and volitional choices correlating with consciousness. The “conscious pilot” is a metaphorical description of a mobile gamma-synchronized dendritic web as a vehicle for a conscious agent/pilot which experiences and assumes control of otherwise non-conscious auto-pilot neurocomputation.

    Pan-cancer analysis of whole genomes

    Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale (1-3). Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter (4); identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation (5,6); analyses timings and patterns of tumour evolution (7); describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity (8,9); and evaluates a range of more-specialized features of cancer genomes (8,10-18). Peer reviewed.