19 research outputs found

    Beyond attention: deriving biologically interpretable insights from weakly-supervised multiple-instance learning models

    Recent advances in attention-based multiple instance learning (MIL) have improved our insights into the tissue regions that models rely on to make predictions in digital pathology. However, the interpretability of these approaches is still limited. In particular, they do not report whether high-attention regions are positively or negatively associated with the class labels or how well these regions correspond to previously established clinical and biological knowledge. We address this by introducing a post-training methodology to analyse MIL models. Firstly, we introduce prediction-attention-weighted (PAW) maps by combining tile-level attention and prediction scores produced by a refined encoder, allowing us to quantify the predictive contribution of high-attention regions. Secondly, we introduce a biological feature instantiation technique by integrating PAW maps with nuclei segmentation masks. This further improves interpretability by providing biologically meaningful features related to the cellular organisation of the tissue and facilitates comparisons with known clinical features. We illustrate the utility of our approach by comparing PAW maps obtained for prostate cancer diagnosis (i.e. samples containing malignant tissue, 381/516 tissue samples) and prognosis (i.e. samples from patients with biochemical recurrence following surgery, 98/663 tissue samples) in a cohort of patients from the International Cancer Genome Consortium (ICGC UK Prostate Group). Our approach reveals that regions that are predictive of adverse prognosis do not tend to co-locate with the tumour regions, indicating that non-cancer cells should also be studied when evaluating prognosis.

    HES5 silencing is an early and recurrent change in prostate tumourigenesis.

    Prostate cancer is the most common cancer in men, resulting in over 10 000 deaths/year in the UK. Sequencing and copy number analysis of primary tumours has revealed heterogeneity within tumours and an absence of recurrent founder mutations, consistent with non-genetic disease initiating events. Using methylation profiling in a series of multi-focal prostate tumours, we identify promoter methylation of the transcription factor HES5 as an early event in prostate tumourigenesis. We confirm that this epigenetic alteration occurs in 86-97% of cases in two independent prostate cancer cohorts (n=49 and n=39 tumour-normal pairs). Treatment of prostate cancer cells with the demethylating agent 5-aza-2'-deoxycytidine increased HES5 expression and downregulated its transcriptional target HES6, consistent with functional silencing of the HES5 gene in prostate cancer. Finally, we identify and test a transcriptional module involving AR, ERG, HES1 and HES6 and propose a model for the impact of HES5 silencing on tumourigenesis as a starting point for future functional studies. The authors are grateful to study volunteers for their participation and staff at the Wellcome Trust Clinical Research Facility, Addenbrooke’s Clinical Research Centre, Cambridge. They also thank the NIHR Cambridge Biomedical Research Centre, the DOH HTA (ProtecT grant), and the NCRI/MRC (ProMPT grant) for help with the bio-repository, The University of Cambridge, Hutchison Whampoa Limited and Cancer Research UK for funding. They are grateful to the CRUK Cambridge Institute Genomics and Bioinformatics Core Facilities. Cross-validation of HES5 methylation includes the use of data generated by the TCGA Research Network. This is the final version of the article. It was originally published in Endocrine-Related Cancer, April 1, 2015, 22, 131-144, doi: 10.1530/ERC-14-0454

    Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells

    Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells. This work was supported by the Wellcome Trust. Y.S.J. is supported by a European Molecular Biology Organization long-term fellowship (LTF 1203_2012). J.M.C.T. is supported by Marie Curie Fellowship FP7 PEOPLE-2012-IEF (project number 328264). P.J.C. is a Wellcome Trust Senior Clinical Fellow. Support was provided to A.M.F. by the National Institute for Health Research (NIHR) UCLH Biomedical Research Centre. The ICGC Breast Cancer Consortium was supported by a grant from the European Union (BASIS) and the Wellcome Trust. The ICGC Prostate Cancer Consortium was funded by Cancer Research UK with a grant from the Dallaglio Foundation (grant number C5047/A14835). R.E. is supported by National Institute for Health Research support to the Biomedical Research Centre at The Institute of Cancer Research and Royal Marsden NHS Foundation Trust.
We also thank the National Cancer Research Prostate Cancer Mechanisms of Progression and Treatment (PROMPT) collaborative (grant code G0500966/75466) which has funded tissue and urine collections in Cambridge. The authors also acknowledge financial support from the Department of Health via the National Institute for Health Research comprehensive Biomedical Research Centre award to Guy’s and St. Thomas’ NHS Foundation Trust and Breakthrough Breast Cancer Research (ICGC 08/09 and KCL) (A.T.).

    Genetic Diversity in Wheat: Analysis using Diversity Arrays Technology (DArT) in bread and durum wheats

    With increasing demands on the quality and quantity of food required now and in the future, improvements to current agricultural practices are required. Increased food production requires utilisation of more agricultural land, pushing crops into non-traditional areas. Advances in agricultural technologies are required not only for current crop varieties, but also for new varieties with increased tolerance to environmental stresses. Technological improvement means better crop yields and reduced land, water, fertilizer and pesticide use. Diversity Arrays Technology (DArT) was used to study wheat diversity, specifically to identify polymorphic markers between various wheat cultivars for use in marker-assisted breeding programs. The hybridisation-based technology was used to analyse various bread and durum wheat cultivars for increased understanding of genomic diversity. Analysis shows that DArT is able to discriminate between tissue samples from wheat cultivars grown under various environmental stresses, with polymorphic markers identified between samples treated with differing salt, light and temperature conditions. Epigenetic diversity was analysed through methylation detection using DArT to identify a list of candidate polymorphic markers. Markers were identified using the methylation-sensitive restriction enzyme McrBC to generate control and treated targets. Diversity was also explored across cultivars: breeding experiments between cultivars with phenotypic extremes, examining salt tolerance versus intolerance using DArT, produced a recombinant inbred line genetic linkage map. Bulk segregant analysis was also used to group phenotypic samples. Candidate markers were identified between cultivars that can be used to genotype tetraploid and hexaploid wheat cultivars for germplasm identification.
In addition, with the identification of molecular markers linked to traits such as salt resistance, plant breeders can genotype individual plants and populations of cultivars to determine the cultivar that best complements its local environment. This eliminates the need for multiple planting cycles to optimize crop selections, and gives the plant breeder the highest possible chance for crop success (yield, quality, performance and cost).

    Genomic evolution shapes prostate cancer disease type

    J.H.R.F. was supported by a Cancer Research UK Programme Grant to Simon Tavaré (C14303/A17197), as, partially, was A.G.L. A.G.L. acknowledges the support of the University of St Andrews. A.G.L. and J.H.R.F. also acknowledge the support of the Cambridge Cancer Research Fund. The development of cancer is an evolutionary process involving the sequential acquisition of genetic alterations that disrupt normal biological processes, enabling tumor cells to rapidly proliferate and eventually invade and metastasize to other tissues. We investigated the genomic evolution of prostate cancer through the application of three separate classification methods, each designed to investigate a different aspect of tumor evolution. Integrating the results revealed the existence of two distinct types of prostate cancer that arise from divergent evolutionary trajectories, designated as the Canonical and Alternative evolutionary disease types. We therefore propose the evotype model for prostate cancer evolution wherein Alternative-evotype tumors diverge from those of the Canonical-evotype through the stochastic accumulation of genetic alterations associated with disruptions to androgen receptor DNA binding. Our model unifies many previous molecular observations, providing a powerful new framework to investigate prostate cancer disease progression.

    The architecture of clonal expansions in morphologically normal tissue from cancerous and non-cancerous prostates

    Background: Up to 80% of cases of prostate cancer present with multifocal independent tumour lesions, leading to the concept of a field effect present in the normal prostate predisposing to cancer development. In the present study we applied Whole Genome DNA Sequencing (WGS) to a group of morphologically normal tissue samples (n = 51), including benign prostatic hyperplasia (BPH) and non-BPH samples, from men with and men without prostate cancer. We assess whether the observed genetic changes in morphologically normal tissue are linked to the development of cancer in the prostate. Results: Single nucleotide variants (P = 7.0 × 10⁻³, Wilcoxon rank sum test) and small insertions and deletions (indels, P = 8.7 × 10⁻⁶) were significantly higher in morphologically normal samples, including BPH, from men with prostate cancer compared to those without. The presence of subclonal expansions under selective pressure, supported by a high level of mutations, was significantly associated with samples from men with prostate cancer (P = 0.035, Fisher exact test). The clonal cell fraction of normal clones was always higher than the proportion of the prostate estimated as epithelial (P = 5.94 × 10⁻⁵, paired Wilcoxon signed rank test) which, along with analysis of primary fibroblasts prepared from BPH specimens, suggests a stromal origin. Constructed phylogenies revealed lineages associated with benign tissue that were completely distinct from adjacent tumour clones, but a common lineage between BPH and non-BPH morphologically normal tissues was often observed. Compared to tumours, normal samples have significantly fewer single nucleotide variants (P = 3.72 × 10⁻⁹, paired Wilcoxon signed rank test), have very few rearrangements and a complete lack of copy number alterations.
Conclusions: Cells within regions of morphologically normal tissue (both BPH and non-BPH) can expand under selective pressure by mechanisms that are distinct from those occurring in adjacent cancer, but that are allied to the presence of cancer. Expansions, which are probably stromal in origin, are characterised by lack of recurrent driver mutations, by almost complete absence of structural variants/copy number alterations, and mutational processes similar to malignant tissue. Our findings have implications for treatment (focal therapy) and early detection approaches.

    Appraising the relevance of DNA copy number loss and gain in prostate cancer using whole genome DNA sequence data.

    A variety of models have been proposed to explain regions of recurrent somatic copy number alteration (SCNA) in human cancer. Our study employs Whole Genome DNA Sequence (WGS) data from tumor samples (n = 103) to comprehensively assess the role of the Knudson two hit genetic model in SCNA generation in prostate cancer. 64 recurrent regions of loss and gain were detected, of which 28 were novel, including regions of loss with more than 15% frequency at Chr4p15.2-p15.1 (15.53%), Chr6q27 (16.50%) and Chr18q12.3 (17.48%). Comprehensive mutation screens of genes, lincRNA encoding sequences, control regions and conserved domains within SCNAs demonstrated that a two-hit genetic model was supported in only a minor proportion of recurrent SCNA losses examined (15/40). We found that recurrent breakpoints and regions of inversion often occur within Knudson model SCNAs, leading to the identification of ZNF292 as a target gene for the deletion at 6q14.3-q15 and NKX3.1 as a two-hit target at 8p21.3-p21.2. The importance of alterations of lincRNA sequences was illustrated by the identification of a novel mutational hotspot at the KCCAT42, FENDRR, CAT1886 and STCAT2 loci at the 16q23.1-q24.3 loss. Our data confirm that the burden of SCNAs is predictive of biochemical recurrence, define nine individual regions that are associated with relapse, and highlight the possible importance of ion channel and G-protein coupled-receptor (GPCR) pathways in cancer development. We concluded that a two-hit genetic model accounts for about one third of SCNAs, indicating that other mechanisms, such as haploinsufficiency and epigenetic inactivation, account for the remaining SCNA losses. We acknowledge support from Cancer Research UK (C5047/A22530, C309/A11566, C368/A6743, A368/A7990, C14303/A17197) and the Dallaglio Foundation.
We also acknowledge support from the National Institute for Health Research (NIHR) (The Biomedical Research Centre at The Institute of Cancer Research & The Royal Marsden NHS Foundation Trust and the project "Prostate Cancer: Mechanisms of Progression and Treatment (PROMPT)" [G0500966/75466]). We thank the Wellcome Trust, Bob Champion Cancer Trust, The Orchid Cancer appeal, The RoseTrees Trust, The North West Cancer Research Fund, Big C, The King family, and The Masonic Charitable Foundation for funding. This research is supported by the Francis Crick Institute which receives its core funding from Cancer Research UK (FC001202), the UK Medical Research Council (FC001202), and the Wellcome Trust (FC001202). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.