45 research outputs found

    Mutational signature dynamics shaping the evolution of oesophageal adenocarcinoma

    A variety of mutational processes drive cancer development, but their dynamics across the entire disease spectrum from pre-cancerous to advanced neoplasia are poorly understood. We explore the mutagenic processes shaping oesophageal adenocarcinoma tumorigenesis in 997 instances comprising distinct stages of this malignancy, from Barrett Oesophagus to primary tumours and advanced metastatic disease. The mutational landscape is dominated by the C[T > C/G]T substitution-enriched signatures SBS17a/b, which are linked with TP53 mutations, increased proliferation, genomic instability and disease progression. The APOBEC mutagenesis signature is a weak but persistent signal amplified in primary tumours. We also identify prevalent alterations in DNA damage repair pathways, with homologous recombination, base and nucleotide excision repair and translesion synthesis mutated in up to 50% of the cohort, and surprisingly uncoupled from transcriptional activity. Among these, the presence of base excision repair deficiencies shows remarkably poor prognosis in the cohort. In this work, we provide insights into the mutational aetiology and changes enabling the transition from pre-neoplastic to advanced oesophageal adenocarcinoma.
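
The C[T > C/G]T notation in this abstract follows the common convention of describing a single-base substitution by its pyrimidine reference base and its two flanking bases. The sketch below is illustrative only and is not taken from the paper: the function name `sbs_channel` and its input format (a three-base reference context plus the alternate base) are assumptions made for this example.

```python
# Illustrative sketch: label a single-base substitution in the
# pyrimidine-normalised trinucleotide notation, e.g. "C[T>G]T".
# The function name and input format are assumptions for this example.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs_channel(context, alt):
    """Return the substitution label for a 3-base reference context.

    context: reference trinucleotide with the mutated base in the middle.
    alt: alternate base observed at the middle position.
    """
    context = context.upper()
    alt = alt.upper()
    if len(context) != 3 or any(base not in COMPLEMENT for base in context):
        raise ValueError("context must be three bases from A, C, G, T")
    if alt not in COMPLEMENT:
        raise ValueError("alt must be one of A, C, G, T")
    if context[1] == alt:
        raise ValueError("alt must differ from the reference base")

    # Purine references (A or G) are reported on the opposite strand,
    # so that every label has C or T as its reference base.
    if context[1] in "AG":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]

    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


if __name__ == "__main__":
    print(sbs_channel("CTT", "G"))
    print(sbs_channel("AAG", "C"))
```

Both calls in the usage lines describe the same event seen from opposite strands, so they produce the same label.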

    Rearrangement processes and structural variations show evidence of selection in oesophageal adenocarcinomas

    Oesophageal adenocarcinoma (OAC) provides an ideal case study to characterize large-scale rearrangements. Using whole genome short-read sequencing of 383 cases, of which 214 had matched whole transcriptomes, we observed structural variations (SV) with a predominance of deletions, tandem duplications and inter-chromosome junctions that could be identified as LINE-1 mobile element (ME) insertions. Complex clusters of rearrangements resembling breakage-fusion-bridge cycles or extrachromosomal circular DNA accounted for 22% of complex SVs affecting known oncogenes. Counting SV events affecting known driver genes substantially increased the recurrence rates of these drivers. After excluding fragile sites, we identified 51 candidate new drivers in genomic regions disrupted by SVs, including ETV5, KAT6B and CLTC. RUNX1 was the most recurrently altered gene (24%), with many deletions inactivating the RUNT domain but preserving the reading frame, suggesting an altered protein product. These findings underscore the importance of identification of SV events in OAC with implications for targeted therapies.

    SAMQA: error classification and validation of high-throughput sequenced read data

    Background: The advances in high-throughput sequencing technologies and growth in data sizes have highlighted the need for scalable tools to perform quality assurance testing. These tests are necessary to ensure that data is of a minimum necessary standard for use in downstream analysis. In this paper we present the SAMQA tool to rapidly and robustly identify errors in population-scale sequence data. Results: SAMQA has been used on samples from three separate sets of cancer genome data from The Cancer Genome Atlas (TCGA) project. Using technical standards provided by the SAM specification and biological standards defined by researchers, we have classified errors in these sequence data sets relative to individual reads within a sample. Due to an observed linearithmic speedup through the use of a high-performance computing (HPC) framework for the majority of tasks, poor quality data was identified prior to secondary analysis in significantly less time on the HPC framework than the same data run using alternative parallelization strategies on a single server. Conclusions: The SAMQA toolset validates a minimum set of data quality standards across whole-genome and exome sequences. It is tuned to run on a high-performance computational framework, enabling QA across hundreds of gigabytes of samples regardless of coverage or sample type.
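
To make "technical standards provided by the SAM specification" concrete, the sketch below checks a single alignment line against a few of the specification's rules for mandatory fields. It is not SAMQA's implementation, which the abstract does not describe. The function name `check_sam_record` and the choice of checks are assumptions for this example, and the checks cover only a small part of the specification.

```python
# Illustrative sketch only: a handful of per-read checks based on the
# mandatory fields of the SAM format. Not the SAMQA implementation.
import re

CIGAR_RE = re.compile(r"\*|(?:[0-9]+[MIDNSHPX=])+")
CIGAR_OP_RE = re.compile(r"([0-9]+)([MIDNSHPX=])")
QUERY_CONSUMING_OPS = set("MIS=X")  # operations that consume read bases

# (field name, column index, minimum, maximum) for integer fields
INT_FIELDS = (
    ("FLAG", 1, 0, 2**16 - 1),
    ("POS", 3, 0, 2**31 - 1),
    ("MAPQ", 4, 0, 2**8 - 1),
)


def check_sam_record(line):
    """Return a list of error strings for one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return [f"expected at least 11 fields, found {len(fields)}"]

    errors = []
    for name, index, low, high in INT_FIELDS:
        value = fields[index]
        if not re.fullmatch(r"[0-9]+", value) or not low <= int(value) <= high:
            errors.append(f"{name} is not an integer in [{low}, {high}]: {value!r}")

    cigar, seq, qual = fields[5], fields[9], fields[10]
    if not CIGAR_RE.fullmatch(cigar):
        errors.append(f"malformed CIGAR: {cigar!r}")
    elif cigar != "*" and seq != "*":
        # The read length implied by the CIGAR must match the sequence.
        query_length = sum(
            int(count)
            for count, op in CIGAR_OP_RE.findall(cigar)
            if op in QUERY_CONSUMING_OPS
        )
        if query_length != len(seq):
            errors.append(
                f"CIGAR implies {query_length} bases but SEQ has {len(seq)}"
            )

    if qual != "*" and seq != "*" and len(qual) != len(seq):
        errors.append(f"QUAL length {len(qual)} differs from SEQ length {len(seq)}")

    return errors


if __name__ == "__main__":
    record = "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
    print(check_sam_record(record))
```

Because each record is checked independently, a check of this kind can be spread across many workers, which is the sort of per-read classification the abstract describes running on an HPC framework.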

    Identification of prognostic phenotypes of esophageal adenocarcinoma in two independent cohorts.

    BACKGROUND & AIMS: Most patients with esophageal adenocarcinoma (EAC) present de novo. Although this may be due to inadequate screening strategies, the precise reason for this observation is not clear. We compared survival of patients with prevalent EAC with and without synchronous BE/intestinal metaplasia of the esophagus (IM) at the time of EAC diagnosis. METHODS: Clinical data were studied using Cox Proportional Hazards regression to evaluate the effect of synchronous BE/IM on EAC survival independent of age, sex, TNM stage and tumor location. Two cohorts from the Mayo Clinic and a U.K. multicenter prospective cohort were included. RESULTS: The Mayo cohort had 411 EAC patients with 49.3% with BE/IM demonstrating a survival benefit as compared to those without (hazard ratio (HR), 0.44; 95% CI: 0.34-0.57, P<0.001). In a multivariable analysis BE/IM was associated with better survival independent of age, sex, stage and tumor location and length (adjusted HR: 0.66, 95% CI: 0.5-0.88, P=0.005). The UK cohort contained 1417 patients, 45% with BE/IM demonstrating a survival benefit as compared with non-BE/IM patients (HR 0.59, 95% CI: 0.5-0.69, P<0.001) with continued significance in multivariable analysis that included age, sex, stage, and tumor location (adjusted HR 0.77, 95% CI: 0.64-0.93, P=0.006). CONCLUSION: Two types of esophageal adenocarcinoma can be characterized based on the presence or absence of Barrett's epithelium. These findings have implications for understanding the etiology of EAC and determining prognosis as well as for development of optimal clinical strategies to identify patients at risk.
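
The hazard ratios above come with 95% confidence intervals, and the two can be related on the log scale. The sketch below assumes the intervals are Wald-type (log HR plus or minus about 1.96 standard errors), which is the usual output of Cox regression software but is not stated in the abstract. Under that assumption it recovers the approximate standard error, z statistic and two-sided P value from a published HR and interval. The function name `wald_summary` is hypothetical, and because the inputs are rounded, the results are approximate.

```python
# Illustrative sketch: back-calculate the standard error, z statistic and
# two-sided P value from a reported hazard ratio and 95% CI, assuming a
# Wald-type interval on the log scale. Inputs are rounded published values.
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def wald_summary(hr, ci_low, ci_high):
    """Return log HR, its standard error, z and two-sided P value."""
    if not 0 < ci_low < ci_high or hr <= 0:
        raise ValueError("hr and CI bounds must be positive, with ci_low < ci_high")
    log_hr = math.log(hr)
    # The interval spans 2 * Z_95 standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = log_hr / se
    # Two-sided P value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return {"log_hr": log_hr, "se": se, "z": z, "p": p}


if __name__ == "__main__":
    # Unadjusted Mayo cohort estimate quoted in the abstract.
    print(wald_summary(0.44, 0.34, 0.57))
```

A quick consistency check on any reported interval is that the geometric mean of its bounds should sit close to the HR itself, since the interval is symmetric on the log scale.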

    Adaptable data management for systems biology investigations

    Background: Within research each experiment is different, the focus changes and the data is generated from a continually evolving barrage of technologies. There is a continual introduction of new techniques whose usage ranges from in-house protocols through to high-throughput instrumentation. To support these requirements data management systems are needed that can be rapidly built and readily adapted for new usage. Results: The adaptable data management system discussed is designed to support the seamless mining and analysis of biological experiment data that is commonly used in systems biology (e.g. ChIP-chip, gene expression, proteomics, imaging, flow cytometry). We use different content graphs to represent different views upon the data. These views are designed for different roles: equipment specific views are used to gather instrumentation information; data processing oriented views are provided to enable the rapid development of analysis applications; and research project specific views are used to organize information for individual research experiments. This management system allows for both the rapid introduction of new types of information and the evolution of the knowledge it represents. Conclusion: Data management is an important aspect of any research enterprise. It is the foundation on which most applications are built, and must be easily extended to serve new functionality for new scientific areas. We have found that adopting a three-tier architecture for data management, built around distributed standardized content repositories, allows us to rapidly develop new applications to support a diverse user community.

    Systems biology driven software design for the research enterprise

    Background: In systems biology, and many other areas of research, there is a need for the interoperability of tools and data sources that were not originally designed to be integrated. Due to the interdisciplinary nature of systems biology, and its association with high throughput experimental platforms, there is an additional need to continually integrate new technologies. As scientists work in isolated groups, integration with other groups is rarely a consideration when building the required software tools. Results: We illustrate an approach, through the discussion of a purpose-built software architecture, which allows disparate groups to reuse tools and access data sources in a common manner. The architecture allows for: the rapid development of distributed applications; interoperability, so it can be used by a wide variety of developers and computational biologists; development using standard tools, so that it is easy to maintain and does not require a large development effort; extensibility, so that new technologies and data types can be incorporated; and non-intrusive development, insofar as researchers need not adhere to a pre-existing object model. Conclusion: By using a relatively simple integration strategy, based upon a common identity system and dynamically discovered interoperable services, a light-weight software architecture can become the focal point through which scientists can both get access to and analyse the plethora of experimentally derived data.

    SEQADAPT: an adaptable system for the tracking, storage and analysis of high throughput sequencing experiments

    Background: High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires. Results: Existing software solutions provide static and well-established algorithms in a restrictive package. However, as high throughput sequencing is a rapidly evolving field, such static approaches lack the ability to readily adopt the latest advances and techniques which are often required by researchers. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code. Conclusion: The system is based around the Addama service architecture and is available at our website as a demonstration web application, an installable single download and as a collection of individual customizable services.