3 research outputs found
Pan-cancer analysis of whole genomes
Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale(1-3). Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter(4); identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation(5,6); analyses timings and patterns of tumour evolution(7); describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity(8,9); and evaluates a range of more-specialized features of cancer genomes(8,10-18).Peer reviewe
Recommended from our members
Unsupervised anomaly detection in daily wan traffic patterns
Growth in large-scale experiments using high capacity reliable networking as part of their design is creating a need for better monitoring and analysis of observed traffic. Network providers need intelligent solutions that can help quickly identify and understand anomalous behaviors at the network edge, allowing reactions to unexpected traffic or attacks on facilities and their peerings. However, due to lack of labeled data in network traffic analysis and user diversity, we introduce novel methods that process very large network datasets quickly for outlier identification. In this paper, we leverage artificial intelligence (AI), network research, and edge computing to collect and train unsupervised classification algorithms using streaming data pipelines from multiple months of network flow records. Once trained, individual classifiers quickly observe and flag alerts in hourly behaviors. Our work describes building the data pipeline as well as addressing issues of false positives and workflow integration
Recommended from our members
Machine learning-based analysis of COVID-19 pandemic impact on US research networks
This study explores how fallout from the changing public health policy around COVID-19 has changed how researchers access and process their science experiments. Using a combination of techniques from statistical analysis and machine learning, we conduct a retrospective analysis of historical network data for a period around the stay-At-home orders that took place in March 2020. Our analysis takes data from the entire ESnet infrastructure to explore DOE high-performance computing (HPC) resources at OLCF, ALCF, and NERSC, as well as User sites such as PNNL and JLAB. We look at detecting and quantifying changes in site activity using a combination of t-Distributed Stochastic Neighbor Embedding (t-SNE) and decision tree analysis. Our findings bring insights into the working patterns and impact on data volume movements, particularly during late-night hours and weekends