Dual concepts of almost distance-regularity and the spectral excess theorem
Generally speaking, 'almost distance-regular' graphs share some, but not
necessarily all, of the regularity properties that characterize
distance-regular graphs. In this paper we propose two new dual concepts of
almost distance-regularity, thus giving a better understanding of the
properties of distance-regular graphs. More precisely, we characterize
m-partially distance-regular graphs and j-punctually eigenspace
distance-regular graphs by using their spectra. Our results can also be seen as
a generalization of the so-called spectral excess theorem for distance-regular
graphs, and they lead to a dual version of it
On almost distance-regular graphs
Distance-regular graphs are a key concept in Algebraic Combinatorics and have
given rise to several generalizations, such as association schemes. Motivated
by spectral and other algebraic characterizations of distance-regular graphs,
we study 'almost distance-regular graphs'. We use this name informally for
graphs that share some regularity properties that are related to distance in
the graph. For example, a known characterization of a distance-regular graph is
the invariance of the number of walks of given length between vertices at a
given distance, while a graph is called walk-regular if the number of closed
walks of given length rooted at any given vertex is a constant. One of the
concepts studied here is a generalization of both distance-regularity and
walk-regularity called m-walk-regularity. Another studied concept is that of
m-partial distance-regularity or, informally, distance-regularity up to
distance m. Using eigenvalues of graphs and the predistance polynomials, we
discuss and relate these and other concepts of almost distance-regularity, such
as their common generalization of (ℓ,m)-walk-regularity. We introduce the
concepts of punctual distance-regularity and punctual walk-regularity as a
foundation upon which almost distance-regular graphs are built. We provide
examples that are mostly taken from the Foster census, a collection of
symmetric cubic graphs. Two problems are posed that are related to the question
of when almost distance-regular becomes whole distance-regular. We also give
several characterizations of punctually distance-regular graphs that are
generalizations of the spectral excess theorem
An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics
For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types
GLOWORM-PARA: a flexible framework to simulate the population dynamics of the parasitic phase of gastrointestinal nematodes infecting grazing livestock
Gastrointestinal (GI) nematodes are a significant threat to the economic and environmental sustainability of keeping livestock, as adequate control becomes increasingly difficult due to the development of anthelmintic resistance (AR) in some systems and climate-driven changes to infection dynamics. To mitigate any negative impacts of climate on GI nematode epidemiology and slow AR development, there is a need to develop effective, targeted control strategies that minimise the unnecessary use of anthelmintic drugs and incorporate alternative strategies such as vaccination and evasive grazing. However, the impacts climate and GI nematode epidemiology may have on the optimal control strategy are generally not considered, due to lack of available evidence to drive recommendations. Parasite transmission models can support control strategy evaluation to target field trials, thus reducing the resources and lead-time required to develop evidence-based control recommendations incorporating climate stochasticity. GI nematode population dynamics arising from natural infections have been difficult to replicate and model applications have often focussed on the free-living stages. A flexible framework is presented for the parasitic phase of GI nematodes, GLOWORM-PARA, which complements an existing model of the free-living stages, GLOWORM-FL. Longitudinal parasitological data for two species that are of major economic importance in cattle, Ostertagia ostertagi and Cooperia oncophora, were obtained from seven cattle farms in Belgium for model validation. The framework replicated the observed seasonal dynamics of infection in cattle on these farms and overall, there was no evidence of systematic under- or over-prediction of faecal egg counts (FECs). However, the model under-predicted the FECs observed on one farm with very young calves, highlighting potential areas of uncertainty that may need further investigation if the model is to be applied to young livestock. 
The model could be used to drive further research into alternative parasite control strategies such as vaccine development and novel treatment approaches, and to understand GI nematode epidemiology under changing climate and host management
Driver Fusions and Their Implications in the Development and Treatment of Human Cancers
Gene fusions represent an important class of somatic alterations in cancer. We systematically investigated fusions in 9,624 tumors across 33 cancer types using multiple fusion calling tools. We identified a total of 25,664 fusions, with a 63% validation rate. Integration of gene expression, copy number, and fusion annotation data revealed that fusions involving oncogenes tend to exhibit increased expression, whereas fusions involving tumor suppressors have the opposite effect. For fusions involving kinases, we found 1,275 with an intact kinase domain, the proportion of which varied significantly across cancer types. Our study suggests that fusions drive the development of 16.5% of cancer cases and function as the sole driver in more than 1% of them. Finally, we identified druggable fusions involving genes such as TMPRSS2, RET, FGFR3, ALK, and ESR1 in 6.0% of cases, and we predicted immunogenic peptides, suggesting that fusions may provide leads for targeted drug and immune therapy
Machine Learning Detects Pan-cancer Ras Pathway Activation in The Cancer Genome Atlas
Precision oncology uses genomic evidence to match patients with treatment but often fails to identify all patients who may respond. The transcriptome of these “hidden responders” may reveal responsive molecular states. We describe and evaluate a machine-learning approach to classify aberrant pathway activity in tumors, which may aid in hidden responder identification. The algorithm integrates RNA-seq, copy number, and mutations from 33 different cancer types across The Cancer Genome Atlas (TCGA) PanCanAtlas project to predict aberrant molecular states in tumors. Applied to the Ras pathway, the method detects Ras activation across cancer types and identifies phenocopying variants. The model, trained on human tumors, can predict response to MEK inhibitors in wild-type Ras cell lines. We also present data that suggest that multiple hits in the Ras pathway confer increased Ras activity. The transcriptome is underused in precision oncology and, combined with machine learning, can aid in the identification of hidden responders. Way et al. develop a machine-learning approach using PanCanAtlas data to detect Ras activation in cancer. Integrating mutation, copy number, and expression data, the authors show that their method detects Ras-activating variants in tumors and sensitivity to MEK inhibitors in cell lines
The Cancer Genome Atlas Comprehensive Molecular Characterization of Renal Cell Carcinoma
Renal cell carcinoma (RCC) is not a single disease, but several histologically defined cancers with different genetic drivers, clinical courses, and therapeutic responses. The current study evaluated 843 RCC from the three major histologic subtypes, including 488 clear cell RCC, 274 papillary RCC, and 81 chromophobe RCC. Comprehensive genomic and phenotypic analysis of the RCC subtypes reveals distinctive features of each subtype that provide the foundation for the development of subtype-specific therapeutic and management strategies for patients affected with these cancers. Somatic alteration of BAP1, PBRM1, and PTEN and altered metabolic pathways correlated with subtype-specific decreased survival, while CDKN2A alteration, increased DNA hypermethylation, and increases in the immune-related Th2 gene expression signature correlated with decreased survival within all major histologic subtypes. CIMP-RCC demonstrated an increased immune signature, and a uniform and distinct metabolic expression pattern identified a subset of metabolically divergent (MD) ChRCC that associated with extremely poor survival
lncRNA Epigenetic Landscape Analysis Identifies EPIC1 as an Oncogenic lncRNA that Interacts with MYC and Promotes Cell-Cycle Progression in Cancer
We characterized the epigenetic landscape of genes encoding long noncoding RNAs (lncRNAs) across 6,475 tumors and 455 cancer cell lines. In stark contrast to the CpG island hypermethylation phenotype in cancer, we observed a recurrent hypomethylation of 1,006 lncRNA genes in cancer, including EPIC1 (epigenetically-induced lncRNA1). Overexpression of EPIC1 is associated with poor prognosis in luminal B breast cancer patients and enhances tumor growth in vitro and in vivo. Mechanistically, EPIC1 promotes cell-cycle progression by interacting with MYC through EPIC1's 129–283 nt region. EPIC1 knockdown reduces the occupancy of MYC to its target genes (e.g., CDKN1A, CCNA2, CDC20, and CDC45). MYC depletion abolishes EPIC1's regulation of MYC target and luminal breast cancer tumorigenesis in vitro and in vivo. Wang et al. characterize the epigenetic landscape of lncRNA genes across a large number of human tumors and cancer cell lines and observe recurrent hypomethylation of lncRNA genes, including EPIC1. EPIC1 RNA promotes cell-cycle progression by interacting with MYC and enhancing its binding to target genes
Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines
The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. Here we describe the Multi-Center Mutation Calling in Multiple Cancers project, our effort to generate a comprehensive encyclopedia of somatic mutation calls for the TCGA data to enable robust cross-tumor-type analyses. Our approach accounts for variance and batch effects introduced by the rapid advancement of DNA extraction, hybridization-capture, sequencing, and analysis methods over time. We present best practices for applying an ensemble of seven mutation-calling algorithms with scoring and artifact filtering. The dataset created by this analysis includes 3.5 million somatic variants and forms the basis for PanCan Atlas papers. The results have been made available to the research community along with the methods used to generate them. This project is the result of collaboration from a number of institutes and demonstrates how team science drives extremely large genomics projects
Somatic Mutational Landscape of Splicing Factor Genes and Their Functional Consequences across 33 Cancer Types
Hotspot mutations in splicing factor genes have been recently reported at high frequency in hematological malignancies, suggesting the importance of RNA splicing in cancer. We analyzed whole-exome sequencing data across 33 tumor types in The Cancer Genome Atlas (TCGA), and we identified 119 splicing factor genes with significant non-silent mutation patterns, including mutation over-representation, recurrent loss of function (tumor suppressor-like), or hotspot mutation profile (oncogene-like). Furthermore, RNA sequencing analysis revealed altered splicing events associated with selected splicing factor mutations. In addition, we were able to identify common gene pathway profiles associated with the presence of these mutations. Our analysis suggests that somatic alteration of genes involved in the RNA-splicing process is common in cancer and may represent an underappreciated hallmark of tumorigenesis