32 research outputs found
Data-driven approaches used for compound library design, hit triage and bioactivity modeling in high-throughput screening
High-throughput screening (HTS) campaigns are routinely performed in pharmaceutical companies to explore activity profiles of chemical libraries for the identification of promising candidates for further investigation. With the aim of improving hit rates in these campaigns, data-driven approaches have been used to design relevant compound screening collections, enable effective hit triage and perform activity modeling for compound prioritization. Remarkable progress has been made in the activity modeling area since the recent introduction of large-scale bioactivity-based compound similarity metrics. This is evidenced by increased hit rates in iterative screening strategies and novel insights into compound mode of action obtained through activity modeling. Here, we provide an overview of the developments in data-driven approaches, elaborate on novel activity modeling techniques and screening paradigms explored and outline their significance in HTS.
Rare genetic variants explain missing heritability in smoking
Common genetic variants explain less variation in complex phenotypes than inferred from family-based studies, and there is a debate on the source of this ‘missing heritability’. We investigated the contribution of rare genetic variants to tobacco use with whole-genome sequences from up to 26,257 unrelated individuals of European ancestries and 11,743 individuals of African ancestries. Across four smoking traits, single-nucleotide-polymorphism-based heritability ($h^2_{\mathrm{SNP}}$) was estimated from 0.13 to 0.28 (s.e., 0.10–0.13) in European ancestries, with 35–74% of it attributable to rare variants with minor allele frequencies between 0.01% and 1%. These heritability estimates are 1.5–4 times higher than past estimates based on common variants alone and accounted for 60% to 100% of our pedigree-based estimates of narrow-sense heritability ($h^2_{\mathrm{ped}}$, 0.18–0.34). In the African ancestry samples, $h^2_{\mathrm{SNP}}$ was estimated from 0.03 to 0.33 (s.e., 0.09–0.14) across the four smoking traits. These results suggest that rare variants are important contributors to the heritability of smoking.
Multi-ancestry transcriptome-wide association analyses yield insights into tobacco use biology and drug repurposing
Most transcriptome-wide association studies (TWASs) so far focus on European ancestry and lack diversity. To overcome this limitation, we aggregated genome-wide association study (GWAS) summary statistics, whole-genome sequences and expression quantitative trait locus (eQTL) data from diverse ancestries. We developed a new approach, TESLA (multi-ancestry integrative study using an optimal linear combination of association statistics), to integrate an eQTL dataset with a multi-ancestry GWAS. By exploiting shared phenotypic effects between ancestries and accommodating potential effect heterogeneities, TESLA improves power over other TWAS methods. When applied to tobacco use phenotypes, TESLA identified 273 new genes, up to 55% more compared with alternative TWAS methods. These hits and subsequent fine mapping using TESLA point to target genes with biological relevance. In silico drug-repurposing analyses highlight several drugs with known efficacy, including dextromethorphan and galantamine, and new drugs such as muscle relaxants that may be repurposed for treating nicotine addiction.
The artificial intelligence-based model ANORAK improves histopathological grading of lung adenocarcinoma
The introduction of the International Association for the Study of Lung Cancer grading system has furthered interest in histopathological grading for risk stratification in lung adenocarcinoma. Complex morphology and high intratumoral heterogeneity present challenges to pathologists, prompting the development of artificial intelligence (AI) methods. Here we developed ANORAK (pyrAmid pooliNg crOss stReam Attention networK), encoding multiresolution inputs with an attention mechanism, to delineate growth patterns from hematoxylin and eosin-stained slides. In 1,372 lung adenocarcinomas across four independent cohorts, AI-based grading was prognostic of disease-free survival, and further assisted pathologists by consistently improving prognostication in stage I tumors. Tumors with discrepant patterns between AI and pathologists had notably higher intratumoral heterogeneity. Furthermore, ANORAK facilitates the morphological and spatial assessment of the acinar pattern, capturing acinus variations with pattern transition. Collectively, our AI method enabled the precision quantification and morphology investigation of growth patterns, reflecting intratumoral histological transitions in lung adenocarcinoma.
Evolutionary characterization of lung adenocarcinoma morphology in TRACERx
Lung adenocarcinomas (LUADs) display a broad histological spectrum from low-grade lepidic tumors through to mid-grade acinar and papillary and high-grade solid, cribriform and micropapillary tumors. How morphology reflects tumor evolution and disease progression is poorly understood. Whole-exome sequencing data generated from 805 primary tumor regions and 121 paired metastatic samples across 248 LUADs from the TRACERx 421 cohort, together with RNA-sequencing data from 463 primary tumor regions, were integrated with detailed whole-tumor and regional histopathological analysis. Tumors with predominantly high-grade patterns showed increased chromosomal complexity, with higher burden of loss of heterozygosity and subclonal somatic copy number alterations. Individual regions in predominantly high-grade pattern tumors exhibited higher proliferation and lower clonal diversity, potentially reflecting large recent subclonal expansions. Co-occurrence of truncal loss of chromosomes 3p and 3q was enriched in predominantly low-/mid-grade tumors, while purely undifferentiated solid-pattern tumors had a higher frequency of truncal arm or focal 3q gains and SMARCA4 gene alterations compared with mixed-pattern tumors with a solid component, suggesting distinct evolutionary trajectories. Clonal evolution analysis revealed that tumors tend to evolve toward higher-grade patterns. The presence of micropapillary pattern and ‘tumor spread through air spaces’ were associated with intrathoracic recurrence, in contrast to the presence of solid/cribriform patterns, necrosis and preoperative circulating tumor DNA detection, which were associated with extra-thoracic recurrence. These data provide insights into the relationship between LUAD morphology, the underlying evolutionary genomic landscape, and clinical and anatomical relapse risk.
The evolution of lung cancer and impact of subclonal selection in TRACERx
Lung cancer is the leading cause of cancer-associated mortality worldwide. Here we analysed 1,644 tumour regions sampled at surgery or during follow-up from the first 421 patients with non-small cell lung cancer prospectively enrolled into the TRACERx study. This project aims to decipher lung cancer evolution and address the primary study endpoint: determining the relationship between intratumour heterogeneity and clinical outcome. In lung adenocarcinoma, mutations in 22 out of 40 common cancer genes were under significant subclonal selection, including classical tumour initiators such as TP53 and KRAS. We defined evolutionary dependencies between drivers, mutational processes and whole genome doubling (WGD) events. Despite patients having a history of smoking, 8% of lung adenocarcinomas lacked evidence of tobacco-induced mutagenesis. These tumours also had similar detection rates for EGFR mutations and for RET, ROS1, ALK and MET oncogenic isoforms compared with tumours in never-smokers, which suggests that they have a similar aetiology and pathogenesis. Large subclonal expansions were associated with positive subclonal selection. Patients with tumours harbouring recent subclonal expansions, on the terminus of a phylogenetic branch, had significantly shorter disease-free survival. Subclonal WGD was detected in 19% of tumours, and 10% of tumours harboured multiple subclonal WGDs in parallel. Subclonal, but not truncal, WGD was associated with shorter disease-free survival. Copy number heterogeneity was associated with extrathoracic relapse within 1 year after surgery. These data demonstrate the importance of clonal expansion, WGD and copy number instability in determining the timing and patterns of relapse in non-small cell lung cancer and provide a comprehensive clinical cancer evolutionary data resource.
The evolution of non-small cell lung cancer metastases in TRACERx
Metastatic disease is responsible for the majority of cancer-related deaths. We report the longitudinal evolutionary analysis of 126 non-small cell lung cancer (NSCLC) tumours from 421 prospectively recruited patients in TRACERx who developed metastatic disease, compared with a control cohort of 144 non-metastatic tumours. In 25% of cases, metastases diverged early, before the last clonal sweep in the primary tumour, and early divergence was enriched for patients who were smokers at the time of initial diagnosis. Simulations suggested that early metastatic divergence more frequently occurred at smaller tumour diameters (less than 8 mm). Single-region primary tumour sampling resulted in 83% of late divergence cases being misclassified as early, highlighting the importance of extensive primary tumour sampling. Polyclonal dissemination, which was associated with extrathoracic disease recurrence, was found in 32% of cases. Primary lymph node disease contributed to metastatic relapse in less than 20% of cases, representing a hallmark of metastatic potential rather than a route to subsequent recurrences/disease progression. Metastasis-seeding subclones exhibited subclonal expansions within primary tumours, probably reflecting positive selection. Our findings highlight the importance of selection in metastatic clone evolution within untreated primary tumours, the distinction between monoclonal versus polyclonal seeding in dictating site of recurrence, the limitations of current radiological screening approaches for early diverging tumours and the need to develop strategies to target metastasis-seeding subclones before relapse.
Genomic–transcriptomic evolution in lung cancer and metastasis
Intratumour heterogeneity (ITH) fuels lung cancer evolution, which leads to immune evasion and resistance to therapy. Here, using paired whole-exome and RNA sequencing data, we investigate intratumour transcriptomic diversity in 354 non-small cell lung cancer tumours from 347 out of the first 421 patients prospectively recruited into the TRACERx study. Analyses of 947 tumour regions, representing both primary and metastatic disease, alongside 96 tumour-adjacent normal tissue samples implicate the transcriptome as a major source of phenotypic variation. Gene expression levels and ITH relate to patterns of positive and negative selection during tumour evolution. We observe frequent copy number-independent allele-specific expression that is linked to epigenomic dysfunction. Allele-specific expression can also result in genomic–transcriptomic parallel evolution, which converges on cancer gene disruption. We extract signatures of RNA single-base substitutions and link their aetiology to the activity of the RNA-editing enzymes ADAR and APOBEC3A, thereby revealing otherwise undetected ongoing APOBEC activity in tumours. Characterizing the transcriptomes of primary–metastatic tumour pairs, we combine multiple machine-learning approaches that leverage genomic and transcriptomic variables to link metastasis-seeding potential to the evolutionary context of mutations and increased proliferation within primary tumour regions. These results highlight the interplay between the genome and transcriptome in influencing ITH, lung cancer evolution and metastasis.
Tracking early lung cancer metastatic dissemination in TRACERx using ctDNA
Circulating tumour DNA (ctDNA) can be used to detect and profile residual tumour cells persisting after curative intent therapy. The study of large patient cohorts incorporating longitudinal plasma sampling and extended follow-up is required to determine the role of ctDNA as a phylogenetic biomarker of relapse in early-stage non-small-cell lung cancer (NSCLC). Here we developed ctDNA methods tracking a median of 200 mutations identified in resected NSCLC tissue across 1,069 plasma samples collected from 197 patients enrolled in the TRACERx study. A lack of preoperative ctDNA detection distinguished biologically indolent lung adenocarcinoma with good clinical outcome. Postoperative plasma analyses were interpreted within the context of standard-of-care radiological surveillance and administration of cytotoxic adjuvant therapy. Landmark analyses of plasma samples collected within 120 days after surgery revealed ctDNA detection in 25% of patients, including 49% of all patients who experienced clinical relapse; 3 to 6 monthly ctDNA surveillance identified impending disease relapse in an additional 20% of landmark-negative patients. We developed a bioinformatic tool (ECLIPSE) for non-invasive tracking of subclonal architecture at low ctDNA levels. ECLIPSE identified patients with polyclonal metastatic dissemination, which was associated with a poor clinical outcome. By measuring subclone cancer cell fractions in preoperative plasma, we found that subclones seeding future metastases were significantly more expanded compared with non-metastatic subclones. Our findings will support (neo)adjuvant trial advances and provide insights into the process of metastatic dissemination using low-ctDNA-level liquid biopsy.
A local human Vδ1 T cell population is associated with survival in nonsmall-cell lung cancer
Murine tissues harbor signature γδ T cell compartments with profound yet differential impacts on carcinogenesis. Conversely, human tissue-resident γδ cells are less well defined. In the present study, we show that human lung tissues harbor a resident Vδ1 γδ T cell population. Moreover, we demonstrate that Vδ1 T cells with resident memory and effector memory phenotypes were enriched in lung tumors compared with nontumor lung tissues. Intratumoral Vδ1 T cells possessed stem-like features and were skewed toward cytolysis and helper T cell type 1 function, akin to intratumoral natural killer and CD8+ T cells considered beneficial to the patient. Indeed, ongoing remission post-surgery was significantly associated with the numbers of CD45RA−CD27− effector memory Vδ1 T cells in tumors and, most strikingly, with the numbers of CD103+ tissue-resident Vδ1 T cells in nonmalignant lung tissues. Our findings offer basic insights into human body surface immunology that collectively support integrating Vδ1 T cell biology into immunotherapeutic strategies for nonsmall cell lung cancer.