10 research outputs found

Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing in ovarian cancer

Author: Beerenwinkel Niko
Beisel Christian
Borgsmüller Nico
Coelho Ricardo
Dondi Arthur
Heinzelmann-Schwarz Viola
Jacob Francis
Lischetti Ulrike
Singer Franziska
Tumor Profiler Consortium
Publication venue: Nature Publishing Group
Publication date: 27/11/2023

Understanding the complex background of cancer requires genotype-phenotype information in single-cell resolution. Here, we perform long-read single-cell RNA sequencing (scRNA-seq) on clinical samples from three ovarian cancer patients presenting with omental metastasis and increase the PacBio sequencing depth to 12,000 reads per cell. Our approach captures 152,000 isoforms, of which over 52,000 were not previously reported. Isoform-level analysis accounting for non-coding isoforms reveals 20% overestimation of protein-coding gene expression on average. We also detect cell type-specific isoform and poly-adenylation site usage in tumor and mesothelial cells, and find that mesothelial cells transition into cancer-associated fibroblasts in the metastasis, partly through the TGF-β/miR-29/Collagen axis. Furthermore, we identify gene fusions, including an experimentally validated IGF2BP2::TESPA1 fusion, which is misclassified as high TESPA1 expression in matched short-read data, and call mutations confirmed by targeted NGS cancer gene panel results. With these findings, we envision long-read scRNA-seq to become increasingly relevant in oncology and personalized medicine.

ZORA

Investigating cancer heterogeneity through single-cell DNA sequencing

Author: Borgsmüller Nico
Publication venue: ETH Zurich
Publication date: 01/01/2023

Intratumor heterogeneity (ITH) describes the coexistence of cellular populations with distinct geno- and phenotypes within a tumor, posing a major obstacle to successful cancer treatment. The rapid progress in sequencing technologies over the last decades enabled studying ITH at the single-cell level, the highest possible resolution. Single-cell DNA sequencing (scDNA-seq) accesses the genomic information of individual tumor cells and their joint evolutionary history. This thesis presents three studies investigating genomic ITH through scDNA-seq, preceded by an introduction and concluded with a summary. Chapter 1 opens with the development of sequencing technologies, particularly DNA and single-cell sequencing, and provides an overview of cancer evolution and ITH. Chapter 2 presents demoTape, a computational demultiplexing method for targeted scDNA-seq data, leveraging the genomic distance between cells of jointly sequenced patients to separate them. On simulated data, demoTape outperforms competing methods in demultiplexing accuracy. Applied to a sample of three multiplexed lymphoma patients, it successfully demultiplexes the cells, leading to similar downstream analysis results as individually sequenced patients. DemoTape, therefore, allows the joint preparation and sequencing of multiple samples, saving costs and labor. Chapter 3 describes BnpC, a Bayesian non-parametric clustering method to identify cellular populations and their genotypes from scDNA-seq data. On simulated data, BnpC surpasses competing methods in accuracy and scalability. Applied to published scDNA-seq data, BnpC reproduces results that previously required additional experimental data or manual curation. The ability of BnpC to identify cellular populations and their genotypes holds great potential for personalized cancer therapies. 
Chapter 4 introduces the Poisson Tree test for detecting variable evolutionary rates among cell lineages, leveraging the phylogenetic information inherent to scDNA-seq data. When applied to 24 scDNA-seq datasets derived from different cancer types and healthy tissue, the Poisson Tree test rejects a constant rate in over 70% of cancer and in over 50% of healthy tissue datasets, suggesting that variations in the evolutionary rate are predominant in cancer but also frequently occur in healthy tissue. This thesis concludes with Chapter 5, discussing the presented studies in a greater context, reflecting on their limitations, and suggesting directions for future research.

Repository for Publications and Research Data

Single-cell phylogenies reveal changes in the evolutionary rate within cancer and healthy tissues

Author: Beerenwinkel Niko
Borgsmüller Nico
Kuipers Jack
Posada David
Valecha Monica
Publication venue: Cell Press
Publication date: 13/09/2023

Cell lineages accumulate somatic mutations during organismal development, potentially leading to pathological states. The rate of somatic evolution within a cell population can vary due to multiple factors, including selection, a change in the mutation rate, or differences in the microenvironment. Here, we developed a statistical test called the Poisson Tree (PT) test to detect varying evolutionary rates among cell lineages, leveraging the phylogenetic signal of single-cell DNA sequencing (scDNA-seq) data. We applied the PT test to 24 healthy and cancer samples, rejecting a constant evolutionary rate in 11 out of 15 cancer and five out of nine healthy scDNA-seq datasets. In six cancer datasets, we identified subclonal mutations in known driver genes that could explain the rate accelerations of particular cancer lineages. Our findings demonstrate the efficacy of scDNA-seq for studying somatic evolution and suggest that cell lineages often evolve at different rates within cancer and healthy tissues. ISSN: 2666-979

Repository for Publications and Research Data

Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing for personalized oncology

Author: Beerenwinkel Niko
Beisel Christian
Borgsmüller Nico
Dondi Arthur
Heinzelmann-Schwarz Viola
Jacob Francis
Lischetti Ulrike
Singer Franziska
Tumor Profiler Consortium
Publication venue: Cold Spring Harbor Laboratory
Publication date: 15/12/2022

Understanding the complex background of cancer requires genotype-phenotype information in single-cell resolution. Long-read single-cell RNA sequencing (scRNA-seq), capturing full-length transcripts, lacked the depth to provide this information so far. Here, we increased the PacBio sequencing depth to 12,000 reads per cell, leveraging multiple strategies, including artifact removal and transcript concatenation, and applied the technology to samples from three human ovarian cancer patients. Our approach captured 152,000 isoforms, of which over 52,000 were novel, detected cell type- and cell-specific isoform usage, and revealed differential isoform expression in tumor and mesothelial cells. Furthermore, we identified gene fusions, including a novel scDNA sequencing-validated IGF2BP2::TESPA1 fusion, which was misclassified as high TESPA1 expression in matched short-read data, and called somatic and germline mutations, confirming targeted NGS cancer gene panel results. With multiple new opportunities, especially for cancer biology, we envision long-read scRNA-seq to become increasingly relevant in oncology and personalized medicine.

Repository for Publications and Research Data

Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing in ovarian cancer

Author: Arthur Dondi
Christian Beisel
Francis Jacob
Franziska Singer
Nico Borgsmüller
Niko Beerenwinkel
Ricardo Coelho
Tumor Profiler Consortium
Ulrike Lischetti
Viola Heinzelmann-Schwarz
Publication venue: Nature Portfolio
Publication date: 01/11/2023

Understanding the complex background of cancer requires genotype-phenotype information in single-cell resolution. Here, we perform long-read single-cell RNA sequencing (scRNA-seq) on clinical samples from three ovarian cancer patients presenting with omental metastasis and increase the PacBio sequencing depth to 12,000 reads per cell. Our approach captures 152,000 isoforms, of which over 52,000 were not previously reported. Isoform-level analysis accounting for non-coding isoforms reveals 20% overestimation of protein-coding gene expression on average. We also detect cell type-specific isoform and poly-adenylation site usage in tumor and mesothelial cells, and find that mesothelial cells transition into cancer-associated fibroblasts in the metastasis, partly through the TGF-β/miR-29/Collagen axis. Furthermore, we identify gene fusions, including an experimentally validated IGF2BP2::TESPA1 fusion, which is misclassified as high TESPA1 expression in matched short-read data, and call mutations confirmed by targeted NGS cancer gene panel results. With these findings, we envision long-read scRNA-seq to become increasingly relevant in oncology and personalized medicine.

Directory of Open Access Journals

SIEVE: joint inference of single-nucleotide variants and cell phylogeny from single-cell DNA sequencing data

Author: Alves Joao M.
Beerenwinkel Niko
Borgsmüller Nico
Kang Senbai
Kuipers Jack
Posada David
Prado-López Sonia
Szczurek Ewa
Valecha Monica
Publication venue: Springer Science and Business Media LLC
Publication date: 30/11/2022

We present SIEVE, a statistical method for the joint inference of somatic variants and cell phylogeny under the finite-sites assumption from single-cell DNA sequencing. SIEVE leverages raw read counts for all nucleotides and corrects the acquisition bias of branch lengths. In our simulations, SIEVE outperforms other methods in phylogenetic reconstruction and variant calling accuracy, especially in the inference of homozygous variants. Applying SIEVE to three datasets, one for triple-negative breast cancer (TNBC) and two for colorectal cancer (CRC), we find that double mutant genotypes are rare in CRC but unexpectedly frequent in the TNBC samples.

Repository for Publications and Research Data

Machine learning-based classification to improve Gas Chromatography-Mass spectrometry data processing.

Author: Beule Dieter
Blanc Eric
Borgsmüller Nico
Durand Stéphanie
Giacomoni Franck
Gloaguen Yoann
Guitton Yann
Kirwan Jennifer
Le Bizec Bruno
Migné Carole
Opialla Tobias
Pujos-Guillot Estelle
Pétéra Mélanie
Royer Anne Lise
Sicard Emilie
Publication venue: HAL CCSD
Publication date: 22/01/2020

Methodological & Technological developments. Introduction: Lack of reliable peak detection impedes automated analysis of large-scale gas chromatography-mass spectrometry (GC-MS) metabolomics datasets. Performance and outcome of individual peak-picking algorithms can differ widely depending on both algorithmic approach and parameters, as well as data acquisition method. Therefore, comparing and contrasting algorithms is difficult. Technological and methodological innovation: We present part of the work published in [1] and implemented in our workflow for improved peak picking (WiPP), focusing on the use of machine learning-based classification to optimize and improve different steps of the common GC-MS metabolomics data processing workflow. Our approach evaluates the quality of detected peaks using a machine learning-based classification scheme based on seven peak classes. The quality information returned by the classifier for each individual peak is merged with results from different peak detection algorithms to create one final high-quality peak set for immediate downstream analysis. Results and impact: We benchmarked our workflow against standard compound mixes and a complex biological dataset, demonstrating that peak detection is improved. Furthermore, the approach can provide an impartial performance comparison of different peak-picking algorithms. We also discuss the applicability of the approach to liquid chromatography-mass spectrometry data. References: [1] Gloaguen, Y.; Borgsmüller, N. et al. WiPP: Workflow for Improved Peak Picking for Gas Chromatography-Mass Spectrometry (GC-MS) Data. Metabolites 2019, 9, 171.

HAL Clermont Université

ProdInra

V-pipe 3.0: a sustainable pipeline for within-sample viral genetic diversity estimation

Author: Batavia Aashil A.
Beerenwinkel Niko
Borgsmüller Nico
Carrara Matteo
Chen Chaoran
Dondi Arthur
Dragan Monica
Dreifuss David
du Plessis Louis
Fuhrmann Lara
Icer Baykal Pelin Burcak
Jablonski Kim Philipp
John Anika
Langer Benjamin
Okoniewski Michal
Schmitt Uwe
Singer Franziska
Stadler Tanja
Topolsky Ivan
Publication venue: Cold Spring Harbor Laboratory
Publication date: 16/10/2023

The large amount and diversity of viral genomic datasets generated by next-generation sequencing technologies poses a set of challenges for computational data analysis workflows, including rigorous quality control, adaptation to higher sample coverage, and tailored steps for specific applications. Here, we present V-pipe 3.0, a computational pipeline designed for analyzing next-generation sequencing data of short viral genomes. It is developed to enable reproducible, scalable, adaptable, and transparent inference of genetic diversity of viral samples. By presenting two large-scale data analysis projects, we demonstrate the effectiveness of V-pipe 3.0 in supporting sustainable viral genomic data science.

Repository for Publications and Research Data

Within-patient genetic diversity of SARS-CoV-2

Author: Batavia Aashil A.
Bayer Fritz
Beckmann Christiane
Beerenwinkel Niko
Beisel Christian
Borgsmüller Nico
Burcklen Elodie
Capece Vincenzo
Dondi Arthur
Drăgan Monica-Andreea
Ferreira Pedro
Jablonski Kim Philipp
Jahn Katharina
Kobel Olivier
Kuipers Jack
Lamberti Lisa
Nadeau Sarah Ann
Nissen Ina
Noppen Christoph
Pirkl Martin
Posada Cespedes Susana
Redondo Maurice
Santacroce Natascha
Santamaria de Souza Noemie
Schär Tobias
Seidel Sophie
Stadler Tanja
Topolsky Ivan
Publication venue: Cold Spring Harbor Laboratory
Publication date: 12/10/2020

SARS-CoV-2, the virus responsible for the current COVID-19 pandemic, is evolving into different genetic variants by accumulating mutations as it spreads globally. In addition to this diversity of consensus genomes across patients, RNA viruses can also display genetic diversity within individual hosts, and co-existing viral variants may affect disease progression and the success of medical interventions. To systematically examine the intra-patient genetic diversity of SARS-CoV-2, we processed a large cohort of 3939 publicly available deeply sequenced genomes with specialised bioinformatics software, along with 749 recently sequenced samples from Switzerland. We found that the distribution of diversity across patients and across genomic loci is very unbalanced, with a minority of hosts and positions accounting for much of the diversity. For example, the D614G variant in the Spike gene, which is present in the consensus sequences of 67.4% of patients, is also highly diverse within hosts, with 29.7% of the public cohort being affected by this coexistence and exhibiting different variants. We also investigated the impact of several technical and epidemiological parameters on genetic heterogeneity and found that age, which is known to be correlated with poor disease outcomes, is a significant predictor of viral genetic diversity.

Repository for Publications and Research Data

Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing in ovarian cancer

Repository for Publications and Research Data