SCGclust: Single-Cell Graph Clustering Using Graph Autoencoders That Integrate SNVs and CNAs

Abstract

Intra-tumor heterogeneity (ITH) is a complicating factor for cancer prognosis and treatment. Single-cell DNA sequencing (scDNA-seq) resolves genomic variation at the level of individual cells and has been widely used to study cancer progression and responses to drugs and treatments. Low-coverage scDNA-seq technologies typically yield a large number of cells, and clustering those cells accurately is essential for characterizing ITH effectively. Existing cell clustering methods are typically based on either single-nucleotide variations (SNVs) or copy number alterations (CNAs), without leveraging both signals together. Since both SNVs and CNAs are indicative of cell subclonality, in this paper we designed a robust cell-clustering tool that integrates both signals using a graph autoencoder. Our model co-trains the graph autoencoder and a graph convolutional network (GCN) to guarantee meaningful clustering results and to prevent all cells from collapsing into a single cluster. Given the low-dimensional embedding generated by the autoencoder, we adopted a Gaussian mixture model (GMM) to further cluster the cells. We evaluated our method on eight simulated datasets and a real cancer sample. Our results demonstrate that our method consistently achieved higher V-measure scores than both SBMClone, an SNV-based method, and a K-means method that relies solely on CNA signals. These findings highlight the advantage of integrating both SNV and CNA signals within a graph autoencoder framework for accurate cell clustering.
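
The V-measure used for the comparison above is the harmonic mean of homogeneity (each cluster contains cells from a single true subclone) and completeness (each true subclone falls into a single cluster). The following is a minimal sketch of that standard definition in plain Python, not the authors' evaluation code; the function names and the toy labels are assumptions made for illustration only.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """H(targets | given) for two equal-length label lists."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(true_labels, pred_labels):
    """Harmonic mean of homogeneity and completeness (equal weighting)."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    h_true = entropy(true_labels)
    h_pred = entropy(pred_labels)
    # Homogeneity: how much knowing the cluster pins down the true subclone.
    homogeneity = 1.0 if h_true == 0 else (
        1.0 - conditional_entropy(true_labels, pred_labels) / h_true
    )
    # Completeness: how much knowing the true subclone pins down the cluster.
    completeness = 1.0 if h_pred == 0 else (
        1.0 - conditional_entropy(pred_labels, true_labels) / h_pred
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


if __name__ == "__main__":
    # Hypothetical subclone labels for six cells (made up, not from the paper).
    truth = [0, 0, 0, 1, 1, 1]
    clusters = [1, 1, 1, 0, 0, 2]
    print(round(v_measure(truth, clusters), 3))
```

The score depends only on how cells are grouped, not on the cluster identifiers, so a relabeled but otherwise identical clustering still scores 1. Collapsing every cell into one cluster, the failure mode the co-training is meant to prevent, scores 0 whenever the truth contains more than one subclone.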

    University of Houston Institutional Repository (UHIR)