Intra-tumor heterogeneity (ITH) complicates cancer prognosis and treatment. Single-cell DNA sequencing (scDNA-seq) resolves genomic variation at the level of individual cells and has been widely used to study cancer progression and responses to drugs and other treatments. Low-coverage scDNA-seq technologies typically provide a large number of cells, and accurate cell clustering is essential for effectively characterizing ITH. Existing cell-clustering methods are typically based on either single-nucleotide variations (SNVs) or copy number alterations (CNAs), without leveraging both signals together. Since both SNVs and CNAs are indicative of cell subclonality, in this paper we designed a robust cell-clustering tool that integrates both signals using a graph autoencoder. Our model co-trains the graph autoencoder with a graph convolutional network (GCN) to ensure meaningful clustering results and to prevent all cells from collapsing into a single cluster. Given the low-dimensional embedding generated by the autoencoder, we adopted a Gaussian mixture model (GMM) to cluster the cells. We evaluated our method on eight simulated datasets and a real cancer sample. Our method consistently achieved higher V-measure scores than SBMClone, an SNV-based method, and a K-means method that relies solely on CNA signals. These findings highlight the advantage of integrating both SNV and CNA signals within a graph autoencoder framework for accurate cell clustering.
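The abstract reports clustering quality as a V-measure score. As a rough illustration of what that score captures, the following is a minimal sketch of the standard V-measure definition (the harmonic mean of homogeneity and completeness), written in plain Python. It is not the authors' evaluation code; the function names and the toy labels are assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(true_labels, pred_labels):
    """Harmonic mean of homogeneity and completeness (equal weighting)."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    h_true = entropy(true_labels)
    h_pred = entropy(pred_labels)
    # Homogeneity: each predicted cluster contains cells of a single true clone.
    homogeneity = (
        1.0 if h_true == 0
        else 1.0 - conditional_entropy(true_labels, pred_labels) / h_true
    )
    # Completeness: all cells of a true clone land in the same predicted cluster.
    completeness = (
        1.0 if h_pred == 0
        else 1.0 - conditional_entropy(pred_labels, true_labels) / h_pred
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Toy labels (hypothetical, not data from the paper).
true_clones = [0, 0, 1, 1]
perfect = [1, 1, 0, 0]    # same partition, cluster ids permuted
collapsed = [0, 0, 0, 0]  # every cell assigned to one cluster

print(v_measure(true_clones, perfect))
print(v_measure(true_clones, collapsed))
```

The score depends only on the partition, not on the cluster ids, so a permuted but otherwise perfect assignment scores as a perfect match. An assignment that places every cell in one cluster has zero homogeneity and therefore a V-measure of zero, which is the degenerate outcome the co-training described above is meant to avoid.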
University of Houston Institutional Repository (UHIR)